Digger: Detecting Copyright Content Mis-usage in Large Language Model Training