The Fight Against AI Comes to a Foundational Data Set

WIRED 

Danish media outlets have demanded that the nonprofit web archive Common Crawl remove copies of their articles from past data sets and stop crawling their websites immediately. Common Crawl plans to comply with the request, first issued on Monday. Executive director Rich Skrenta says the organization is "not equipped" to fight media companies and publishers in court. The request was made on behalf of four media outlets, including Berlingske Media and the daily newspaper Jyllands-Posten. The New York Times made a similar request of Common Crawl last year, prior to filing a lawsuit against OpenAI for using its work without permission.