AI companies are finally being forced to cough up for training data
AI companies have pillaged the internet for training data, and many website and data set owners have started restricting the ability to scrape their content. We've also seen a backlash against the AI sector's practice of indiscriminately scraping online data: users are opting out of making their data available for training, and artists, writers, and the New York Times have filed lawsuits claiming that AI companies have taken their intellectual property without consent or compensation. My colleague James O'Donnell dissects the lawsuits in his story and points out that they could determine the future of AI music. But this moment also sets an interesting precedent for all of generative AI development. Thanks to the scarcity of high-quality data and the immense pressure and demand to build even bigger and better models, we're in a rare moment where data owners actually have some leverage.
Jul-2-2024, 08:55:47 GMT