train generative ai
OpenAI admits it's impossible to train generative AI without copyrighted materials
And based on what OpenAI told the House of Lords Communications and Digital Select Committee, we might see more lawsuits against AI companies in the future. It added that "[l]imiting training data to public domain books and drawings created more than a century ago might yield an interesting experiment, but would not provide AI systems that meet the needs of today's citizens." In a new blog post responding to The New York Times' lawsuit, OpenAI said that using publicly available internet materials to train AI falls under the fair use doctrine. It admitted, however, that there is "still work to be done to support and empower creators." The company described how it allows publishers to block the GPTBot web crawler from accessing their websites. It also said that it's developing additional mechanisms that let rightsholders opt out of training and that it's engaging with them to find mutually beneficial agreements.
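As a rough illustration of what blocking the GPTBot crawler can look like in practice, here is a minimal sketch. It assumes the crawler identifies itself with the user-agent token `GPTBot` and honors the standard robots.txt convention; the robots.txt contents, the `is_allowed` helper, and the example.com URL are hypothetical and not taken from the article.

```python
from urllib.robotparser import RobotFileParser

# Hypothetical robots.txt a publisher might serve (an assumption for this sketch):
# it disallows the GPTBot user agent everywhere and leaves other crawlers alone.
ROBOTS_TXT = """\
User-agent: GPTBot
Disallow: /

User-agent: *
Allow: /
"""


def is_allowed(robots_txt: str, user_agent: str, url: str) -> bool:
    """Return True if the given robots.txt rules let user_agent fetch url."""
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    return parser.can_fetch(user_agent, url)


# Check the same (made-up) article URL for two different crawlers.
gptbot_ok = is_allowed(ROBOTS_TXT, "GPTBot", "https://example.com/articles/1")
other_ok = is_allowed(ROBOTS_TXT, "SomeOtherBot", "https://example.com/articles/1")
print(gptbot_ok, other_ok)
```

A robots.txt rule is a request that well-behaved crawlers follow, not an enforcement mechanism, so it only covers crawling from the point the rule is published.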
What I Found in a Database Meta Uses to Train Generative AI
Editor's note: This article is part of The Atlantic's series on Books3. This summer, I reported on a data set of more than 191,000 books that were used without permission to train generative-AI systems by Meta, Bloomberg, and others. "Books3," as it's called, was based on a collection of pirated ebooks that includes travel guides, self-published erotic fiction, novels by Stephen King and Margaret Atwood, and a lot more. Books play a crucial role in the training of generative-AI systems.