Revealed: The Authors Whose Pirated Books Are Powering Generative AI

Aug-19-2023, 21:04:56 GMT–The Atlantic - Technology

One of the most troubling issues around generative AI is simple: It's being made in secret. To produce humanlike answers to questions, systems such as ChatGPT process huge quantities of written material. But few people outside of companies such as Meta and OpenAI know the full extent of the texts these programs have been trained on. Some training text comes from Wikipedia and other online writing, but high-quality generative AI requires higher-quality input than is usually found on the internet; that is, it requires the kind found in books. But neither the lawsuit itself nor the commentary surrounding it has offered a look under the hood: We have not previously known for certain whether LLaMA was trained on Silverman's, Kadrey's, or Golden's books, or any others, for that matter.

Country:
- North America > United States
  - California (0.04)
  - New York > New York County
    - New York City (0.04)
- Europe > United Kingdom
  - England > Oxfordshire > Oxford (0.04)

Industry:
- Law > Intellectual Property & Technology Law (0.95)

Technology:
- Information Technology > Artificial Intelligence
  - Natural Language
    - Large Language Model (1.00)
    - Chatbot (1.00)
  - Machine Learning > Neural Networks
    - Deep Learning > Generative AI (1.00)
