chhabria
Judges Don't Know What AI's Book Piracy Means
More than 40 lawsuits have been filed against AI companies since 2022. Late last month, there were rulings on two of these cases, first in a lawsuit against Anthropic and, two days later, in one against Meta. Both of the cases were brought by book authors who alleged that AI companies had trained large language models using authors' work without consent or compensation. In each case, the judges decided that the tech companies were engaged in "fair use" when they trained their models with authors' books. Both judges said that the use of these books was "transformative"--that training an LLM resulted in a fundamentally different product that does not directly compete with those books.
What comes next for AI copyright lawsuits?
On the other side, plaintiffs range from individual artists and authors to large companies like Getty and the New York Times. The outcomes of these cases are set to have an enormous impact on the future of AI. In effect, they will decide whether or not model makers can continue ordering up a free lunch. If not, they will need to start paying for such training data via new kinds of licensing deals--or find new ways to train their models. And that's why last week's wins for the technology companies matter. If you drill into the details, the rulings are less cut-and-dried than they seem at first.
- North America > United States > Tennessee (0.05)
- North America > United States > California (0.05)
- Law > Litigation (1.00)
- Law > Intellectual Property & Technology Law (1.00)
- Government > Regional Government > North America Government > United States Government (1.00)
- Information Technology > Artificial Intelligence > Natural Language > Large Language Model (1.00)
- Information Technology > Artificial Intelligence > Natural Language > Chatbot (0.87)
- Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning > Generative AI (0.53)
Meta wins AI copyright lawsuit as US judge rules against authors
However, the ruling offered some hope for American creative professionals who argue that training AI models on their work without permission is illegal. "It stands only for the proposition that these plaintiffs made the wrong arguments and failed to develop a record in support of the right one." A Meta spokesperson said the company appreciated the decision and called fair use a "vital legal framework" for building "transformative" AI technology. The authors sued Meta in 2023, arguing the company misused pirated versions of their books to train its AI system Llama without permission or compensation. Chhabria expressed sympathy for that argument during a hearing in May, which he reiterated on Wednesday.
- Law > Litigation (0.44)
- Law > Intellectual Property & Technology Law (0.41)
- Media > News (0.36)
A Judge Says Meta's AI Copyright Case Is About 'the Next Taylor Swift'
US District Court Judge Vince Chhabria spent several hours grilling lawyers from both sides after they each filed motions for partial summary judgment, meaning they want Chhabria to rule on specific issues of the case rather than leaving each one to be decided at trial. The authors allege that Meta illegally used their work to build its generative AI tools, emphasizing that the company pirated their books through "shadow libraries" like LibGen. Kadrey v. Meta is one of the dozens of lawsuits filed against AI companies that are winding through the US legal system. While the authors were heavily focused on the piracy element of the case, Chhabria spoke emphatically about his belief that the big question is whether Meta's AI tools will hurt book sales and otherwise cause the authors to lose money. "If you are dramatically changing, you might even say obliterating, the market for that person's work, and you're saying that you don't even have to pay a license to that person to use their work to create the product that's destroying the market for their work--I just don't understand how that can be fair use," he told Meta lawyer Kannon Shanmugam.
Zuckerberg approved Meta's use of 'pirated' books to train AI models, authors claim
Citing internal Meta communications, the filing claims that the social network company's chief executive backed the use of the LibGen dataset, a vast online archive of books, despite warnings within the company's AI executive team that it is a dataset "we know to be pirated". The internal message says that using a database containing pirated material could weaken the Facebook and Instagram owner's negotiations with regulators, according to the filing. "Media coverage suggesting we have used a dataset we know to be pirated, such as LibGen, may undermine our negotiating position with regulators." The authors sued Meta in 2023, arguing that the social media company misused their books to train Llama, the large language model that powers its chatbots. The Library Genesis, or LibGen, dataset is a "shadow library" that originated in Russia and claims to contain millions of novels, nonfiction books and science magazine articles.
- Europe > Russia (0.26)
- Asia > Russia (0.26)
- North America > United States > New York (0.06)
- North America > United States > California (0.06)
- Law (1.00)
- Information Technology > Services (1.00)
- Media (0.95)
Meta Secretly Trained Its AI on a Notorious Piracy Database, Newly Unredacted Court Docs Reveal
Against the company's wishes, a court unredacted information alleging that Meta used Library Genesis (LibGen), a notorious so-called shadow library of pirated books that originated in Russia, to help train its generative AI language models. The outcome of the case, along with those of dozens of similar cases working their way through courts in the United States, will determine whether technology companies can legally use creative works to train AI moving forward and could either entrench AI's most powerful players or derail them. Vince Chhabria, a judge for the United States District Court for the Northern District of California, ordered both Meta and the plaintiffs on Wednesday to file full versions of a batch of documents after calling Meta's approach to redacting them "preposterous," adding that, for the most part, "there is not a single thing in those briefs that should be sealed." Chhabria ruled that Meta was not pushing to redact the materials in order to protect its business interests but instead to "avoid negative publicity." The documents were originally filed late last year but remained publicly unavailable until now.
- North America > United States > California (0.26)
- Europe > Russia (0.26)
- Asia > Russia (0.26)
A Machine Learning Approach to Improving Timing Consistency between Global Route and Detailed Route
Chhabria, Vidya A., Jiang, Wenjing, Kahng, Andrew B., Sapatnekar, Sachin S.
Due to the unavailability of routing information in design stages prior to detailed routing (DR), the tasks of timing prediction and optimization pose major challenges. Inaccurate timing prediction wastes design effort, hurts circuit performance, and may lead to design failure. This work focuses on timing prediction after clock tree synthesis and placement legalization, which is the earliest opportunity to time and optimize a "complete" netlist. The paper first documents that having "oracle knowledge" of the final post-DR parasitics enables post-global routing (GR) optimization to produce improved final timing outcomes. To bridge the gap between GR-based parasitic and timing estimation and post-DR results during post-GR optimization, machine learning (ML)-based models are proposed, including the use of features for macro blockages for accurate predictions for designs with macros. Based on a set of experimental evaluations, it is demonstrated that these models show higher accuracy than GR-based timing estimation. When used during post-GR optimization, the ML-based models show demonstrable improvements in post-DR circuit performance. The methodology is applied to two different tool flows - OpenROAD and a commercial tool flow - and results on 45nm bulk and 12nm FinFET enablements show improvements in post-DR slack metrics without increasing congestion. The models are demonstrated to be generalizable to designs generated under different clock period constraints and are robust to training data with small levels of noise.
- North America > United States > Minnesota > Hennepin County > Minneapolis (0.28)
- North America > United States > New York > New York County > New York City (0.05)
- North America > United States > New Jersey > Middlesex County > Piscataway (0.04)