 Law


Nine Ways to Break Copyright Law and Why Our LLM Won't: A Fair Use Aligned Generation Framework

arXiv.org Artificial Intelligence

Large language models (LLMs) commonly risk copyright infringement by reproducing protected content verbatim or with insufficient transformative modifications, posing significant ethical, legal, and practical concerns. Current inference-time safeguards predominantly rely on restrictive refusal-based filters, often compromising the practical utility of these models. To address this, we collaborated closely with intellectual property experts to develop FUA-LLM (Fair Use Aligned Language Models), a legally-grounded framework explicitly designed to align LLM outputs with fair-use doctrine. Central to our method is FairUseDB, a carefully constructed dataset containing 18,000 expert-validated examples covering nine realistic infringement scenarios. Leveraging this dataset, we apply Direct Preference Optimization (DPO) to fine-tune open-source LLMs, encouraging them to produce legally compliant and practically useful alternatives rather than resorting to blunt refusal. Recognizing the shortcomings of traditional evaluation metrics, we propose new measures: Weighted Penalty Utility and Compliance Aware Harmonic Mean (CAH) to balance infringement risk against response utility. Extensive quantitative experiments coupled with expert evaluations confirm that FUA-LLM substantially reduces problematic outputs (up to 20%) compared to state-of-the-art approaches, while preserving real-world usability.
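For readers unfamiliar with Direct Preference Optimization, the following is a minimal sketch of the standard DPO loss for a single preference pair. It is not the FUA-LLM training code: the function name `dpo_loss`, its argument names, and the default `beta=0.1` are illustrative assumptions, and the inputs are assumed to be summed sequence log-probabilities of a preferred ("chosen") and a dispreferred ("rejected") response under the policy and a frozen reference model.

```python
import math


def dpo_loss(policy_chosen_logp, policy_rejected_logp,
             ref_chosen_logp, ref_rejected_logp, beta=0.1):
    """Standard DPO loss for one (chosen, rejected) pair.

    All names and the default beta are hypothetical, chosen for illustration.
    """
    # How much more the policy favours each response than the reference does.
    chosen_margin = policy_chosen_logp - ref_chosen_logp
    rejected_margin = policy_rejected_logp - ref_rejected_logp
    x = beta * (chosen_margin - rejected_margin)
    # Loss is -log(sigmoid(x)) = log(1 + exp(-x)), computed in a stable form.
    if x >= 0:
        return math.log1p(math.exp(-x))
    return -x + math.log1p(math.exp(x))
```

The loss equals log 2 when the policy and reference agree, and falls below that as the policy shifts probability toward the chosen response relative to the reference. In the setting the abstract describes, the chosen response would presumably be the compliant, useful alternative and the rejected one the infringing output, though the abstract does not spell out the pairing.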


Reality Check: A New Evaluation Ecosystem Is Necessary to Understand AI's Real World Effects

arXiv.org Artificial Intelligence

Conventional AI evaluation approaches concentrated within the AI stack exhibit systemic limitations for exploring, navigating and resolving the human and societal factors that play out in real world deployment such as in education, finance, healthcare, and employment sectors. AI capability evaluations can capture detail about first-order effects, such as whether immediate system outputs are accurate, or contain toxic, biased or stereotypical content, but AI's second-order effects, i.e. any long-term outcomes and consequences that may result from AI use in the real world, have become a significant area of interest as the technology becomes embedded in our daily lives. These secondary effects can include shifts in user behavior, societal, cultural and economic ramifications, workforce transformations, and long-term downstream impacts that may result from a broad and growing set of risks. This position paper argues that measuring the indirect and secondary effects of AI will require expansion beyond static, single-turn approaches conducted in silico to include testing paradigms that can capture what actually materializes when people use AI technology in context. Specifically, we describe the need for data and methods that can facilitate contextual awareness and enable downstream interpretation and decision making about AI's secondary effects, and recommend requirements for a new ecosystem.


US lawyer sanctioned after caught using ChatGPT for court brief

The Guardian

The Utah court of appeals has sanctioned a lawyer after he was discovered to have used ChatGPT for a filing he made in which he referenced a nonexistent court case. Earlier this week, the Utah court of appeals made the decision to sanction Richard Bednar over claims that he filed a brief which included false citations. According to court documents reviewed by ABC4, Bednar and Douglas Durbano, another Utah-based lawyer who was serving as the petitioner's counsel, filed a "timely petition for interlocutory appeal". Upon reviewing the brief which was written by a law clerk, the respondent's counsel found several false citations of cases. "It appears that at least some portions of the Petition may be AI-generated, including citations and even quotations to at least one case that does not appear to exist in any legal database (and could only be found in ChatGPT and references to cases that are wholly unrelated to the referenced subject matter," the respondent's counsel said in documents reviewed by ABC4.


Review for NeurIPS paper: Predictive coding in balanced neural networks with noise, chaos and delays

Neural Information Processing Systems

Additional Feedback: Minor comments:
l. 87: "were" - "where"
l. 128: the relation to E-I balanced networks could be made more explicit. In some versions of those networks, there are also two independent effective parameters that separately scale the negative feedback and the variance of the connectivity (see e.g. Mastrogiuseppe and Ostojic 2017).
l. 223: "the full solution for the chaotic system is highly involved" - the solution for adiabatic inputs seems to be available from Ref. 23, but perhaps the situation here is different? My understanding is that we are here in the adiabatic limit, not in the case of Ref. 38? In the adiabatic case, why does the (finite) correlation timescale of the noise matter for coding?


US gov't and Google face off in search monopoly case

Al Jazeera

Google has been back in federal court to fend off the United States Department of Justice's attempt to topple its internet empire at the same time it is navigating a pivotal shift to artificial intelligence (AI) that could undercut its power. On Friday, the legal and technological threats facing Google were among the key issues being dissected during the closing arguments of a legal proceeding that will determine the changes imposed upon the company in the wake of its dominant search engine being declared an illegal monopoly by US District Judge Amit Mehta last year. Brandishing evidence presented during a recent three-week stretch of hearings, Justice Department lawyers are attempting to persuade Mehta to order a radical shake-up that includes a ban on Google paying to lock its search engine in as the default on smart devices and an order requiring the company to sell its Chrome browser. Google lawyers say only minor concessions are needed, especially as the upheaval triggered by advances in artificial intelligence is already reshaping the search landscape, as alternative, conversational search options are rolling out from AI startups that are hoping to use the Department of Justice's four-and-a-half-year-old case to gain the upper hand in the next technological frontier. Mehta used Friday's hearing to ask probing and pointed questions of lawyers for both sides while hinting that he was seeking a middle ground between the two camps' proposed remedies.


How the Loudest Voices in AI Went From 'Regulate Us' to 'Unleash Us'

WIRED

On May 16, 2023, Sam Altman appeared before a subcommittee of the Senate Judiciary. The title of the hearing was "Oversight of AI." The session was a lovefest, with both Altman and the senators celebrating what Altman called AI's "printing press moment"--and acknowledging that the US needed strong laws to avoid its pitfalls. "We think that regulatory intervention by governments will be critical to mitigate the risks of increasingly powerful models," he said. The legislators hung on Altman's every word as he gushed about how smart laws could allow AI to flourish--but only within firm guidelines that both lawmakers and AI builders deemed vital at that moment.


OpenAI Can Stop Pretending

The Atlantic - Technology

OpenAI is a strange company for strange times. Valued at $300 billion--roughly the same as seven Fords or one and a half PepsiCos--the AI start-up has an era-defining product in ChatGPT and is racing to be the first to build superintelligent machines. The company is also, to the apparent frustration of its CEO Sam Altman, beholden to its nonprofit status. When OpenAI was founded in 2015, it was meant to be a research lab that would work toward the goal of AI that is "safe" and "benefits all of humanity." There wasn't supposed to be any pressure--or desire, really--to make money.


The Strong, Weak and Benign Goodhart's law. An independence-free and paradigm-agnostic formalisation

arXiv.org Machine Learning

Goodhart's law is a famous adage in policy-making that states that "When a measure becomes a target, it ceases to be a good measure". As machine learning models and the optimisation capacity to train them grow, growing empirical evidence has reinforced belief in the validity of this law, although the law itself has not been formalised. Recently, a few attempts have been made to formalise Goodhart's law, either by categorising variants of it, or by looking at how optimising a proxy metric affects the optimisation of an intended goal. In this work, we relax the simplifying independence assumption made in previous works, and the assumption on the learning paradigm made in most of them, to study the effect of the coupling between the proxy metric and the intended goal on Goodhart's law. Our results show that in the case of a light-tailed goal and light-tailed discrepancy, dependence does not change the nature of Goodhart's effect. However, in the light-tailed goal and heavy-tailed discrepancy case, we exhibit an example where over-optimisation occurs at a rate inversely proportional to the heavy-tailedness of the discrepancy between the goal and the metric.
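The proxy-versus-goal setup can be explored with a small simulation. The sketch below is not the paper's formalisation: it assumes an additive decomposition (proxy = goal + discrepancy), draws goal and discrepancy independently (the simplifying assumption the paper relaxes), and uses a symmetric Pareto draw with an arbitrary shape of 1.5 as the heavy-tailed case. All function names are hypothetical.

```python
import random


def goal_of_best_proxy(goals, discrepancies):
    """Return the true goal value of the candidate with the highest proxy score."""
    # Assumed decomposition: proxy metric = goal + discrepancy.
    proxies = [g + d for g, d in zip(goals, discrepancies)]
    best = max(range(len(proxies)), key=proxies.__getitem__)
    return goals[best]


def mean_selected_goal(n_candidates, draw_discrepancy, trials=500, seed=0):
    """Average goal value obtained by picking the best-by-proxy of n candidates."""
    rng = random.Random(seed)
    total = 0.0
    for _ in range(trials):
        # Light-tailed (Gaussian) goal, drawn independently of the discrepancy.
        goals = [rng.gauss(0.0, 1.0) for _ in range(n_candidates)]
        noise = [draw_discrepancy(rng) for _ in range(n_candidates)]
        total += goal_of_best_proxy(goals, noise)
    return total / trials


def light(rng):
    """Light-tailed discrepancy: standard Gaussian."""
    return rng.gauss(0.0, 1.0)


def heavy(rng):
    """Heavy-tailed discrepancy: symmetric Pareto, shape 1.5 (illustrative)."""
    return rng.choice((-1.0, 1.0)) * rng.paretovariate(1.5)


if __name__ == "__main__":
    # More candidates means stronger optimisation pressure on the proxy.
    for n in (10, 100, 1000):
        print(n, mean_selected_goal(n, light), mean_selected_goal(n, heavy))
```

Increasing `n_candidates` stands in for stronger optimisation of the proxy. Comparing the two printed columns shows how the goal value actually obtained responds to that pressure under each discrepancy distribution.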


The Aloe Family Recipe for Open and Specialized Healthcare LLMs

arXiv.org Artificial Intelligence

Purpose: With advancements in Large Language Models (LLMs) for healthcare, the need arises for competitive open-source models to protect the public interest. This work contributes to the field of open medical LLMs by optimizing key stages of data preprocessing and training, while showing how to improve model safety (through DPO) and efficacy (through RAG). The evaluation methodology used, which includes four different types of tests, defines a new standard for the field. The resultant models, shown to be competitive with the best private alternatives, are released with a permissive license. Methods: Building on top of strong base models like Llama 3.1 and Qwen 2.5, Aloe Beta uses a custom dataset to enhance public data with synthetic Chain of Thought examples. The models undergo alignment with Direct Preference Optimization, emphasizing ethical and policy-aligned performance in the presence of jailbreaking attacks. Evaluation includes close-ended, open-ended, safety and human assessments, to maximize the reliability of results. Results: Recommendations are made across the entire pipeline, backed by the solid performance of the Aloe Family. These models deliver competitive performance across healthcare benchmarks and medical fields, and are often preferred by healthcare professionals. On bias and toxicity, the Aloe Beta models significantly improve safety, showing resilience to unseen jailbreaking attacks. For a responsible release, a detailed risk assessment specific to healthcare is attached to the Aloe Family models. Conclusion: The Aloe Beta models, and the recipe that leads to them, are a significant contribution to the open-source medical LLM field, offering top-of-the-line performance while maintaining high ethical requirements. This work sets a new standard for developing and reporting aligned LLMs in healthcare.


From Connectivity to Autonomy: The Dawn of Self-Evolving Communication Systems

arXiv.org Artificial Intelligence

This paper envisions 6G as a self-evolving telecom ecosystem, where AI-driven intelligence enables dynamic adaptation beyond static connectivity. We explore the key enablers of autonomous communication systems, spanning reconfigurable infrastructure, adaptive middleware, and intelligent network functions, alongside multi-agent collaboration for distributed decision-making. We explore how these methodologies align with emerging industrial IoT frameworks, ensuring seamless integration within digital manufacturing processes. Our findings emphasize the potential for improved real-time decision-making, optimizing efficiency, and reducing latency in networked control systems. The discussion addresses ethical challenges, research directions, and standardization efforts, concluding with a technology stack roadmap to guide future developments. By leveraging state-of-the-art 6G network management techniques, this research contributes to the next generation of intelligent automation solutions, bridging the gap between theoretical advancements and real-world industrial applications.