Law
Merging public elementary schools to reduce racial/ethnic segregation
Landry, Madison, Gillani, Nabeel
Diverse schools can help address implicit biases and increase empathy, mutual respect, and reflective thought by fostering connections between students from different racial/ethnic, socioeconomic, and other backgrounds. Unfortunately, demographic segregation remains rampant in US public schools, despite over 70 years since the passing of federal legislation formally outlawing segregation by race. However, changing how students are assigned to schools can help foster more integrated learning environments. In this paper, we explore "school mergers" as one such under-explored, yet promising, student assignment policy change. School mergers involve merging the school attendance boundaries, or catchment areas, of schools and subsequently changing the grades each school offers. We develop an algorithm to simulate elementary school mergers across 200 large school districts serving 4.5 million elementary school students and find that pairing or tripling schools in this way could reduce racial/ethnic segregation by a median relative 20% -- and as much as nearly 60% in some districts -- while increasing driving times to schools by an average of a few minutes each way. Districts with many interfaces between racially/ethnically-disparate neighborhoods tend to be prime candidates for mergers. We also compare the expected results of school mergers to other typical integration policies, like redistricting, and find that different policies may be more or less suitable in different places. Finally, we make our results available through a public dashboard for policymakers and community members to explore further (https://mergers.schooldiversity.org). Together, our study offers new findings and tools to support integration policy-making across US public school districts.
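As a rough illustration of why pairing schools can lower measured segregation, the sketch below computes a two-group dissimilarity index before and after a merger. The abstract does not say which segregation measure or merger model the authors use, so the index, the even grade split, and the three-school district are all assumptions made for this example.

```python
from typing import List, Tuple

School = Tuple[float, float]  # (students in group A, students in group B)


def dissimilarity(schools: List[School]) -> float:
    """Dissimilarity index: 0 = perfectly even, 1 = fully segregated."""
    total_a = sum(a for a, _ in schools)
    total_b = sum(b for _, b in schools)
    return 0.5 * sum(abs(a / total_a - b / total_b) for a, b in schools)


def merge_pair(schools: List[School], i: int, j: int) -> List[School]:
    """Pair schools i and j: pool both catchment areas, then split grades.

    Assumption: each merged school serves half the grades of the pooled
    area, so both end up with the pooled demographic mix.
    """
    pooled_a = schools[i][0] + schools[j][0]
    pooled_b = schools[i][1] + schools[j][1]
    merged = [(pooled_a / 2, pooled_b / 2)] * 2
    rest = [s for k, s in enumerate(schools) if k not in (i, j)]
    return rest + merged


# Hypothetical district with three elementary schools (not data from the study).
district = [(90, 10), (10, 90), (50, 50)]
before = dissimilarity(district)
after = dissimilarity(merge_pair(district, 0, 1))
print(f"before={before:.3f} after={after:.3f}")
```

In this toy district the two lopsided schools sit next to each other, which is the kind of interface between demographically different neighborhoods that the abstract identifies as a good merger candidate.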
Organize the Web: Constructing Domains Enhances Pre-Training Data Curation
Wettig, Alexander, Lo, Kyle, Min, Sewon, Hajishirzi, Hannaneh, Chen, Danqi, Soldaini, Luca
Modern language models are trained on large, unstructured datasets consisting of trillions of tokens and obtained by crawling the web. The unstructured nature makes it difficult to reason about their contents and develop systematic approaches to data curation. In this paper, we unpack monolithic web corpora by developing taxonomies of their contents and organizing them into domains. We introduce WebOrganizer, a framework for organizing web pages in terms of both their topic and format. Using these two complementary notions of domains, we automatically annotate pre-training data by distilling annotations from a large language model into efficient classifiers. This allows us to study how data from different domains should be mixed to improve models on downstream tasks, and we show that we can combine insights about effective topics and formats to further boost performance. We demonstrate that our domain mixing also improves existing methods that select data based on quality. Furthermore, we study and compare how quality-based methods will implicitly change the domain mixture. Overall, our work demonstrates that constructing and mixing domains provides a valuable complement to quality-based data curation methods, opening new avenues for effective and insightful pre-training data curation.
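Once every page carries a domain label, mixing domains amounts to resampling documents so that each domain receives a chosen share of the training data. The sketch below shows one simple way to turn a target mixture into per-document sampling probabilities; the function name, the domain labels, and the target shares are hypothetical and are not taken from WebOrganizer.

```python
from collections import Counter
from typing import Dict, List


def sampling_weights(doc_domains: List[str], target_mix: Dict[str, float]) -> List[float]:
    """Per-document sampling probabilities that realise a target domain mixture."""
    counts = Counter(doc_domains)
    # Keep only domains that actually occur, then renormalise the target shares.
    mass = sum(target_mix.get(d, 0.0) for d in counts)
    return [target_mix.get(d, 0.0) / mass / counts[d] for d in doc_domains]


# Hypothetical corpus of five pages, labelled by topic.
docs = ["science", "science", "science", "entertainment", "politics"]
mix = {"science": 0.6, "entertainment": 0.1, "politics": 0.3}
weights = sampling_weights(docs, mix)
print(weights)
```

Each document's weight is its domain's target share divided by the number of documents in that domain, so over-represented domains are down-weighted per page.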
OWLS: Scaling Laws for Multilingual Speech Recognition and Translation Models
Chen, William, Tian, Jinchuan, Peng, Yifan, Yan, Brian, Yang, Chao-Han Huck, Watanabe, Shinji
Neural scaling laws offer valuable insights for designing robust sequence processing architectures. While these laws have been extensively characterized in other modalities, their behavior in speech remains comparatively underexplored. In this work, we introduce OWLS, an open-access, reproducible suite of multilingual speech recognition and translation models spanning 0.25B to 18B parameters, with the 18B version being the largest speech model, to the best of our knowledge. OWLS leverages up to 360K hours of public speech data across 150 languages, enabling a systematic investigation into how data, model, and compute scaling each influence performance in multilingual speech tasks. We use OWLS to derive neural scaling laws, showing how final performance can be reliably predicted when scaling. One of our key findings is that scaling enhances performance on low-resource languages/dialects, helping to mitigate bias and improve the accessibility of speech technologies. Finally, we show how OWLS can be used to power new research directions by discovering emergent abilities in large-scale speech models. Model checkpoints will be released on https://huggingface.co/collections/espnet/owls-scaling-laws-for-speech-recognition-and-translation-67ab7f991c194065f057ce8d for future studies.
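To make the idea of predicting performance from scale concrete, the sketch below fits a power law of the form error = a * size^(-b) in log-log space and extrapolates it to a larger model. The abstract does not give the functional form the authors fit, so the power law is an assumption, and the data points are synthetic values generated from a known law rather than OWLS results.

```python
import math
from typing import List, Tuple


def fit_power_law(sizes: List[float], errors: List[float]) -> Tuple[float, float]:
    """Fit error = a * size**(-b) by least squares in log-log space."""
    xs = [math.log(n) for n in sizes]
    ys = [math.log(e) for e in errors]
    mean_x = sum(xs) / len(xs)
    mean_y = sum(ys) / len(ys)
    slope = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys)) / sum(
        (x - mean_x) ** 2 for x in xs
    )
    intercept = mean_y - slope * mean_x
    return math.exp(intercept), -slope


def predict(a: float, b: float, size: float) -> float:
    """Evaluate the fitted law at a new model size."""
    return a * size ** (-b)


# Synthetic points generated from a known law (a=50, b=0.3); not OWLS results.
sizes = [0.25e9, 0.5e9, 1e9, 2e9, 4e9]
errors = [50.0 * n ** (-0.3) for n in sizes]
a, b = fit_power_law(sizes, errors)
extrapolated = predict(a, b, 18e9)
print(f"a={a:.2f} b={b:.3f} predicted error at 18B: {extrapolated:.4f}")
```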
Hallucinations and Truth: A Comprehensive Accuracy Evaluation of RAG, LoRA and DoRA
Baqar, Mohammad, Khanda, Rajat
Recent advancements in Generative AI have significantly improved the efficiency and adaptability of natural language processing (NLP) systems, particularly through Retrieval-Augmented Generation (RAG), Low-Rank Adaptation (LoRA), and Weight-Decomposed Low-Rank Adaptation (DoRA). RAG integrates external knowledge to enhance factual consistency in generative outputs, while LoRA enables parameter-efficient fine-tuning of large language models (LLMs). DoRA further refines this process by optimizing fine-tuning through adaptive parameter ranking and domain-aware weight adjustments, improving learning efficiency while maintaining inference performance. This paper presents a large-scale empirical evaluation of RAG, LoRA, and DoRA, with model fine-tuning and generation performance assessed on 20,000 FAQ-based queries, while the knowledge base spans 400,000 entries. The study analyzes key performance metrics such as accuracy, relevance, and inference latency. Experimental results demonstrate that DoRA achieves the highest accuracy (90.1%), relevance score (0.88), and lowest latency (110 ms per query), outperforming both LoRA and RAG in real-world, domain-specific generative AI applications. Furthermore, this study examines the trade-offs between fine-tuning efficiency, computational cost, and real-time adaptability across different models. Findings highlight RAG's effectiveness in knowledge grounding, LoRA's cost-efficient domain adaptation, and DoRA's ability to balance fine-tuning efficiency with model precision. These insights provide practical guidance for deploying AI-driven generative systems in accuracy-critical domains such as healthcare, finance, and legal services, ensuring scalability, reliability, and optimal performance in dynamic environments.
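For readers unfamiliar with the two fine-tuning methods, the sketch below shows how each forms its adapted weight matrix. It follows the commonly published formulations: LoRA adds a low-rank product to a frozen weight, and DoRA additionally rescales each column of the updated weight to a learned magnitude. This may not match the implementation evaluated in the paper, and the matrices are toy values chosen for this example.

```python
import math
from typing import List

Matrix = List[List[float]]


def matmul(x: Matrix, y: Matrix) -> Matrix:
    return [[sum(p * q for p, q in zip(row, col)) for col in zip(*y)] for row in x]


def add(x: Matrix, y: Matrix) -> Matrix:
    return [[p + q for p, q in zip(rx, ry)] for rx, ry in zip(x, y)]


def column_norms(w: Matrix) -> List[float]:
    return [math.sqrt(sum(v * v for v in col)) for col in zip(*w)]


def lora_weight(w0: Matrix, b: Matrix, a: Matrix) -> Matrix:
    """LoRA: frozen w0 plus a trainable low-rank update b @ a."""
    return add(w0, matmul(b, a))


def dora_weight(w0: Matrix, b: Matrix, a: Matrix, magnitude: List[float]) -> Matrix:
    """DoRA: rescale each column of the LoRA-updated weight to a learned magnitude."""
    v = lora_weight(w0, b, a)
    norms = column_norms(v)  # assumes no column of v is all zeros
    return [[magnitude[j] * row[j] / norms[j] for j in range(len(row))] for row in v]


w0 = [[3.0, 0.0], [4.0, 2.0]]  # frozen pretrained weight (2 x 2), toy values
b = [[0.0], [0.0]]             # rank-1 factor, zero-initialised
a = [[0.5, -0.5]]              # rank-1 factor
m = column_norms(w0)           # magnitudes start at the column norms of w0

print(lora_weight(w0, b, a))
print(dora_weight(w0, b, a, m))
```

With the low-rank factor `b` initialised to zero, both methods start from the pretrained weight unchanged; training then updates only `a`, `b`, and (for DoRA) the magnitude vector.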
Elon Musk says he'll drop his $97bn bid for OpenAI if it remains a non-profit
Elon Musk says he will abandon his $97.4bn offer to buy the non-profit behind OpenAI if the ChatGPT maker drops its plan to convert into a for-profit company. "If OpenAI, Inc's Board is prepared to preserve the charity's mission and stipulate to take the 'for sale' sign off its assets by halting its conversion, Musk will withdraw the bid," lawyers for the billionaire said in a filing to a California court on Wednesday. "Otherwise, the charity must be compensated by what an arms-length buyer will pay for its assets." Musk and a group of investors made their offer earlier this week, in the latest twist to a dispute with the artificial intelligence company that he helped found a decade ago. OpenAI is controlled by a non-profit board bound to its original mission of safely building "better-than-human" AI for public benefit.
Major publishers sue AI startup Cohere over copyright infringement
This is another salvo in the ongoing war between the people that make stuff and the AI algorithms that mimic the stuff that people make. Members of the News Media Alliance are suing the AI company Cohere, accusing it of stealing their journalism without permission to train its generative AI model. Additionally, the startup has been accused of passing off large segments of entire articles to its users without proper attribution. "Rather than create their own content, they're stealing ours to compete with us without our permission, without compensation, and undermining our very business that feeds their machines in the first place," said Danielle Coffey, CEO of the News Media Alliance, which organized the lawsuit on behalf of its members. The suit also says the company has engaged in trademark infringement, suggesting that the algorithm would send articles to users with proper attribution, using the publisher's name, but the article itself would be filled with hallucinated and incorrect information. One example given in the suit involves a piece that The Guardian published about Hamas's attack on the Nova music festival in Israel, only the AI conflated the terror attack with a 2020 shooting in Nova Scotia, Canada.
Rape under wraps: how Tinder, Hinge and their corporate owner chose profits over safety
The Dating Apps Reporting Project is an 18-month investigation. It was produced in partnership with the Pulitzer Center's AI Accountability Network and the Markup, now a part of CalMatters, and co-published with the Guardian and the 19th. When a young woman in Denver met up with a smiling cardiologist she matched with on the dating app Hinge, she had no way of knowing that the company behind the app had already received reports from two other women who had accused him of rape. She met the 34-year-old doctor with green eyes and thinning hair at Highland Tap & Burger, a sports bar in a trendy neighborhood. It went well enough that she accepted an invitation to go back to his apartment. As she emerged from his bathroom, he handed her a tequila soda. What transpired over the next 24 hours, according to court testimony, reads like every person's dating app nightmare. After sipping the drink, the woman started to lose control. She fell to the ground, and the man started to film her. He put her in a headlock, kissing her forehead; she struggled to free herself but managed to grab her things and leave. He followed her out the door, holding her shoes and trying to force her back inside, but she was able to call an Uber, vomiting in the car on the way home. She woke up at home, soaking wet on her bathroom floor, the key to her house still in her door. She continued vomiting for hours.
Cracking the Code: Enhancing Development finance understanding with artificial intelligence
Analyzing development projects is crucial for understanding donors' aid strategies and recipients' priorities, and for assessing the capacity of development finance to address development issues through on-the-ground actions. In this area, the Organisation for Economic Co-operation and Development's (OECD) Creditor Reporting System (CRS) dataset is a reference data source. This dataset provides a vast collection of project narratives from various sectors (approximately 5 million projects). While the OECD CRS provides a rich source of information on development strategies, it falls short in informing project purposes because its reporting process is based on donors' self-declared main objectives and pre-defined industrial sectors. This research employs a novel approach that combines Machine Learning (ML) techniques, specifically Natural Language Processing (NLP), with an innovative Python topic modeling technique called BERTopic to categorise (cluster) and label development projects based on their narrative descriptions. By revealing existing yet hidden topics of development finance, this application of artificial intelligence enables a better understanding of donor priorities and overall development funding and provides methods to analyse public and private project narratives.
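The labelling half of this approach can be illustrated without the full pipeline. BERTopic embeds documents, clusters them, and then describes each cluster with its highest-scoring terms under a class-based TF-IDF weighting. The sketch below is a simplified, standard-library stand-in for that last step only: it assumes the clustering has already been done, and the project narratives are invented for illustration rather than drawn from the CRS dataset.

```python
import math
from collections import Counter
from typing import Dict, List


def label_clusters(clusters: Dict[int, List[str]], top_n: int = 2) -> Dict[int, List[str]]:
    """Label each cluster with its highest class-based TF-IDF terms."""
    term_counts = {c: Counter(" ".join(docs).lower().split()) for c, docs in clusters.items()}
    overall: Counter = Counter()
    for counts in term_counts.values():
        overall.update(counts)
    # Average number of words per cluster, used in the inverse-frequency term.
    avg_words = sum(overall.values()) / len(term_counts)

    labels = {}
    for c, counts in term_counts.items():
        total = sum(counts.values())
        scores = {
            t: (n / total) * math.log(1 + avg_words / overall[t]) for t, n in counts.items()
        }
        labels[c] = sorted(scores, key=scores.get, reverse=True)[:top_n]
    return labels


# Hypothetical project narratives, already grouped into two clusters.
clusters = {
    0: ["rural health clinic construction", "health worker training for rural clinic"],
    1: ["solar power grid extension", "rural solar power for schools"],
}
labels = label_clusters(clusters)
print(labels)
```

Terms that are frequent inside one cluster but rare elsewhere score highest, which is why "rural", shared by both clusters, is not chosen as a label for either.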
Thompson Sampling for Repeated Newsvendor
Zhang, Weizhou, Li, Chen, Qin, Hanzhang, Xu, Yunbei, Zhu, Ruihao
In this paper, we investigate the performance of Thompson Sampling (TS) for online learning with censored feedback, focusing primarily on the classic repeated newsvendor model--a foundational framework in inventory management--and demonstrating how our techniques can be naturally extended to a broader class of problems. We model demand using a Weibull distribution and initialize TS with a Gamma prior to dynamically adjust order quantities. Our analysis establishes optimal (up to logarithmic factors) frequentist regret bounds for TS without imposing restrictive prior assumptions. More importantly, it yields novel and highly interpretable insights on how TS addresses the exploration-exploitation trade-off in the repeated newsvendor setting. Specifically, our results show that when past order quantities are sufficiently large to overcome censoring, TS accurately estimates the unknown demand parameters, leading to near-optimal ordering decisions. Conversely, when past orders are relatively small, TS automatically increases future order quantities to gather additional demand information. Extensive numerical simulations further demonstrate that TS outperforms more conservative and widely-used approaches such as online convex optimization, upper confidence bounds, and myopic Bayesian dynamic programming. This study also lays the foundation for exploring general online learning problems with censored feedback.
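The loop described above can be sketched in a few lines. The version below assumes one convenient parameterization: demand has CDF F(x) = 1 - exp(-theta * x^k) with known shape k and unknown rate theta, for which a Gamma prior on theta stays conjugate under censoring. The paper's exact parameterization, cost structure, and priors may differ, and the critical ratio and true parameters here are made up for the example.

```python
import math
import random
from typing import Tuple


def optimal_order(theta: float, shape: float, critical_ratio: float) -> float:
    """Newsvendor quantile for demand with CDF F(x) = 1 - exp(-theta * x**shape)."""
    return (-math.log(1.0 - critical_ratio) / theta) ** (1.0 / shape)


def update_posterior(alpha: float, beta: float, sales: float, censored: bool,
                     shape: float) -> Tuple[float, float]:
    """Conjugate Gamma update; a stock-out (censored sale) adds no count to alpha."""
    return alpha + (0.0 if censored else 1.0), beta + sales ** shape


def run_ts(true_theta: float, shape: float, critical_ratio: float, periods: int,
           alpha: float = 1.0, beta: float = 1.0, seed: int = 0) -> Tuple[float, float]:
    """Run Thompson Sampling and return the final Gamma posterior parameters."""
    rng = random.Random(seed)
    for _ in range(periods):
        # Sample a plausible rate from the Gamma(alpha, rate=beta) posterior.
        theta = rng.gammavariate(alpha, 1.0 / beta)
        order = optimal_order(theta, shape, critical_ratio)
        # Draw true demand: demand**shape is exponential with rate true_theta.
        demand = (rng.expovariate(1.0) / true_theta) ** (1.0 / shape)
        sales = min(demand, order)  # only sales are observed, never lost demand
        alpha, beta = update_posterior(alpha, beta, sales, demand >= order, shape)
    return alpha, beta


alpha_end, beta_end = run_ts(true_theta=0.5, shape=2.0, critical_ratio=0.7, periods=200)
print(f"posterior mean of theta: {alpha_end / beta_end:.3f}")
```

Under this setup a censored period increases beta but not alpha, which lowers the posterior mean of theta and so pushes later sampled orders upward. That is one way to see the automatic exploration the abstract describes when past orders are small.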
Mind the Gaps: Logical English, Prolog, and Multi-agent Systems for Autonomous Vehicles
Sartor, Galileo, Wyner, Adam, Contissa, Giuseppe
In this paper, we present a modular system for representing and reasoning with legal aspects of traffic rules for autonomous vehicles. We focus on a subset of the United Kingdom's Highway Code (HC) related to junctions. As human drivers and automated vehicles (AVs) will interact on the roads, especially in urban environments, we claim that an accessible, unitary, high-level computational model should exist and be applicable to both users. Autonomous vehicles introduce a shift in liability that should not bring disadvantages or increased burden on human drivers. We develop an "in silico" implementation of the model. The proposed system is built of three main components: a natural language interface, using Logical English, which encodes the rules; an internal representation of the rules in Prolog; and a multi-agent-based simulation environment, built in NetLogo. The three components interact: Logical English is translated into and out of Prolog (along with some support code); Prolog and NetLogo interface via predicates. Such a modular approach enables the different components to carry different "burdens" in the overall system; it also allows swapping of modules. Given NetLogo, we can visualize the effect of the modeled rules as well as validate the system with a simple dynamic running scenario. Designated agents monitor the behaviour of the vehicles for compliance and record potential violations where they occur. The information on potential violations is then utilized by Validators to determine whether the violation is punishable, differentiating between exceptions and cases.
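The two-stage check described at the end of the abstract, where monitors record potential violations and Validators then decide whether each one is punishable, can be sketched as follows. Python stands in here for the authors' Prolog and NetLogo components, and the give-way rule, the emergency-vehicle exception, and all names are invented for illustration rather than taken from the Highway Code encoding.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class Event:
    vehicle: str
    entered_junction: bool
    gave_way: bool
    is_emergency_vehicle: bool = False


def potential_violations(events: List[Event]) -> List[Event]:
    """Monitor step: flag vehicles that entered a junction without giving way."""
    return [e for e in events if e.entered_junction and not e.gave_way]


def punishable(event: Event) -> bool:
    """Validator step: a hypothetical exception exempts emergency vehicles."""
    return not event.is_emergency_vehicle


# Hypothetical simulation trace with one compliant and two non-compliant vehicles.
events = [
    Event("av_1", entered_junction=True, gave_way=True),
    Event("car_2", entered_junction=True, gave_way=False),
    Event("ambulance_3", entered_junction=True, gave_way=False, is_emergency_vehicle=True),
]
flagged = potential_violations(events)
fined = [e.vehicle for e in flagged if punishable(e)]
print([e.vehicle for e in flagged], fined)
```

Keeping detection and validation separate mirrors the modular design in the abstract: the exception logic can change without touching the monitoring step.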