Addressing the alignment problem in transportation policy making: an LLM approach
Yan, Xiaoyu, Dai, Tianxing, Nie, Yu Marco
A key challenge in transportation planning is that the collective preferences of heterogeneous travelers often diverge from the policies produced by model-driven decision tools. This misalignment frequently results in implementation delays or failures. Here, we investigate whether large language models (LLMs)--noted for their capabilities in reasoning and simulating human decision-making--can help inform and address this alignment problem. We develop a multi-agent simulation in which LLMs, acting as agents representing residents from different communities in a city, participate in a referendum on a set of transit policy proposals. Using chain-of-thought reasoning, LLM agents provide ranked-choice or approval-based preferences, which are aggregated using instant-runoff voting (IRV) to model democratic consensus. We implement this simulation framework with both GPT-4o and Claude-3.5, and apply it to Chicago and Houston. Our findings suggest that LLM agents are capable of approximating plausible collective preferences and responding to local context, while also displaying model-specific behavioral biases and modest divergences from optimization-based benchmarks. These capabilities underscore both the promise and the limitations of LLMs as tools for solving the alignment problem in transportation decision-making.
Introduction
Urban transportation policy plays a central role in shaping regional development. Designing effective policy requires access to multidimensional data and a deep understanding of individual preferences across heterogeneous communities. Conventional approaches typically rely on structured mathematical models that identify an optimal policy under specified objectives and constraints. However, these models often rest on rigid assumptions and oversimplified behavioral representations. As a result, they may produce solutions that are analytically tractable yet poorly aligned with public sentiment or the complex realities of policy implementation. This misalignment frequently contributes to delays--or even failures--in policy approval and execution.
Trained on vast corpora of text encompassing news, facts, and human discourse, LLMs possess a rich contextual understanding that could potentially help policymakers infer public preferences and explore trade-offs before implementation. Their ability to interpret unstructured information, reason about competing objectives in natural language, and adapt to specific contexts suggests a new form of decision support that complements the traditional paradigm. In this study, we implement a multi-agent voting framework to examine the potential of LLMs in supporting transportation policy design.
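The aggregation step mentioned in the abstract, in which ranked ballots are combined by instant-runoff voting, can be illustrated with a minimal sketch. This is not the authors' implementation: the function name `instant_runoff`, the policy labels, and the alphabetical tie-break for eliminations are all assumptions made for illustration.

```python
from collections import Counter


def instant_runoff(ballots):
    """Return the IRV winner for ranked ballots (most preferred first).

    Assumption: ties for elimination are broken alphabetically; the
    source does not say how ties are handled.
    """
    remaining = {choice for ballot in ballots for choice in ballot}
    while remaining:
        # Count each ballot for its highest-ranked option still in the race.
        tally = Counter()
        for ballot in ballots:
            for choice in ballot:
                if choice in remaining:
                    tally[choice] += 1
                    break
        active = sum(tally.values())
        if active == 0:
            return None  # every ballot is exhausted
        leader, votes = max(tally.items(), key=lambda kv: kv[1])
        if 2 * votes > active:
            return leader  # strict majority of the active ballots
        # Otherwise eliminate the option with the fewest votes and recount.
        loser = min(sorted(remaining), key=lambda c: tally.get(c, 0))
        remaining.discard(loser)
    return None


# Hypothetical referendum: nine agents rank three made-up transit policies.
ballots = (
    [["fare_cut", "new_bus_lines", "rail_extension"]] * 4
    + [["rail_extension", "new_bus_lines", "fare_cut"]] * 3
    + [["new_bus_lines", "rail_extension", "fare_cut"]] * 2
)
winner = instant_runoff(ballots)
print(winner)
```

In this toy example no option has a first-round majority, so the option with the fewest first-choice votes is eliminated and its ballots transfer to their next ranked choice, which decides the outcome.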
Riemannian-Geometric Fingerprints of Generative Models
Recent breakthroughs and rapid integration of generative models (GMs) have sparked interest in the problem of model attribution and their fingerprints. For instance, service providers need reliable methods of authenticating their models to protect their IP, while users and law enforcement seek to verify the source of generated content for accountability and trust. In addition, the threat of model collapse is growing as more model-generated data are fed back into sources (e.g., YouTube) that are often harvested for training ("regurgitative training"), heightening the need to differentiate synthetic from human data. Yet a gap remains in understanding generative models' fingerprints, which we believe stems from the lack of a formal framework that can define, represent, and analyze the fingerprints in a principled way. To address this gap, we take a geometric approach and propose a new definition of artifact and fingerprint of GMs using Riemannian geometry, which allows us to leverage the rich theory of differential geometry. Our new definition generalizes previous work (Song et al., 2024) to non-Euclidean manifolds by learning Riemannian metrics from data and replacing the Euclidean distances and nearest-neighbor search with geodesic distances and kNN-based Riemannian center of mass. We apply our theory to a new gradient-based algorithm for computing the fingerprints in practice. Results show that it is more effective in distinguishing a large array of GMs, spanning 4 datasets at 2 resolutions (64 by 64, 256 by 256), 27 model architectures, and 2 modalities (Vision, Vision-Language). Using our proposed definition significantly improves performance on model attribution, as well as generalization to unseen datasets, model types, and modalities, suggesting its practical efficacy.
Apollo: A Posteriori Label-Only Membership Inference Attack Towards Machine Unlearning
Tang, Liou, Joshi, James, Kundu, Ashish
Machine Unlearning (MU) aims to update Machine Learning (ML) models following requests to remove training samples and their influences on a trained model efficiently without retraining the original ML model from scratch. While MU itself has been employed to provide privacy protection and regulatory compliance, it can also increase the attack surface of the model. Existing privacy inference attacks towards MU that aim to infer properties of the unlearned set rely on the weaker threat model that assumes the attacker has access to both the unlearned model and the original model, limiting their feasibility toward real-life scenarios. We propose a novel privacy attack, A Posteriori Label-Only Membership Inference Attack towards MU, Apollo, that infers whether a data sample has been unlearned, following a strict threat model where an adversary has access to the label-output of the unlearned model only. We demonstrate that our proposed attack, while requiring less access to the target model compared to previous attacks, can achieve relatively high precision on the membership status of the unlearned samples.
The Hawthorne Effect in Reasoning Models: Evaluating and Steering Test Awareness
Abdelnabi, Sahar, Salem, Ahmed
Reasoning-focused LLMs sometimes alter their behavior when they detect that they are being evaluated, which can lead them to optimize for test-passing performance or to comply more readily with harmful prompts if real-world consequences appear absent. We present the first quantitative study of how such "test awareness" impacts model behavior, particularly its performance on safety-related tasks. We introduce a white-box probing framework that (i) linearly identifies awareness-related activations and (ii) steers models toward or away from test awareness while monitoring downstream performance. We apply our method to different state-of-the-art open-weight reasoning LLMs across both realistic and hypothetical tasks (denoting tests or simulations). Our results demonstrate that test awareness significantly impacts safety alignment (such as compliance with harmful requests and conforming to stereotypes) with effects varying in both magnitude and direction across models. By providing control over this latent effect, our work aims to provide a stress-test mechanism and increase trust in how we perform safety evaluations.
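The idea of linearly identifying a direction in activation space and steering along it can be illustrated with a toy sketch. It uses a difference-of-means direction as a simple stand-in for a trained linear probe, which is an assumption and not the method described in the paper; the function names, the two-dimensional vectors, and the scaling factor `alpha` are all hypothetical.

```python
def mean_vector(rows):
    """Element-wise mean of a list of equal-length vectors."""
    n = len(rows)
    return [sum(col) / n for col in zip(*rows)]


def probe_direction(aware, unaware):
    """Unit vector pointing from the mean 'unaware' activation to the mean
    'aware' activation (difference of means, a stand-in for a learned probe)."""
    diff = [a - u for a, u in zip(mean_vector(aware), mean_vector(unaware))]
    norm = sum(d * d for d in diff) ** 0.5
    return [d / norm for d in diff]


def awareness_score(hidden, direction):
    """Projection of a hidden state onto the direction."""
    return sum(h * d for h, d in zip(hidden, direction))


def steer(hidden, direction, alpha):
    """Shift a hidden state along the direction; alpha > 0 steers toward
    'aware', alpha < 0 steers away from it."""
    return [h + alpha * d for h, d in zip(hidden, direction)]


# Toy activations (made up): the first coordinate separates the two groups.
aware = [[2.0, 0.0], [4.0, 0.0]]
unaware = [[0.0, 0.0], [0.0, 0.0]]
direction = probe_direction(aware, unaware)

hidden = [1.0, 1.0]
toward = steer(hidden, direction, alpha=2.0)
away = steer(hidden, direction, alpha=-2.0)
print(awareness_score(toward, direction), awareness_score(away, direction))
```

In a real model the vectors would be hidden states taken from a chosen layer, and the effect of steering would be judged by downstream behavior rather than by the projection score alone.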
Understanding Fairness and Prediction Error through Subspace Decomposition and Influence Analysis
Shi, Enze, Bhagwat, Pankaj, Yang, Zhixian, Kong, Linglong, Jiang, Bei
Machine learning models have achieved widespread success but often inherit and amplify historical biases, resulting in unfair outcomes. Traditional fairness methods typically impose constraints at the prediction level, without addressing underlying biases in data representations. In this work, we propose a principled framework that adjusts data representations to balance predictive utility and fairness. Using sufficient dimension reduction, we decompose the feature space into target-relevant, sensitive, and shared components, and control the fairness-utility trade-off by selectively removing sensitive information. We provide a theoretical analysis of how prediction error and fairness gaps evolve as shared subspaces are added, and employ influence functions to quantify their effects on the asymptotic behavior of parameter estimates. Experiments on both synthetic and real-world datasets validate our theoretical insights and show that the proposed method effectively improves fairness while preserving predictive performance.
Assessing the robustness of heterogeneous treatment effects in survival analysis under informative censoring
Wang, Yuxin, Frauen, Dennis, Schweisthal, Jonas, Schröder, Maresa, Feuerriegel, Stefan
Dropout is common in clinical studies, with up to half of patients leaving early due to side effects or other reasons. When dropout is informative (i.e., dependent on survival time), it introduces censoring bias, which in turn biases treatment effect estimates. In this paper, we propose an assumption-lean framework to assess the robustness of conditional average treatment effect (CATE) estimates in survival analysis when facing censoring bias. Unlike existing works that rely on strong assumptions, such as non-informative censoring, to obtain point estimates, we use partial identification to derive informative bounds on the CATE. Thereby, our framework helps to identify patient subgroups where treatment is effective despite informative censoring. We further develop a novel meta-learner that estimates the bounds using arbitrary machine learning models and has favorable theoretical properties, including double robustness and quasi-oracle efficiency. We demonstrate the practical value of our meta-learner through numerical experiments and in an application to a cancer drug trial. Together, our framework offers a practical tool for assessing the robustness of estimated treatment effects in the presence of censoring and thus promotes the reliable use of survival data for evidence generation in medicine and epidemiology.
What Elon Musk's Version of Wikipedia Thinks About Hitler, Putin, and Apartheid
What does Elon Musk want the world to know about "white genocide theory"? Because he's been vocal about the issue in the past--advancing the idea, for example, that Jews are pushing "hatred against whites"--I decided to search for the term on Grokipedia, the competitor to Wikipedia that Musk launched yesterday. First, the site uses just that term,, rather than, as you would see on Wikipedia and elsewhere. Just a few sentences in, Grokipedia provides the "empirical underpinnings" of this supposed campaign to eliminate white people of European descent around the world. And the site argues that conversation about this purported genocide is systematically suppressed by the media and academia, which are "prone to ideological biases favoring multiculturalism" and "relegate the theory to fringe conspiracy status despite the observable data on population trajectories."
OpenAI Completes Major Reorganization With $135 Billion Microsoft Stake
An illustration photo shows the OpenAI logo displayed on a smartphone with the Microsoft logo in the background in Chongqing, China on Aug. 27, 2025. OpenAI has completed a restructuring, dividing itself into a nonprofit and for-profit entity, the company announced on Tuesday. The nonprofit arm, now called the OpenAI Foundation, will have a $130 billion stake in the for-profit enterprise, a public benefit corporation called OpenAI Group PBC. "The OpenAI Foundation and OpenAI Group will work in concert to advance solutions to hard problems and opportunities posed by AI progress," the company said in its blog post announcing the restructuring. "This includes making intelligence a tool that everyone can benefit from, building safe and aligned systems, turbocharging scientific discovery, and strengthening global cooperation and resilience."
A New Bill Would Prohibit Minors from Using AI Chatbots
Pillay is an editorial fellow at TIME. If you or someone you know may be experiencing a mental-health crisis or contemplating suicide, call or text 988. In emergencies, call 911, or seek care from a local hospital or mental health provider. A new bill introduced in Congress today would require anyone who owns, operates, or otherwise enables access to AI chatbots in the United States to verify the age of their users--and, if users are found to be minors, to prohibit them from using AI companions.