Law
Educating a Responsible AI Workforce: Piloting a Curricular Module on AI Policy in a Graduate Machine Learning Course
Weichert, James, Eldardiry, Hoda
As artificial intelligence (AI) technologies begin to permeate diverse fields--from healthcare to education--consumers, researchers and policymakers are increasingly raising concerns about whether and how AI is regulated. It is therefore reasonable to anticipate that alignment with principles of'ethical' or'responsible' AI, as well as compliance with law and policy, will form an increasingly important part of AI development. Yet, for the most part, the conventional computer science curriculum is ill-equipped to prepare students for these challenges. To this end, we seek to explore how new educational content related to AI ethics and AI policy can be integrated into both ethics-and technical-focused courses. This paper describes a two-lecture AI policy module that was piloted in a graduate-level introductory machine learning course in 2024. The module, which includes an in-class active learning game, is evaluated using data from student surveys before and after the lectures, and pedagogical motivations and considerations are discussed. We find that the module is successful in engaging otherwise technically-oriented students on the topic of AI policy, increasing student awareness of the social impacts of a variety of AI technologies and developing student interest in the field of AI regulation. Introduction The explosive growth of artificial intelligence (AI) technologies is widely documented and increasingly evident in everyday life: some responses from the search engine Google now include an "AI Overview" inserted before the first webpage link; companies like Tesla and Waymo have seen success in implementing partial or full autonomous driving in vehicles on live roads; and "Apple Intelligence" was the flagship feature for the launch of Apple's new smartphone in fall 2024. Yet what legal or policy response this technological growth will precipitate is less certain [1, 2]. Nevertheless, it should be expected that the development and enactment of regulatory frameworks for AI will demand AI engineers with a command not only of the technical intricacies of AI models, but also of the policy and regulatory landscape for AI development and compliance [3].
Tractable Transformers for Flexible Conditional Generation
Liu, Anji, Liu, Xuejie, Zhao, Dayuan, Niepert, Mathias, Liang, Yitao, Broeck, Guy Van den
Non-autoregressive (NAR) generative models are valuable because they can handle diverse conditional generation tasks in a more principled way than their autoregressive (AR) counterparts, which are constrained by sequential dependency requirements. Recent advancements in NAR models, such as diffusion language models, have demonstrated superior performance in unconditional generation compared to AR models (e.g., GPTs) of similar sizes. However, such improvements do not always lead to improved conditional generation performance. We show that a key reason for this gap is the difficulty in generalizing to conditional probability queries unseen during training. As a result, strong unconditional generation performance does not guarantee high-quality conditional generation. This paper proposes Tractable Transformers (Tracformer), a Transformer-based generative model that is more robust to different conditional generation tasks. Unlike existing models that rely solely on global contextual features derived from full inputs, Tracformers incorporate a sparse Transformer encoder to capture both local and global contextual information. This information is routed through a decoder for conditional generation. Empirical results demonstrate that Tracformers achieve state-of-the-art conditional generation performance on text modeling compared to recent diffusion and AR model baselines.
LASP-2: Rethinking Sequence Parallelism for Linear Attention and Its Hybrid
Sun, Weigao, Lan, Disen, Zhong, Yiran, Qu, Xiaoye, Cheng, Yu
Linear sequence modeling approaches, such as linear attention, provide advantages like linear-time training and constant-memory inference over sequence lengths. However, existing sequence parallelism (SP) methods are either not optimized for the right-product-first feature of linear attention or use a ring-style communication strategy, which results in lower computation parallelism, limits their scalability for longer sequences in distributed systems. In this paper, we introduce LASP-2, a new SP method to enhance both communication and computation parallelism when training linear attention transformer models with very-long input sequences. Compared to previous work LASP, LASP-2 rethinks the minimal communication requirement for SP on linear attention layers, reorganizes the whole communication-computation workflow of LASP. In this way, only one single AllGather collective communication is needed on intermediate memory states, whose sizes are independent of the sequence length, leading to significant improvements of both communication and computation parallelism, as well as their overlap. Additionally, we extend LASP-2 to LASP-2H by applying similar communication redesign to standard attention modules, offering an efficient SP solution for hybrid models that blend linear and standard attention layers. Our evaluation on a Linear-Llama3 model, a variant of Llama3 with linear attention replacing standard attention, demonstrates the effectiveness of LASP-2 and LASP-2H. Specifically, LASP-2 achieves training speed improvements of 15.2% over LASP and 36.6% over Ring Attention, with a sequence length of 2048K across 64 GPUs. The Code is released as a part of: https://github.com/OpenSparseLLMs/Linear-MoE.
PolicySimEval: A Benchmark for Evaluating Policy Outcomes through Agent-Based Simulation
Kang, Jiaju, Han, Puyu, Zhang, Tian, Gong, Luqi
With the growing adoption of agent-based models in policy evaluation, a pressing question arises: Can such systems effectively simulate and analyze complex social scenarios to inform policy decisions? Addressing this challenge could significantly enhance the policy-making process, offering researchers and practitioners a systematic way to validate, explore, and refine policy outcomes. To advance this goal, we introduce PolicySimEval, the first benchmark designed to evaluate the capability of agent-based simulations in policy assessment tasks. PolicySimEval aims to reflect the real-world complexities faced by social scientists and policymakers. The benchmark is composed of three categories of evaluation tasks: (1) 20 comprehensive scenarios that replicate end-to-end policy modeling challenges, complete with annotated expert solutions; (2) 65 targeted sub-tasks that address specific aspects of agent-based simulation (e.g., agent behavior calibration); and (3) 200 auto-generated tasks to enable large-scale evaluation and method development. Experiments show that current state-of-the-art frameworks struggle to tackle these tasks effectively, with the highest-performing system achieving only 24.5\% coverage rate on comprehensive scenarios, 15.04\% on sub-tasks, and 14.5\% on auto-generated tasks. These results highlight the difficulty of the task and the gap between current capabilities and the requirements for real-world policy evaluation.
Intelligent Legal Assistant: An Interactive Clarification System for Legal Question Answering
Yao, Rujing, Wu, Yiquan, Zhang, Tong, Zhang, Xuhui, Huang, Yuting, Wu, Yang, Yang, Jiayin, Sun, Changlong, Wang, Fang, Liu, Xiaozhong
The rise of large language models has opened new avenues for users seeking legal advice. However, users often lack professional legal knowledge, which can lead to questions that omit critical information. This deficiency makes it challenging for traditional legal question-answering systems to accurately identify users' actual needs, often resulting in imprecise or generalized advice. In this work, we develop a legal question-answering system called Intelligent Legal Assistant, which interacts with users to precisely capture their needs. When a user poses a question, the system requests that the user select their geographical location to pinpoint the applicable laws. It then generates clarifying questions and options based on the key information missing from the user's initial question. This allows the user to select and provide the necessary details. Once all necessary information is provided, the system produces an in-depth legal analysis encompassing three aspects: overall conclusion, jurisprudential analysis, and resolution suggestions.
Elevating Legal LLM Responses: Harnessing Trainable Logical Structures and Semantic Knowledge with Legal Reasoning
Yao, Rujing, Wu, Yang, Wang, Chenghao, Xiong, Jingwei, Wang, Fang, Liu, Xiaozhong
Large Language Models (LLMs) have achieved impressive results across numerous domains, yet they experience notable deficiencies in legal question-answering tasks. LLMs often generate generalized responses that lack the logical specificity required for expert legal advice and are prone to hallucination, providing answers that appear correct but are unreliable. Retrieval-Augmented Generation (RAG) techniques offer partial solutions to address this challenge, but existing approaches typically focus only on semantic similarity, neglecting the logical structure essential to legal reasoning. In this paper, we propose the Logical-Semantic Integration Model (LSIM), a novel supervised framework that bridges semantic and logical coherence. LSIM comprises three components: reinforcement learning predicts a structured fact-rule chain for each question, a trainable Deep Structured Semantic Model (DSSM) retrieves the most relevant candidate questions by integrating semantic and logical features, and in-context learning generates the final answer using the retrieved content. Our experiments on a real-world legal QA dataset-validated through both automated metrics and human evaluation-demonstrate that LSIM significantly enhances accuracy and reliability compared to existing methods.
Breaking Down Bias: On The Limits of Generalizable Pruning Strategies
Ma, Sibo, Salinas, Alejandro, Henderson, Peter, Nyarko, Julian
We employ model pruning to examine how LLMs conceptualize racial biases, and whether a generalizable mitigation strategy for such biases appears feasible. Our analysis yields several novel insights. We find that pruning can be an effective method to reduce bias without significantly increasing anomalous model behavior. Neuron-based pruning strategies generally yield better results than approaches pruning entire attention heads. However, our results also show that the effectiveness of either approach quickly deteriorates as pruning strategies become more generalized. For instance, a model that is trained on removing racial biases in the context of financial decision-making poorly generalizes to biases in commercial transactions. Overall, our analysis suggests that racial biases are only partially represented as a general concept within language models. The other part of these biases is highly context-specific, suggesting that generalizable mitigation strategies may be of limited effectiveness. Our findings have important implications for legal frameworks surrounding AI. In particular, they suggest that an effective mitigation strategy should include the allocation of legal responsibility on those that deploy models in a specific use case.
Machine Unlearning via Information Theoretic Regularization
How can we effectively remove or "unlearn" undesirable information, such as specific features or individual data points, from a learning outcome while minimizing utility loss and ensuring rigorous guarantees? We introduce a mathematical framework based on information-theoretic regularization to address both feature and data point unlearning. For feature unlearning, we derive a unified solution that simultaneously optimizes diverse learning objectives, including entropy, conditional entropy, KL-divergence, and the energy of conditional probability. For data point unlearning, we first propose a novel definition that serves as a practical condition for unlearning via retraining, is easy to verify, and aligns with the principles of differential privacy from an inference perspective. Then, we provide provable guarantees for our framework on data point unlearning. By combining flexibility in learning objectives with simplicity in regularization design, our approach is highly adaptable and practical for a wide range of machine learning and AI applications.
Overview of the Amphion Toolkit (v0.2)
Li, Jiaqi, Zhang, Xueyao, Wang, Yuancheng, He, Haorui, Wang, Chaoren, Wang, Li, Liao, Huan, Ao, Junyi, Xie, Zeyu, Huang, Yiqiao, Zhang, Junan, Wu, Zhizheng
Amphion is an open-source toolkit for Audio, Music, and Speech Generation, designed to lower the entry barrier for junior researchers and engineers in these fields. It provides a versatile framework that supports a variety of generation tasks and models. In this report, we introduce Amphion v0.2, the second major release developed in 2024. This release features a 100K-hour open-source multilingual dataset, a robust data preparation pipeline, and novel models for tasks such as text-to-speech, audio coding, and voice conversion. Furthermore, the report includes multiple tutorials that guide users through the functionalities and usage of the newly released models.
Roblox, Discord, OpenAI and Google found new child safety group
Roblox, Discord, OpenAI and Google are launching a nonprofit organization called ROOST, or Robust Open Online Safety Tools, which hopes "to build scalable, interoperable safety infrastructure suited for the AI era." The organization plans on providing free, open-source safety tools to public and private organizations to use on their own platforms, with a special focus on child safety to start. The press release announcing ROOST specifically calls out plans to offer "tools to detect, review, and report child sexual abuse material (CSAM)." Partner companies are providing funding for these tools, and the technical expertise to build them, too. The operating theory of ROOST is that access to generative AI is rapidly changing the online landscape, making the need for "reliable and accessible safety infrastructure" all the more urgent.