AITopics

As artificial intelligence (AI) technologies begin to permeate diverse fields--from healthcare to education--consumers, researchers and policymakers are increasingly raising concerns about whether and how AI is regulated. It is therefore reasonable to anticipate that alignment with principles of'ethical' or'responsible' AI, as well as compliance with law and policy, will form an increasingly important part of AI development. Yet, for the most part, the conventional computer science curriculum is ill-equipped to prepare students for these challenges. To this end, we seek to explore how new educational content related to AI ethics and AI policy can be integrated into both ethics-and technical-focused courses. This paper describes a two-lecture AI policy module that was piloted in a graduate-level introductory machine learning course in 2024. The module, which includes an in-class active learning game, is evaluated using data from student surveys before and after the lectures, and pedagogical motivations and considerations are discussed. We find that the module is successful in engaging otherwise technically-oriented students on the topic of AI policy, increasing student awareness of the social impacts of a variety of AI technologies and developing student interest in the field of AI regulation. Introduction The explosive growth of artificial intelligence (AI) technologies is widely documented and increasingly evident in everyday life: some responses from the search engine Google now include an "AI Overview" inserted before the first webpage link; companies like Tesla and Waymo have seen success in implementing partial or full autonomous driving in vehicles on live roads; and "Apple Intelligence" was the flagship feature for the launch of Apple's new smartphone in fall 2024. Yet what legal or policy response this technological growth will precipitate is less certain [1, 2]. Nevertheless, it should be expected that the development and enactment of regulatory frameworks for AI will demand AI engineers with a command not only of the technical intricacies of AI models, but also of the policy and regulatory landscape for AI development and compliance [3].

artificial intelligence, machine learning, student, (17 more...)

2502.07931

Country:

North America > United States > New York > New York County > New York City (0.04)
North America > United States > Virginia (0.04)
Asia > China (0.04)
(7 more...)

Genre:

Research Report (1.00)
Questionnaire & Opinion Survey (1.00)
Instructional Material > Course Syllabus & Notes (1.00)

Industry:

Law > Statutes (1.00)
Information Technology > Security & Privacy (1.00)
Government > Regional Government > North America Government > United States Government (1.00)
(2 more...)

Technology:

Information Technology > Artificial Intelligence > Machine Learning (1.00)
Information Technology > Artificial Intelligence > Issues > Social & Ethical Issues (1.00)

Tractable Transformers for Flexible Conditional Generation

Liu, Anji, Liu, Xuejie, Zhao, Dayuan, Niepert, Mathias, Liang, Yitao, Broeck, Guy Van den

Non-autoregressive (NAR) generative models are valuable because they can handle diverse conditional generation tasks in a more principled way than their autoregressive (AR) counterparts, which are constrained by sequential dependency requirements. Recent advancements in NAR models, such as diffusion language models, have demonstrated superior performance in unconditional generation compared to AR models (e.g., GPTs) of similar sizes. However, such improvements do not always lead to improved conditional generation performance. We show that a key reason for this gap is the difficulty in generalizing to conditional probability queries unseen during training. As a result, strong unconditional generation performance does not guarantee high-quality conditional generation. This paper proposes Tractable Transformers (Tracformer), a Transformer-based generative model that is more robust to different conditional generation tasks. Unlike existing models that rely solely on global contextual features derived from full inputs, Tracformers incorporate a sparse Transformer encoder to capture both local and global contextual information. This information is routed through a decoder for conditional generation. Empirical results demonstrate that Tracformers achieve state-of-the-art conditional generation performance on text modeling compared to recent diffusion and AR model baselines.

large language model, machine learning, tracformer, (20 more...)

2502.07616

Country:

North America > United States > California > Los Angeles County > Los Angeles (0.14)
North America > United States > Massachusetts (0.04)
Africa > Zimbabwe (0.04)
(9 more...)

Genre: Research Report > New Finding (0.47)

Industry:

Law (0.92)
Health & Medicine > Pharmaceuticals & Biotechnology (0.67)
Government > Military (0.67)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning (1.00)
Information Technology > Artificial Intelligence > Natural Language > Large Language Model (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (1.00)
(2 more...)

LASP-2: Rethinking Sequence Parallelism for Linear Attention and Its Hybrid

Sun, Weigao, Lan, Disen, Zhong, Yiran, Qu, Xiaoye, Cheng, Yu

Linear sequence modeling approaches, such as linear attention, provide advantages like linear-time training and constant-memory inference over sequence lengths. However, existing sequence parallelism (SP) methods are either not optimized for the right-product-first feature of linear attention or use a ring-style communication strategy, which results in lower computation parallelism, limits their scalability for longer sequences in distributed systems. In this paper, we introduce LASP-2, a new SP method to enhance both communication and computation parallelism when training linear attention transformer models with very-long input sequences. Compared to previous work LASP, LASP-2 rethinks the minimal communication requirement for SP on linear attention layers, reorganizes the whole communication-computation workflow of LASP. In this way, only one single AllGather collective communication is needed on intermediate memory states, whose sizes are independent of the sequence length, leading to significant improvements of both communication and computation parallelism, as well as their overlap. Additionally, we extend LASP-2 to LASP-2H by applying similar communication redesign to standard attention modules, offering an efficient SP solution for hybrid models that blend linear and standard attention layers. Our evaluation on a Linear-Llama3 model, a variant of Llama3 with linear attention replacing standard attention, demonstrates the effectiveness of LASP-2 and LASP-2H. Specifically, LASP-2 achieves training speed improvements of 15.2% over LASP and 36.6% over Ring Attention, with a sequence length of 2048K across 64 GPUs. The Code is released as a part of: https://github.com/OpenSparseLLMs/Linear-MoE.

large language model, machine learning, natural language, (18 more...)

2502.07563

Country:

Europe > Italy > Calabria > Catanzaro Province > Catanzaro (0.04)
Asia > China > Shanghai > Shanghai (0.04)
Asia > Singapore (0.04)
(2 more...)

Genre: Research Report > New Finding (0.67)

Industry: Law (0.54)

Technology:

Information Technology > Artificial Intelligence > Natural Language > Large Language Model (0.88)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.88)

PolicySimEval: A Benchmark for Evaluating Policy Outcomes through Agent-Based Simulation

Kang, Jiaju, Han, Puyu, Zhang, Tian, Gong, Luqi

With the growing adoption of agent-based models in policy evaluation, a pressing question arises: Can such systems effectively simulate and analyze complex social scenarios to inform policy decisions? Addressing this challenge could significantly enhance the policy-making process, offering researchers and practitioners a systematic way to validate, explore, and refine policy outcomes. To advance this goal, we introduce PolicySimEval, the first benchmark designed to evaluate the capability of agent-based simulations in policy assessment tasks. PolicySimEval aims to reflect the real-world complexities faced by social scientists and policymakers. The benchmark is composed of three categories of evaluation tasks: (1) 20 comprehensive scenarios that replicate end-to-end policy modeling challenges, complete with annotated expert solutions; (2) 65 targeted sub-tasks that address specific aspects of agent-based simulation (e.g., agent behavior calibration); and (3) 200 auto-generated tasks to enable large-scale evaluation and method development. Experiments show that current state-of-the-art frameworks struggle to tackle these tasks effectively, with the highest-performing system achieving only 24.5\% coverage rate on comprehensive scenarios, 15.04\% on sub-tasks, and 14.5\% on auto-generated tasks. These results highlight the difficulty of the task and the gap between current capabilities and the requirements for real-world policy evaluation.

artificial intelligence, llama 3, machine learning, (18 more...)

2502.07853

Country:

North America > United States > District of Columbia > Washington (0.04)
Asia > China > Beijing > Beijing (0.04)
North America > United States > New York > Monroe County > Rochester (0.04)
Asia > Thailand (0.04)

Genre: Research Report (1.00)

Industry:

Health & Medicine (1.00)
Law (0.68)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning > Agents (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.37)

Intelligent Legal Assistant: An Interactive Clarification System for Legal Question Answering

Yao, Rujing, Wu, Yiquan, Zhang, Tong, Zhang, Xuhui, Huang, Yuting, Wu, Yang, Yang, Jiayin, Sun, Changlong, Wang, Fang, Liu, Xiaozhong

The rise of large language models has opened new avenues for users seeking legal advice. However, users often lack professional legal knowledge, which can lead to questions that omit critical information. This deficiency makes it challenging for traditional legal question-answering systems to accurately identify users' actual needs, often resulting in imprecise or generalized advice. In this work, we develop a legal question-answering system called Intelligent Legal Assistant, which interacts with users to precisely capture their needs. When a user poses a question, the system requests that the user select their geographical location to pinpoint the applicable laws. It then generates clarifying questions and options based on the key information missing from the user's initial question. This allows the user to select and provide the necessary details. Once all necessary information is provided, the system produces an in-depth legal analysis encompassing three aspects: overall conclusion, jurisprudential analysis, and resolution suggestions.

information, natural language, question answering, (17 more...)

2502.07904

Country:

Oceania > Australia > New South Wales > Sydney (0.06)
Asia > China > Zhejiang Province > Hangzhou (0.06)
Asia > China > Tianjin Province > Tianjin (0.05)
North America > United States > New York > New York County > New York City (0.04)

Genre: Research Report (0.64)

Industry: Law (1.00)

Technology: Information Technology > Artificial Intelligence > Natural Language > Question Answering (1.00)

Elevating Legal LLM Responses: Harnessing Trainable Logical Structures and Semantic Knowledge with Legal Reasoning

Yao, Rujing, Wu, Yang, Wang, Chenghao, Xiong, Jingwei, Wang, Fang, Liu, Xiaozhong

Large Language Models (LLMs) have achieved impressive results across numerous domains, yet they experience notable deficiencies in legal question-answering tasks. LLMs often generate generalized responses that lack the logical specificity required for expert legal advice and are prone to hallucination, providing answers that appear correct but are unreliable. Retrieval-Augmented Generation (RAG) techniques offer partial solutions to address this challenge, but existing approaches typically focus only on semantic similarity, neglecting the logical structure essential to legal reasoning. In this paper, we propose the Logical-Semantic Integration Model (LSIM), a novel supervised framework that bridges semantic and logical coherence. LSIM comprises three components: reinforcement learning predicts a structured fact-rule chain for each question, a trainable Deep Structured Semantic Model (DSSM) retrieves the most relevant candidate questions by integrating semantic and logical features, and in-context learning generates the final answer using the retrieved content. Our experiments on a real-world legal QA dataset-validated through both automated metrics and human evaluation-demonstrate that LSIM significantly enhances accuracy and reliability compared to existing methods.

large language model, machine learning, natural language, (19 more...)

2502.07912

Country:

North America > United States > California > Yolo County > Davis (0.04)
Asia > China > Zhejiang Province > Hangzhou (0.04)
Asia > China > Tianjin Province > Tianjin (0.04)

Genre: Research Report > New Finding (0.67)

Industry:

Law > Criminal Law (1.00)
Law Enforcement & Public Safety > Crime Prevention & Enforcement (1.00)

Technology:

Information Technology > Artificial Intelligence > Natural Language > Text Processing (1.00)
Information Technology > Artificial Intelligence > Natural Language > Large Language Model (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.69)

Ma, Sibo, Salinas, Alejandro, Henderson, Peter, Nyarko, Julian

Breaking Down Bias: On The Limits of Generalizable Pruning Strategies

We employ model pruning to examine how LLMs conceptualize racial biases, and whether a generalizable mitigation strategy for such biases appears feasible. Our analysis yields several novel insights. We find that pruning can be an effective method to reduce bias without significantly increasing anomalous model behavior. Neuron-based pruning strategies generally yield better results than approaches pruning entire attention heads. However, our results also show that the effectiveness of either approach quickly deteriorates as pruning strategies become more generalized. For instance, a model that is trained on removing racial biases in the context of financial decision-making poorly generalizes to biases in commercial transactions. Overall, our analysis suggests that racial biases are only partially represented as a general concept within language models. The other part of these biases is highly context-specific, suggesting that generalizable mitigation strategies may be of limited effectiveness. Our findings have important implications for legal frameworks surrounding AI. In particular, they suggest that an effective mitigation strategy should include the allocation of legal responsibility on those that deploy models in a specific use case.

large language model, machine learning, variation, (18 more...)

2502.07771

Country:

North America > United States (0.68)
Europe > United Kingdom > England > Cambridgeshire > Cambridge (0.04)
Asia > China (0.04)
Africa (0.04)

Genre: Research Report > New Finding (1.00)

Industry:

Law (1.00)
Health & Medicine (1.00)
Government > Regional Government (0.93)

Technology:

Information Technology > Artificial Intelligence > Natural Language > Large Language Model (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (1.00)

Xu, Shizhou, Strohmer, Thomas

Machine Unlearning via Information Theoretic Regularization

arXiv.org Machine LearningFeb-11-2025

How can we effectively remove or "unlearn" undesirable information, such as specific features or individual data points, from a learning outcome while minimizing utility loss and ensuring rigorous guarantees? We introduce a mathematical framework based on information-theoretic regularization to address both feature and data point unlearning. For feature unlearning, we derive a unified solution that simultaneously optimizes diverse learning objectives, including entropy, conditional entropy, KL-divergence, and the energy of conditional probability. For data point unlearning, we first propose a novel definition that serves as a practical condition for unlearning via retraining, is easy to verify, and aligns with the principles of differential privacy from an inference perspective. Then, we provide provable guarantees for our framework on data point unlearning. By combining flexibility in learning objectives with simplicity in regularization design, our approach is highly adaptable and practical for a wide range of machine learning and AI applications.

artificial intelligence, machine learning, optimization problem, (16 more...)

arXiv.org Machine Learning

2502.05684

Country:

North America > United States > California > Yolo County > Davis (0.14)
Europe (0.14)
North America > United States > New York (0.04)

Genre: Research Report (1.00)

Industry:

Information Technology > Security & Privacy (1.00)
Law (0.67)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Neural Networks (0.68)
Information Technology > Artificial Intelligence > Representation & Reasoning > Optimization (0.47)

Overview of the Amphion Toolkit (v0.2)

Li, Jiaqi, Zhang, Xueyao, Wang, Yuancheng, He, Haorui, Wang, Chaoren, Wang, Li, Liao, Huan, Ao, Junyi, Xie, Zeyu, Huang, Yiqiao, Zhang, Junan, Wu, Zhizheng

Amphion is an open-source toolkit for Audio, Music, and Speech Generation, designed to lower the entry barrier for junior researchers and engineers in these fields. It provides a versatile framework that supports a variety of generation tasks and models. In this report, we introduce Amphion v0.2, the second major release developed in 2024. This release features a 100K-hour open-source multilingual dataset, a robust data preparation pipeline, and novel models for tasks such as text-to-speech, audio coding, and voice conversion. Furthermore, the report includes multiple tutorials that guide users through the functionalities and usage of the newly released models.

large language model, machine learning, natural language, (21 more...)

2501.15442

Country:

Europe > Italy > Calabria > Catanzaro Province > Catanzaro (0.04)
North America > United States > Pennsylvania > Philadelphia County > Philadelphia (0.04)
North America > United States > New York > New York County > New York City (0.04)
(6 more...)

Genre: Research Report > Promising Solution (0.33)

Industry:

Media (1.00)
Law > Government & the Courts (0.67)
Leisure & Entertainment (0.67)

Technology:

Information Technology > Data Science (1.00)
Information Technology > Artificial Intelligence > Vision (1.00)
Information Technology > Artificial Intelligence > Speech > Speech Recognition (1.00)
(4 more...)

EngadgetFeb-10-2025, 19:44:45 GMT

Roblox, Discord, OpenAI and Google found new child safety group

Roblox, Discord, OpenAI and Google are launching a nonprofit organization called ROOST, or Robust Open Online Safety Tools, which hopes "to build scalable, interoperable safety infrastructure suited for the AI era." The organization plans on providing free, open-source safety tools to public and private organizations to use on their own platforms, with a special focus on child safety to start. The press release announcing ROOST specifically calls out plans to offer "tools to detect, review, and report child sexual abuse material (CSAM)." Partner companies are providing funding for these tools, and the technical expertise to build them, too. The operating theory of ROOST is that access to generative AI is rapidly changing the online landscape, making the need for "reliable and accessible safety infrastructure" all the more urgent.

large language model, machine learning, natural language, (12 more...)

Engadget

Industry:

Law Enforcement & Public Safety > Crime Prevention & Enforcement (0.60)
Law (0.60)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning > Generative AI (0.99)
Information Technology > Artificial Intelligence > Natural Language > Large Language Model (0.66)
Information Technology > Artificial Intelligence > Natural Language > Chatbot (0.66)