Law
Deciding how to respond: A deliberative framework to guide policymaker responses to AI systems
The discourse on responsible artificial intelligence (AI) regulation is understandably dominated by risk-focused assessments and analyses. This approach reflects the fundamental uncertainty policymakers face when determining appropriate responses to current, emerging and novel AI systems. In this article, we argue that by operationalising the concept of freedom - the philosophical counterpart to responsibility - a complementary approach centred on the potential societal benefits of AI systems can be developed. The result is a discursive framework grounded in freedom as capability and freedom as opportunity, which represent the two main intellectual traditions of interpreting freedom. We contend that the complexity, ambiguity and contestation involved in regulating AI systems make a deliberative paradigm more useful than the conventional technical one. The resulting framework is structured around coordinative, communicative and decision spaces, each with sequential focal points and associated outputs.
Defining, Understanding, and Detecting Online Toxicity: Challenges and Machine Learning Approaches
Shahi, Gautam Kishore, Majchrzak, Tim A.
Online toxic content has grown into a pervasive phenomenon, intensifying during times of crisis, elections, and social unrest. A significant amount of research has been focused on detecting or analyzing toxic content using machine-learning approaches. The proliferation of toxic content across digital platforms has spurred extensive research into automated detection mechanisms, primarily driven by advances in machine learning and natural language processing. Overall, the present study represents the synthesis of 140 publications on different types of toxic content on digital platforms. We present a comprehensive overview of the datasets used in previous studies focusing on definitions, data sources, challenges, and machine learning approaches employed in detecting online toxicity, such as hate speech, offensive language, and harmful discourse. The dataset encompasses content in 32 languages, covering topics such as elections, spontaneous events, and crises. We examine the possibility of using existing cross-platform data to improve the performance of classification models. We present the recommendations and guidelines for new research on online toxic content and the use of content moderation for mitigation. Finally, we present some practical guidelines to mitigate toxic content from online platforms.
OpenAI Acknowledges the Teen Problem
OpenAI CEO Sam Altman promises that parental controls and age verification are coming to ChatGPT--though the announcement is scant on specifics. On Tuesday afternoon, three parents sat in a row before the Senate Judiciary Subcommittee on Crime and Counterterrorism. Two of them had each recently lost a child to suicide; the third has a teenage son who, after cutting his arm in front of her and biting her, is undergoing residential treatment. All three blame generative AI for what has happened to their children.
Brendan Carr Isn't Going to Stop Until Someone Makes Him
In the wake of Jimmy Kimmel's suspension, experts say the FCC commissioner's conduct is flatly unconstitutional. They also expect him to keep going. In what has become an all-too-regular display from Brendan Carr, the Federal Communications Commission chairman used a podcast appearance Wednesday to flex his regulatory power. In this instance, he threatened action against broadcasters that refused to punish Jimmy Kimmel for remarks he made on his ABC show Monday night.
Anti-Trump Protesters Take Aim at 'Naive' US-UK AI Deal
Thousands marched in London to protest President Donald Trump's second state visit. Among them were many environmental activists unhappy with Britain's new AI deal with the US. They played extremely loud music. They let off foul-smelling smoke from a can. Thousands of people gathered on Wednesday in central London to protest against Trump's presence in the UK, accusing the UK government of kowtowing to him by hosting him for a state visit for the second time.
FinCoT: Grounding Chain-of-Thought in Expert Financial Reasoning
Nitarach, Natapong, Sirichotedumrong, Warit, Pitchayarthorn, Panop, Taveekitworachai, Pittawat, Manakul, Potsawee, Pipatanakul, Kunat
This paper presents FinCoT, a structured chain-of-thought (CoT) prompting framework that embeds domain-specific expert financial reasoning blueprints to guide large language models' behaviors. We identify three main prompting styles in financial NLP (FinNLP): (1) standard prompting (zero-shot), (2) unstructured CoT (free-form reasoning), and (3) structured CoT (with explicitly structured reasoning steps). Prior work has mainly focused on the first two, while structured CoT remains underexplored and lacks domain expertise incorporation. Therefore, we evaluate all three prompting approaches across ten CFA-style financial domains and introduce FinCoT as the first structured finance-specific prompting approach incorporating blueprints from domain experts. FinCoT improves the accuracy of a general-purpose model, Qwen3-8B-Base, from 63.2% to 80.5%, and boosts Fin-R1 (7B), a finance-specific model, from 65.7% to 75.7%, while reducing output length by up to 8.9x and 1.16x compared to structured CoT methods, respectively. We find that FinCoT proves most effective for models lacking financial post-training. Our findings show that FinCoT not only improves performance and reduces inference costs but also yields more interpretable and expert-aligned reasoning traces.
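The three prompting styles named in the abstract can be illustrated with a minimal sketch. Everything here is an assumption made for illustration: the function names, the prompt wording, and the `BLUEPRINT` steps are hypothetical and are not taken from the FinCoT paper, whose expert blueprints are not reproduced in this listing.

```python
# Hypothetical reasoning blueprint; NOT the paper's actual expert blueprint.
BLUEPRINT = [
    "Identify the financial domain and the quantity being asked for.",
    "List the given inputs and any stated assumptions.",
    "Apply the relevant formula or rule step by step.",
    "Check the result against the answer choices.",
]


def standard_prompt(question: str) -> str:
    """Style 1, standard (zero-shot): the question with a bare answer cue."""
    return f"Question: {question}\nAnswer:"


def unstructured_cot_prompt(question: str) -> str:
    """Style 2, unstructured CoT: invite free-form reasoning before answering."""
    return f"Question: {question}\nLet's think step by step.\nAnswer:"


def structured_cot_prompt(question: str, blueprint: list[str]) -> str:
    """Style 3, structured CoT: spell out explicit, numbered reasoning steps."""
    steps = "\n".join(f"{i}. {step}" for i, step in enumerate(blueprint, start=1))
    return (
        f"Question: {question}\n"
        f"Follow these reasoning steps:\n{steps}\n"
        "Answer:"
    )


if __name__ == "__main__":
    q = "A bond pays a 5% annual coupon on a face value of 1,000. What is the annual coupon payment?"
    print(structured_cot_prompt(q, BLUEPRINT))
```

The only difference between the three functions is how much reasoning structure the prompt supplies, which is the axis along which the abstract compares the approaches.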
Assessing Large Language Models on Islamic Legal Reasoning: Evidence from Inheritance Law Evaluation
Bouchekif, Abdessalam, Rashwani, Samer, Sbahi, Heba, Gaben, Shahd, Al-Khatib, Mutaz, Ghaly, Mohammed
This paper evaluates the knowledge and reasoning capabilities of Large Language Models in Islamic inheritance law, known as 'ilm al-mawarith. We assess the performance of seven LLMs using a benchmark of 1,000 multiple-choice questions covering diverse inheritance scenarios, designed to test models' ability to understand the inheritance context and compute the distribution of shares prescribed by Islamic jurisprudence. The results reveal a significant performance gap: o3 and Gemini 2.5 achieved accuracies above 90%, whereas ALLaM, Fanar, LLaMA, and Mistral scored below 50%. These disparities reflect important differences in reasoning ability and domain adaptation. We conduct a detailed error analysis to identify recurring failure patterns across models, including misunderstandings of inheritance scenarios, incorrect application of legal rules, and insufficient domain knowledge. Our findings highlight limitations in handling structured legal reasoning and suggest directions for improving performance in Islamic legal reasoning. Code: https://github.com/bouchekif/inheritance_evaluation
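Scoring a multiple-choice benchmark of this kind reduces to per-model accuracy. The sketch below is an assumption about how such scoring could be done, not the authors' evaluation code (which is linked above); the record layout and the function name `accuracy_by_model` are hypothetical.

```python
from collections import defaultdict


def accuracy_by_model(records):
    """Compute per-model accuracy on multiple-choice answers.

    `records` is an iterable of (model_name, predicted_choice, gold_choice)
    tuples; this layout is an assumption made for illustration.
    Choices are compared case-insensitively after stripping whitespace.
    """
    correct = defaultdict(int)
    total = defaultdict(int)
    for model, predicted, gold in records:
        total[model] += 1
        if predicted.strip().upper() == gold.strip().upper():
            correct[model] += 1
    return {model: correct[model] / total[model] for model in total}


if __name__ == "__main__":
    # Toy data with made-up model names, not results from the paper.
    toy = [
        ("model_a", "A", "A"),
        ("model_a", "b", "B"),
        ("model_a", "C", "D"),
        ("model_b", "A", "B"),
    ]
    print(accuracy_by_model(toy))
```

With 1,000 questions per model, a score "above 90%" in this scheme means more than 900 matching choices, and "below 50%" means fewer than 500.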
Language Models Identify Ambiguities and Exploit Loopholes
Choi, Jio, Bansal, Mohit, Stengel-Eskin, Elias
Studying the responses of large language models (LLMs) to loopholes presents a two-fold opportunity. First, it affords us a lens through which to examine ambiguity and pragmatics in LLMs, since exploiting a loophole requires identifying ambiguity and performing sophisticated pragmatic reasoning. Second, loopholes pose an interesting and novel alignment problem where the model is presented with conflicting goals and can exploit ambiguities to its own advantage. To address these questions, we design scenarios where LLMs are given a goal and an ambiguous user instruction in conflict with the goal, with scenarios covering scalar implicature, structural ambiguities, and power dynamics. We then measure different models' abilities to exploit loopholes to satisfy their given goals as opposed to the goals of the user. We find that both closed-source and stronger open-source models can identify ambiguities and exploit their resulting loopholes, presenting a potential AI safety risk. Our analysis indicates that models which exploit loopholes explicitly identify and reason about both ambiguity and conflicting goals.
TAI Scan Tool: A RAG-Based Tool With Minimalistic Input for Trustworthy AI Self-Assessment
Davvetas, Athanasios, Ziouvelou, Xenia, Dami, Ypatia, Kaponis, Alexios, Giouvanopoulou, Konstantina, Papademas, Michael
This paper introduces the TAI Scan Tool, a RAG-based TAI self-assessment tool with minimalistic input. The current version of the tool supports the legal TAI assessment, with a particular emphasis on facilitating compliance with the AI Act. It involves a two-step approach with a pre-screening and an assessment phase. The assessment output of the system includes insight regarding the risk level of the AI system according to the AI Act, while at the same time retrieving relevant articles to aid with compliance and notify users of their obligations. Our qualitative evaluation using use-case scenarios yields promising results, correctly predicting risk levels while retrieving relevant articles across three distinct semantic groups. Furthermore, interpretation of results shows that the tool's reasoning relies on comparison with the setting of high-risk systems, a behaviour attributed to their deployment requiring careful consideration, and therefore frequently presented within the AI Act.