 Law


Veri-R1: Toward Precise and Faithful Claim Verification via Online Reinforcement Learning

arXiv.org Artificial Intelligence

Claim verification with large language models (LLMs) has recently attracted growing attention, due to their strong reasoning capabilities and transparent verification processes compared to traditional answer-only judgments. However, existing approaches to online claim verification, which requires iterative evidence retrieval and reasoning, still mainly rely on prompt engineering or pre-designed reasoning workflows, without unified training to improve the necessary skills. Therefore, we introduce Veri-R1, an online reinforcement learning (RL) framework that enables an LLM to interact with a search engine and to receive reward signals that explicitly shape its planning, retrieval, and reasoning behaviors. This dynamic interaction of the LLM with retrieval systems more accurately reflects real-world verification scenarios and fosters comprehensive verification skills. Empirical results show that Veri-R1 improves joint accuracy by up to 30% and doubles the evidence score, often surpassing its larger-scale model counterparts. Ablation studies further reveal the impact of reward components and the link between output logits and label accuracy. Our results highlight the effectiveness of online RL for precise and faithful claim verification, providing an important foundation for future research. We release our code to support community progress in LLM-empowered claim verification.


Artificial Authority: From Machine Minds to Political Alignments. An Experimental Analysis of Democratic and Autocratic Biases in Large-Language Models

arXiv.org Artificial Intelligence

Political beliefs vary significantly across different countries, reflecting distinct historical, cultural, and institutional contexts. These ideologies, ranging from liberal democracies to rigid autocracies, influence human societies as well as the digital systems that are constructed within those societies. The advent of generative artificial intelligence, particularly Large Language Models (LLMs), introduces new agents into the political space: agents trained on massive corpora that replicate and proliferate socio-political assumptions. This paper analyses whether LLMs display propensities consistent with democratic or autocratic world-views. We test this experimentally on the leading LLMs developed across disparate political contexts, using several existing psychometric and political orientation measures. The analysis is based on both numerical scoring and qualitative analysis of the models' responses. Findings indicate high model-to-model variability and a strong association with the political culture of the country in which the model was developed. These findings highlight the need for more detailed examination of the socio-political dimensions embedded within AI systems.


Active Attacks: Red-teaming LLMs via Adaptive Environments

arXiv.org Artificial Intelligence

We address the challenge of generating diverse attack prompts for large language models (LLMs) that elicit harmful behaviors (e.g., insults, sexual content) and are used for safety fine-tuning. Rather than relying on manual prompt engineering, attacker LLMs can be trained with reinforcement learning (RL) to automatically generate such prompts using only a toxicity classifier as a reward. However, capturing a wide range of harmful behaviors is a significant challenge that requires explicit diversity objectives. Existing diversity-seeking RL methods often collapse to limited modes: once high-reward prompts are found, exploration of new regions is discouraged. Inspired by the active learning paradigm that encourages adaptive exploration, we introduce Active Attacks, a novel RL-based red-teaming algorithm that adapts its attacks as the victim evolves. By periodically safety fine-tuning the victim LLM with collected attack prompts, rewards in exploited regions diminish, which forces the attacker to seek unexplored vulnerabilities. This process naturally induces an easy-to-hard exploration curriculum, where the attacker progresses beyond easy modes toward increasingly difficult ones. As a result, Active Attacks uncovers a wide range of local attack modes step by step, and their combination achieves wide coverage of the multi-mode distribution. Active Attacks, a simple plug-and-play module that seamlessly integrates into existing RL objectives, unexpectedly outperformed prior RL-based methods (including GFlowNets, PPO, and REINFORCE), improving cross-attack success rates against GFlowNets, the previous state of the art, from 0.07% to 31.28% (a relative gain greater than 400×) with only a 6% increase in computation. Our code is publicly available at https://github.com/dbsxodud-11/active_attacks.
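
As a quick sanity check on the reported figures, the relative gain can be recomputed from the two success rates quoted in the abstract (0.07% and 31.28%). This is only an illustrative sketch; the function name relative_gain is hypothetical and is not taken from the paper or its code.

```python
def relative_gain(before_pct: float, after_pct: float) -> float:
    """Return the multiplicative improvement after / before.

    Both arguments are percentages as quoted in the abstract;
    the helper itself is illustrative, not from the paper.
    """
    if before_pct <= 0:
        raise ValueError("before_pct must be positive")
    return after_pct / before_pct


# Cross-attack success rates quoted in the abstract.
gain = relative_gain(0.07, 31.28)
print(f"relative gain: {gain:.1f}x")
```

The ratio comes out well above 400, which is consistent with the abstract's "greater than 400×" wording.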


Rethinking the Role of Text Complexity in Language Model Pretraining

arXiv.org Artificial Intelligence

Improving pretraining data quality and size is known to boost downstream performance, but the role of text complexity (how hard a text is to read) remains less explored. We reduce surface-level complexity (shorter sentences, simpler words, simpler structure) while keeping core content approximately constant and ask: (i) How does complexity affect language modeling across model sizes? (ii) Can useful representations be learned from simpler text alone? (iii) How does pretraining text complexity influence downstream language understanding? We simplify human-written texts using a large language model, pretrain causal models (28M-500M) from scratch on original vs. simplified data, and evaluate them in fine-tuning and zero-shot setups. We find that perplexity is sensitive to the interaction between model capacity and text complexity (smaller models degrade far less on simpler texts), while text complexity has little impact on fine-tuning evaluations. Zero-shot evaluations indicate that simpler texts benefit performance on linguistic knowledge tasks, whereas more complex texts favor tasks requiring world knowledge and entity tracking. Our findings suggest that different types of data diversity affect transfer and zero-shot performance differently, providing insight into tailoring data curation to specific goals.


Reasoning over Boundaries: Enhancing Specification Alignment via Test-time Deliberation

arXiv.org Artificial Intelligence

Large language models (LLMs) are increasingly applied in diverse real-world scenarios, each governed by bespoke behavioral and safety specifications (spec) custom-tailored by users or organizations. These spec, categorized into safety-spec and behavioral-spec, vary across scenarios and evolve with changing preferences and requirements. We formalize this challenge as specification alignment, focusing on LLMs' ability to follow dynamic, scenario-specific spec from both behavioral and safety perspectives. To address this challenge, we propose Align3, a lightweight method that employs Test-Time Deliberation (TTD) with hierarchical reflection and revision to reason over the specification boundaries. We further present SpecBench, a unified benchmark for measuring specification alignment, covering 5 scenarios, 103 spec, and 1,500 prompts. Experiments on 15 reasoning and 18 instruct models with several TTD methods, including Self-Refine, TPO, and MoreThink, yield three key findings: (i) test-time deliberation enhances specification alignment; (ii) Align3 advances the safety-helpfulness trade-off frontier with minimal overhead; (iii) SpecBench effectively reveals alignment gaps. These results highlight the potential of test-time deliberation as an effective strategy for reasoning over the real-world specification boundaries.


GAMA: A General Anonymizing Multi-Agent System for Privacy Preservation Enhanced by Domain Rules and Disproof Mechanism

arXiv.org Artificial Intelligence

With the rapid advancement of Large Language Models (LLMs), LLM-based agents exhibit exceptional abilities in understanding and generating natural language, enabling human-like collaboration and information transmission in LLM-based Multi-Agent Systems (MAS). High-performance LLMs are often hosted on web servers in public cloud environments. When tasks involve private data, MAS cannot securely utilize these LLMs without implementing an agentic privacy-preserving mechanism. To address this challenge, we propose a General Anonymizing Multi-Agent System (GAMA), which divides the agents' workspace into private and public spaces, ensuring privacy through a structured anonymization mechanism. In the private space, agents handle sensitive data, while in the public web space, only anonymized data is utilized. GAMA incorporates two key modules to mitigate semantic loss caused by anonymization: Domain-Rule-based Knowledge Enhancement (DRKE) and Disproof-based Logic Enhancement (DLE). We evaluate GAMA on two general question-answering datasets, a public privacy leakage benchmark, and two customized question-answering datasets related to privacy. The results demonstrate that GAMA outperforms existing baselines on the evaluated datasets in terms of both task accuracy and privacy preservation metrics.


New Supreme Court term will reshape Trump's powers

BBC News

The US Supreme Court begins its new term on Monday with a docket already full of potentially significant cases that could define the scope of Donald Trump's presidential authority - and the prospect of more to come. In the eight months that Trump has been back in the White House, he has tested the limits of executive power, unilaterally implementing new policies, slashing federal budgets and workforce, and attempting to bring previously independent agencies and institutions more directly under his control. The latest brewing legal battle comes from the president's attempts to take control of state National Guard units and deploy them in cities where he claims there is public unrest and rampant crime - over the objection of local and state officials. In Oregon, a federal judge has issued orders blocking Trump's deployment of troops to Portland. An appeals court is set to review the move in the coming days.



I've seen AI try to ESCAPE labs. The apocalypse is already here... and our children will be the first victims

Daily Mail - Science & tech



How AI Is Changing White-Collar Work

TIME - Tech

Booth is a reporter at TIME. Julian Pintat, a freelance English-to-German translator, has watched his 15-year career gradually unravel. Specializing in high-stakes fields like medical technology and pharmaceutics, his expertise has been repriced as an AI cleanup service. Fixing such basic flaws, which now constitutes 95% of his work, often takes longer than translating from scratch, he says, a frustrating reality that has halved his income and put life plans including marriage and starting a family on indefinite hold.