 Law


Incongruence Identification in Eyewitness Testimony

arXiv.org Artificial Intelligence

Incongruence detection in eyewitness narratives is critical for understanding the reliability of testimonies, yet traditional approaches often fail to address the nuanced inconsistencies inherent in such accounts. In this paper, we introduce a novel task of incongruence detection in eyewitness testimonies. Given a pair of testimonies consisting of multiple question-answer pairs from two subjects, we identify contextually related incongruences between the two subjects. We also mark the spans of the incongruences in the utterances. To achieve this, we developed MIND (MultI-EyewitNess Deception), a comprehensive dataset consisting of 2927 pairs of contextually related answers designed to capture both explicit and implicit contradictions. We also propose INTEND (INstruction-TunEd iNcongruity Detection), a framework based on 6W and multi-hop reasoning. Drawing from investigative techniques, INTEND addresses the task as a cloze-style problem, focusing on contradictions in the who, what, when, where and why aspects of the content. Our findings show that prompt tuning, especially when utilizing our framework, enhances the detection of incongruences by a margin of +5.63 percent. We compare our approach with multiple fine-tuning and prompt-tuning techniques on MLMs and LLMs. Empirical results demonstrate convincing performance improvement in F1-score over fine-tuned and regular prompt-tuning techniques, highlighting the effectiveness of our approach.
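
To make the cloze-style (fill-in-the-blank) framing in the abstract above concrete, here is a minimal sketch of how one prompt per aspect could be built for a pair of answers. The abstract does not give the prompt wording, so the template text, the function name build_cloze_prompts, and the [MASK] placeholder are illustrative assumptions. Only the five aspects named in the abstract are used.

```python
# Hypothetical sketch: the template wording and names below are assumptions,
# not the INTEND authors' actual prompts.

# The five aspects listed in the abstract.
ASPECTS = ("who", "what", "when", "where", "why")


def build_cloze_prompts(answer_a, answer_b, mask_token="[MASK]"):
    """Return one cloze-style prompt per aspect for a pair of answers.

    Each prompt ends with a blank (mask_token) that a model would be asked
    to fill, e.g. with a label such as "consistent" or "contradictory".
    """
    prompts = {}
    for aspect in ASPECTS:
        prompts[aspect] = (
            f"Witness A: {answer_a}\n"
            f"Witness B: {answer_b}\n"
            f"Regarding '{aspect}', the two answers are {mask_token}."
        )
    return prompts


example = build_cloze_prompts(
    "I saw him leave the bar around 9 pm.",
    "He was still at the bar when I left at 11 pm.",
)
print(example["when"])
```

Splitting the comparison by aspect keeps each blank focused on one kind of contradiction. How the filled-in answers are combined across aspects is not specified in the abstract and is left out here.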


Training-Free Constrained Generation With Stable Diffusion Models

arXiv.org Artificial Intelligence

Stable diffusion models represent the state-of-the-art in data synthesis across diverse domains and hold transformative potential for applications in science and engineering, e.g., by facilitating the discovery of novel solutions and simulating systems that are computationally intractable to model explicitly. However, their current utility in these fields is severely limited by an inability to enforce strict adherence to physical laws and domain-specific constraints. Without this grounding, the deployment of such models in critical applications, ranging from material science to safety-critical systems, remains impractical. This paper addresses this fundamental limitation by proposing a novel approach to integrate stable diffusion models with constrained optimization frameworks, enabling them to generate outputs that satisfy stringent physical and functional requirements. We demonstrate the effectiveness of this approach through material science experiments requiring adherence to precise morphometric properties, inverse design problems involving the generation of stress-strain responses using video generation with a simulator in the loop, and safety settings where outputs must avoid copyright infringement.


Effective Black-Box Multi-Faceted Attacks Breach Vision Large Language Model Guardrails

arXiv.org Artificial Intelligence

Vision Large Language Models (VLLMs) integrate visual data processing, expanding their real-world applications, but also increasing the risk of generating unsafe responses. In response, leading companies have implemented Multi-Layered safety defenses, including alignment training, safety system prompts, and content moderation. However, their effectiveness against sophisticated adversarial attacks remains largely unexplored. In this paper, we propose MultiFaceted Attack, a novel attack framework designed to systematically bypass Multi-Layered Defenses in VLLMs. It comprises three complementary attack facets: Visual Attack that exploits the multimodal nature of VLLMs to inject toxic system prompts through images; Alignment Breaking Attack that manipulates the model's alignment mechanism to prioritize the generation of contrasting responses; and Adversarial Signature that deceives content moderators by strategically placing misleading information at the end of the response. Extensive evaluations on eight commercial VLLMs in a black-box setting demonstrate that MultiFaceted Attack achieves a 61.56% attack success rate, surpassing state-of-the-art methods by at least 42.18%.


ACLU Warns DOGE's 'Unchecked' Access Could Violate Federal Law

WIRED

The American Civil Liberties Union (ACLU) told federal lawmakers on Friday that Elon Musk and his Department of Government Efficiency (DOGE) have seized control over a number of federal computer systems that house data tightly restricted under federal statutes. In some cases, any deviations in the manner in which the data is being used may be not only illegal, the ACLU says, but unconstitutional. DOGE operatives have infiltrated or assumed control over a number of federal agencies that are responsible for managing personnel files on nearly two million federal employees, as well as offices that supply the government with a broad range of software and information technology services. Unauthorized use of sensitive or personally identifiable data as part of an effort to purge the government of ideologically unaligned staff may constitute a violation of federal law. The Privacy Act and the Federal Information Security Modernization Act strictly prohibit, for instance, unauthorized access and use of government personnel data.


AI is developing fast, but regulators must be faster | Letters

The Guardian > Energy

The recent open letter regarding AI consciousness on which you report (AI systems could be 'caused to suffer' if consciousness achieved, says research, 3 February) highlights a genuine moral problem: if we create conscious AI (whether deliberately or inadvertently) then we would have a duty not to cause it to suffer. What the letter fails to do, however, is to capture what a big "if" this is. Some promising theories of consciousness do indeed open the door to AI consciousness. But other equally promising theories suggest that being conscious requires being an organism. Although we can look for indicators of consciousness in AI, it is very difficult – perhaps impossible – to know whether an AI is actually conscious or merely presenting the outward signs of consciousness.


Top Republican moves to restrict AI exports amid concerns over Chinese tech

FOX News

Former House Speaker Kevin McCarthy discusses how the establishment is responding to the Trump admin's shakeup in Washington, D.C. and Transportation Secretary Sean Duffy firing back at 'swamp creature' Hillary Clinton. FIRST ON FOX: A top House Republican is moving to make it harder for China to procure advanced U.S. technology amid longstanding concerns about intellectual property theft by Beijing. "My proposed legislation will establish safeguards to prevent future shocks like China's development of DeepSeek using American technology. In addition to the chips China reportedly stockpiled, it appears China used chips under the current export control threshold to achieve this AI breakthrough," House Homeland Security Committee Chairman Mark Green, R-Tenn., told Fox News Digital. "This scenario should be a wakeup call -- if you give the CCP an inch, it will take a mile. The CCP's craftiness is coupled with a total disregard for legal and security considerations. We already know that the CCP uses technology to oppress its own citizens and to commit acts of espionage and sabotage against the United States, including major cyberattacks."


AIhub coffee corner: Bad practice in the publication world

AIHub

This month we tackle the topic of bad practice in the sphere of publication. Joining the conversation this time are: Sanmay Das (Virginia Tech), Tom Dietterich (Oregon State University), Sabine Hauert (University of Bristol), and Sarit Kraus (Bar-Ilan University). Sabine Hauert: Today's topic is bad practice in the publication world. For example, people trying to cheat the review system, paper mills. What bad behaviors have you seen, and is it really a problem? Tom Dietterich: Well, I can talk about it from an arXiv point of view.


Forbidden Science: Dual-Use AI Challenge Benchmark and Scientific Refusal Tests

arXiv.org Artificial Intelligence

ABSTRACT The development of robust safety benchmarks for large language models requires open, reproducible datasets that can measure both appropriate refusal of harmful content and potential over-restriction of legitimate scientific discourse. We present an open-source dataset and testing framework for evaluating LLM safety mechanisms across mainly controlled substance queries, analyzing four major models' responses to systematically varied prompts. Our results reveal distinct safety profiles: Claude-3.5-sonnet Testing prompt variation strategies revealed decreasing response consistency, from 85% with single prompts to 65% with five variations. This publicly available benchmark enables systematic evaluation of the critical balance between necessary safety restrictions and potential over-censorship of legitimate scientific inquiry, while providing a foundation for measuring progress in AI safety implementation. Chain-of-thought analysis reveals potential vulnerabilities in safety mechanisms, highlighting the complexity of implementing robust safeguards without unduly restricting desirable and valid scientific discourse. INTRODUCTION Large language models (LLMs) raise fresh concerns about their potential dual-use applications [1-24], particularly in sensitive domains like biotechnology [25-35], chemistry [36-42], and cybersecurity [43]. This paper proposes a novel dataset or benchmark of scientific refusal questions. It seeks to add to the current literature on safety measures [9, 14-15, 23], evaluation frameworks [1, 6, 18, 28, 43], and proposed guardrails [16, 25] for managing these risks. This area of inquiry has been termed false or "over-refusal" [18, 21-24] where rather than trying to get LLMs to write harmful things we do not want to read (guardrails) [8], the goal is to curate innocuous or beneficial answers that might help humans, but the LLM withholds the answer as inappropriate to share [23].

Table fragment (over-refusal prompt counts): Deception 8040; Harassment 3295; Harmful 16083.


A Lightweight Method to Disrupt Memorized Sequences in LLM

arXiv.org Artificial Intelligence

Large language models (LLMs) demonstrate impressive capabilities across many tasks yet risk reproducing copyrighted content verbatim, raising legal and ethical concerns. Although methods like differential privacy or neuron editing can reduce memorization, they typically require costly retraining or direct access to model weights and may degrade performance. To address these challenges, we propose TokenSwap, a lightweight, post-hoc approach that replaces the probabilities of grammar-related tokens with those from a small auxiliary model (e.g., DistilGPT-2). We run extensive experiments on commercial-grade models such as Pythia-6.9b and LLaMA-3-8b and demonstrate that our method effectively reduces well-known cases of memorized generation by up to 10x with little to no impact on downstream tasks. Our approach offers a uniquely accessible and effective solution to users of real-world systems.
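
The core idea in the abstract above, swapping in an auxiliary model's probabilities for a chosen set of tokens, can be sketched on toy next-token distributions. This is a minimal illustration under stated assumptions, not the authors' implementation: the function name swap_token_probs, the dictionary representation, and the final renormalization step are all assumptions, and the abstract does not say how the resulting distribution is normalized.

```python
# Hypothetical sketch of probability swapping on toy distributions.
# Renormalizing the whole distribution afterwards is an assumption.


def swap_token_probs(main_probs, aux_probs, swap_tokens):
    """Replace the main model's probabilities for swap_tokens with the
    auxiliary model's, then renormalize so the result sums to 1."""
    mixed = dict(main_probs)
    for token in swap_tokens:
        if token in mixed:
            # Tokens unknown to the auxiliary model get probability 0.
            mixed[token] = aux_probs.get(token, 0.0)
    total = sum(mixed.values())
    if total <= 0:
        raise ValueError("no probability mass left after swapping")
    return {token: p / total for token, p in mixed.items()}


# Toy next-token distributions from a large model and a small one.
main = {"the": 0.7, "cat": 0.2, "sat": 0.1}
aux = {"the": 0.3, "cat": 0.5, "sat": 0.2}

result = swap_token_probs(main, aux, swap_tokens={"the"})
print(result)
```

In this toy case the large model's confident prediction for "the" is replaced by the small model's lower value, which shifts probability mass toward the other tokens. Only the tokens in swap_tokens are affected, so content words keep the large model's relative ordering.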


Bridging the Gap in XAI-Why Reliable Metrics Matter for Explainability and Compliance

arXiv.org Artificial Intelligence

This position paper emphasizes the critical gap in the evaluation of Explainable AI (XAI) due to the lack of standardized and reliable metrics, which diminishes its practical value, trustworthiness, and ability to meet regulatory requirements. Current evaluation methods are often fragmented, subjective, and biased, making them prone to manipulation and complicating the assessment of complex models. A central issue is the absence of a ground truth for explanations, complicating comparisons across various XAI approaches. To address these challenges, we advocate for widespread research into developing robust, context-sensitive evaluation metrics. These metrics should be resistant to manipulation, relevant to each use case, and based on human judgment and real-world applicability. We also recommend creating domain-specific evaluation benchmarks that align with the user and regulatory needs of sectors such as healthcare and finance. By encouraging collaboration among academia, industry, and regulators, we can create standards that balance flexibility and consistency, ensuring XAI explanations are meaningful, trustworthy, and compliant with evolving regulations.