Goto

Collaborating Authors

 Overview


NeuronTune: Fine-Grained Neuron Modulation for Balanced Safety-Utility Alignment in LLMs

arXiv.org Artificial Intelligence

Ensuring robust safety alignment while preserving utility is critical for the reliable deployment of Large Language Models (LLMs). However, current techniques fundamentally suffer from intertwined deficiencies: insufficient robustness against malicious attacks, frequent refusal of benign queries, degradation in generated text quality and general task performance--the former two reflecting deficits in robust safety and the latter constituting utility impairment . We trace these limitations to the coarse-grained layer-wise interventions in existing methods. To resolve this, we propose NeuronT une, a fine-grained framework that dynamically modulates sparse neurons to achieve simultaneous safety-utility optimization. Our approach first identifies safety-critical and utility-preserving neurons across all layers via attribution, then employs meta-learning to adaptively amplify safety-neuron activations and suppress utility-neuron activations. Crucially, NeuronTune enables tunable adjustment of intervention scope via neuron-count thresholds, supporting flexible adaptation to security-critical or utility-priority scenarios. Extensive experimental results demonstrate that our method significantly outperforms existing state-of-the-art technologies, achieving superior model safety while maintaining excellent utility.


From Charts to Fair Narratives: Uncovering and Mitigating Geo-Economic Biases in Chart-to-Text

arXiv.org Artificial Intelligence

Charts are very common for exploring data and communicating insights, but extracting key takeaways from charts and articulating them in natural language can be challenging. The chart-to-text task aims to automate this process by generating textual summaries of charts. While with the rapid advancement of large Vision-Language Models (VLMs), we have witnessed great progress in this domain, little to no attention has been given to potential biases in their outputs. This paper investigates how VLMs can amplify geo-economic biases when generating chart summaries, potentially causing societal harm. Specifically, we conduct a large-scale evaluation of geo-economic biases in VLM-generated chart summaries across 6,000 chart-country pairs from six widely used proprietary and open-source models to understand how a country's economic status influences the sentiment of generated summaries. Our analysis reveals that existing VLMs tend to produce more positive descriptions for high-income countries compared to middle- or low-income countries, even when country attribution is the only variable changed. We also find that models such as GPT-4o-mini, Gemini-1.5-Flash, and Phi-3.5 exhibit varying degrees of bias. We further explore inference-time prompt-based debiasing techniques using positive distractors but find them only partially effective, underscoring the complexity of the issue and the need for more robust debiasing strategies. Our code and dataset are publicly available here.


TPTP World Infrastructure for Non-classical Logics

arXiv.org Artificial Intelligence

The TPTP World is the well established infrastructure that supports research, development, and deployment of Automated Theorem Proving (ATP) systems. The TPTP World supports a range of classical logics, and since release v9.0.0 has supported non-classical logics. This paper provides a self-contained comprehensive overview of the TPTP World infrastructure for ATP in non-classical logics: the non-classical language extension, problems and solutions, and tool support. A detailed description of use of the infrastructure for quantified normal multi-modal logic is given.


The Othello AI Arena: Evaluating Intelligent Systems Through Limited-Time Adaptation to Unseen Boards

arXiv.org Artificial Intelligence

The ability to rapidly adapt to novel and unforeseen environmental changes is a cornerstone of artificial general intelligence (AGI), yet it remains a critical blind spot in most existing AI benchmarks. Traditional evaluation largely focuses on optimizing performance within fixed environments, failing to assess systems' flexibility and generalization capabilities when faced with even subtle rule or structural modifications. Addressing this gap, I introduce the Othello AI Arena, a novel benchmark framework designed to evaluate intelligent systems based on their capacity for limited-time adaptation to unseen environments. Our platform poses a meta-learning challenge: participants must develop systems that can analyze the specific configuration and rules of a novel Othello board within a strict time limit (60 seconds) and generate a tailored, high-performing strategy for that unique environment. With this, evaluation of the meta-level intelligence can be separated from the task-level strategy performance. The Arena features a diverse set of game stages, including public stages for development and private stages with structural and rule variations designed to test genuine adaptive and generalization capabilities. Implemented as an accessible web-based platform, the Arena provides real-time visualization, automated evaluation using multi-dimensional metrics, and comprehensive logging for post-hoc analysis. Initial observations from pilot tests and preliminary student engagements highlight fascinating patterns in adaptation approaches, ranging from rapid parameter tuning to rudimentary environmental model learning through simulation. The Othello AI Arena offers a unique educational tool and a valuable research benchmark for fostering and evaluating the crucial skill of rapid, intelligent adaptation in AI systems.


Beyond Technocratic XAI: The Who, What & How in Explanation Design

arXiv.org Artificial Intelligence

The field of Explainable AI (XAI) offers a wide range of techniques for making complex models interpretable. Yet, in practice, generating meaningful explanations is a context-dependent task that requires intentional design choices to ensure accessibility and transparency. This paper reframes explanation as a situated design process -- an approach particularly relevant for practitioners involved in building and deploying explainable systems. Drawing on prior research and principles from design thinking, we propose a three-part framework for explanation design in XAI: asking Who needs the explanation, What they need explained, and How that explanation should be delivered. We also emphasize the need for ethical considerations, including risks of epistemic inequality, reinforcing social inequities, and obscuring accountability and governance. By treating explanation as a sociotechnical design process, this framework encourages a context-aware approach to XAI that supports effective communication and the development of ethically responsible explanations.


Breath as a biomarker: A survey of contact and contactless applications and approaches in respiratory monitoring

arXiv.org Artificial Intelligence

Breath analysis has emerged as a critical tool in health monitoring, offering insights into respiratory function, disease detection, and continuous health assessment. While traditional contact-based methods are reliable, they often pose challenges in comfort and practicality, particularly for long-term monitoring. This survey comprehensively examines contact-based and contactless approaches, emphasizing recent advances in machine learning and deep learning techniques applied to breath analysis. Contactless methods, including Wi-Fi Channel State Information and acoustic sensing, are analyzed for their ability to provide accurate, noninvasive respiratory monitoring. We explore a broad range of applications, from single-user respiratory rate detection to multi-user scenarios, user identification, and respiratory disease detection. Furthermore, this survey details essential data preprocessing, feature extraction, and classification techniques, offering comparative insights into machine learning/deep learning models suited to each approach. Key challenges like dataset scarcity, multi-user interference, and data privacy are also discussed, along with emerging trends like Explainable AI, federated learning, transfer learning, and hybrid modeling. By synthesizing current methodologies and identifying open research directions, this survey offers a comprehensive framework to guide future innovations in breath analysis, bridging advanced technological capabilities with practical healthcare applications.


Generative Artificial Intelligence in Medical Imaging: Foundations, Progress, and Clinical Translation

arXiv.org Artificial Intelligence

Generative artificial intelligence (AI) is rapidly transforming medical imaging by enabling capabilities such as data synthesis, image enhancement, modality translation, and spatiotemporal modeling. This review presents a comprehensive and forward-looking synthesis of recent advances in generative modeling including generative adversarial networks (GANs), variational autoencoders (VAEs), diffusion models, and emerging multimodal foundation architectures and evaluates their expanding roles across the clinical imaging continuum. We systematically examine how generative AI contributes to key stages of the imaging workflow, from acquisition and reconstruction to cross-modality synthesis, diagnostic support, and treatment planning. Emphasis is placed on both retrospective and prospective clinical scenarios, where generative models help address longstanding challenges such as data scarcity, standardization, and integration across modalities. To promote rigorous benchmarking and translational readiness, we propose a three-tiered evaluation framework encompassing pixel-level fidelity, feature-level realism, and task-level clinical relevance. We also identify critical obstacles to real-world deployment, including generalization under domain shift, hallucination risk, data privacy concerns, and regulatory hurdles. Finally, we explore the convergence of generative AI with large-scale foundation models, highlighting how this synergy may enable the next generation of scalable, reliable, and clinically integrated imaging systems. By charting technical progress and translational pathways, this review aims to guide future research and foster interdisciplinary collaboration at the intersection of AI, medicine, and biomedical engineering.


Guardians and Offenders: A Survey on Harmful Content Generation and Safety Mitigation of LLM

arXiv.org Artificial Intelligence

Large Language Models (LLMs) have revolutionized content creation across digital platforms, offering unprecedented capabilities in natural language generation and understanding. These models enable beneficial applications such as content generation, question and answering (Q&A), programming, and code reasoning. Meanwhile, they also pose serious risks by inadvertently or intentionally producing toxic, offensive, or biased content. This dual role of LLMs, both as powerful tools for solving real-world problems and as potential sources of harmful language, presents a pressing sociotechnical challenge. In this survey, we systematically review recent studies spanning unintentional toxicity, adversarial jailbreaking attacks, and content moderation techniques. We propose a unified taxonomy of LLM-related harms and defenses, analyze emerging multimodal and LLM-assisted jailbreak strategies, and assess mitigation efforts, including reinforcement learning with human feedback (RLHF), prompt engineering, and safety alignment. Our synthesis highlights the evolving landscape of LLM safety, identifies limitations in current evaluation methodologies, and outlines future research directions to guide the development of robust and ethically aligned language technologies.


Speed Always Wins: A Survey on Efficient Architectures for Large Language Models

arXiv.org Artificial Intelligence

Large Language Models (LLMs) have delivered impressive results in language understanding, generation, reasoning, and pushes the ability boundary of multimodal models. Transformer models, as the foundation of modern LLMs, offer a strong baseline with excellent scaling properties. However, the traditional transformer architecture requires substantial computations and poses significant obstacles for large-scale training and practical deployment. In this survey, we offer a systematic examination of innovative LLM architectures that address the inherent limitations of transformers and boost the efficiency. Starting from language modeling, this survey covers the background and technical details of linear and sparse sequence modeling methods, efficient full attention variants, sparse mixture-of-experts, hybrid model architectures incorporating the above techniques, and emerging diffusion LLMs. Additionally, we discuss applications of these techniques to other modalities and consider their wider implications for developing scalable, resource-aware foundation models. By grouping recent studies into the above category, this survey presents a blueprint of modern efficient LLM architectures, and we hope this could help motivate future research toward more efficient, versatile AI systems.


Adversarial Robustness in Graph Neural Networks: A Hamiltonian Approach

Neural Information Processing Systems

Graph neural networks (GNNs) are vulnerable to adversarial perturbations, including those that affect both node features and graph topology. This paper investigates GNNs derived from diverse neural flows, concentrating on their connection to various stability notions such as BIBO stability, Lyapunov stability, structural stability, and conservative stability. We argue that Lyapunov stability, despite its common use, does not necessarily ensure adversarial robustness. Inspired by physics principles, we advocate for the use of conservative Hamiltonian neural flows to construct GNNs that are robust to adversarial attacks. The adversarial robustness of different neural flow GNNs is empirically compared on several benchmark datasets under a variety of adversarial attacks.