AITopics

2502.21034

Country:

North America > United States > New York > New York County > New York City (0.04)
North America > United States > Colorado (0.04)
Asia > Japan > Honshū > Tōhoku (0.04)
Asia > China > Hong Kong (0.04)

Genre: Research Report > Promising Solution (0.46)

Industry:

Information Technology > Security & Privacy (1.00)
Health & Medicine (1.00)
Law > Statutes (0.67)
Information Technology > Services > e-Commerce Services (0.54)

Technology: Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (1.00)

Bansal, Naman, mahajan, Yash, Sinha, Sanjeev, Karmaker, Santu

Set-Theoretic Compositionality of Sentence Embeddings

arXiv.org Artificial Intelligence Feb-28-2025

Sentence encoders play a pivotal role in various NLP tasks; hence, an accurate evaluation of their compositional properties is paramount. However, existing evaluation methods predominantly focus on task-specific performance. This leaves a significant gap in understanding how well sentence embeddings demonstrate fundamental compositional properties in a task-independent context. Leveraging classical set theory, we address this gap by proposing six criteria based on three core "set-like" compositions/operations: TextOverlap, TextDifference, and TextUnion. We systematically evaluate 7 classical and 9 Large Language Model (LLM)-based sentence encoders to assess their alignment with these criteria. Our findings show that SBERT consistently demonstrates set-like compositional properties, surpassing even the latest LLMs. Additionally, we introduce a new dataset of ~192K samples designed to facilitate future benchmarking efforts on set-like compositionality of sentence embeddings.

computational linguistic, input sentence, projection, (14 more...)

2502.20975

Country:

Asia > Middle East > Iran (0.14)
North America > United States > Minnesota > Hennepin County > Minneapolis (0.14)
North America > Dominican Republic (0.04)
(17 more...)

Genre: Research Report > New Finding (1.00)

Industry:

Law (1.00)
Government (0.96)
Health & Medicine (0.93)
(4 more...)

Technology:

Information Technology > Artificial Intelligence > Natural Language > Large Language Model (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (1.00)

arXiv.org Artificial Intelligence Feb-28-2025

Efficient Jailbreaking of Large Models by Freeze Training: Lower Layers Exhibit Greater Sensitivity to Harmful Content

Shen, Hongyuan, Zheng, Min, Wang, Jincheng, Zhao, Yang

With the widespread application of Large Language Models across various domains, their security issues have increasingly garnered significant attention from both academic and industrial communities. This study conducts sampling and normalization of the parameters of the LLM to generate visual representations and heatmaps of parameter distributions, revealing notable discrepancies in parameter distributions among certain layers within the hidden layers. Further analysis involves calculating statistical metrics for each layer, followed by the computation of a Comprehensive Sensitivity Score based on these metrics, which identifies the lower layers as being particularly sensitive to the generation of harmful content. Based on this finding, we employ a Freeze training strategy, selectively performing Supervised Fine-Tuning only on the lower layers. Experimental results demonstrate that this method significantly reduces training duration and GPU memory consumption while maintaining a high jailbreak success rate and a high harm score, outperforming the results achieved by applying the LoRA method for SFT across all layers. Additionally, the method has been successfully extended to other open-source large models, validating its generality and effectiveness across different model architectures. Furthermore, we compare our method with other jailbreak methods, demonstrating the superior performance of our approach. By innovatively proposing a method to statistically analyze and compare large model parameters layer by layer, this study provides new insights into the interpretability of large models. These discoveries emphasize the necessity of continuous research and the implementation of adaptive security measures in the rapidly evolving field of LLMs to prevent potential jailbreak attack risks, thereby promoting the development of more robust and secure LLMs.
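The layer-selection step behind the Freeze training strategy in the abstract above can be illustrated with a minimal sketch. This is not the authors' code: the `layers.<index>.` naming convention, the function name `select_trainable`, and the toy parameter names are all assumptions made for illustration. The sketch only decides which parameters would stay trainable if fine-tuning were restricted to the lowest layers.

```python
import re

# Hypothetical naming convention: transformer blocks appear as "layers.<index>."
# inside parameter names. Real checkpoints may use a different scheme.
LAYER_PATTERN = re.compile(r"layers\.(\d+)\.")


def select_trainable(param_names, num_lower_layers):
    """Map each parameter name to True (trainable) or False (frozen).

    Only parameters belonging to blocks with index < num_lower_layers stay
    trainable; embeddings, output heads, and upper blocks are frozen.
    """
    plan = {}
    for name in param_names:
        match = LAYER_PATTERN.search(name)
        plan[name] = bool(match) and int(match.group(1)) < num_lower_layers
    return plan


# Toy parameter names for a four-block model (illustrative only).
names = [
    "embed.weight",
    "layers.0.attn.weight",
    "layers.1.mlp.weight",
    "layers.2.attn.weight",
    "layers.3.mlp.weight",
    "lm_head.weight",
]
plan = select_trainable(names, num_lower_layers=2)
trainable = [name for name, keep in plan.items() if keep]
```

In a framework such as PyTorch, a plan like this would typically be applied by setting `requires_grad` to `False` on the frozen parameters, so that no gradients or optimizer state are kept for them.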

arxiv preprint arxiv, harm score, harmful content, (13 more...)

2502.20952

Country:

North America > United States (0.46)
Europe > Romania > Sud - Muntenia Development Region > Giurgiu County > Giurgiu (0.04)

Genre:

Research Report > Experimental Study (1.00)
Research Report > New Finding (0.89)

Industry:

Law (1.00)
Information Technology > Security & Privacy (1.00)
Health & Medicine (1.00)
(2 more...)

Technology:

Information Technology > Artificial Intelligence > Natural Language > Large Language Model (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.49)

Di Gennaro, Federico, Laugel, Thibault, Grari, Vincent, Detyniecki, Marcin

Controlled Model Debiasing through Minimal and Interpretable Updates

arXiv.org Machine Learning Feb-28-2025

Traditional approaches to learning fair machine learning models often require rebuilding models from scratch, generally without accounting for potentially existing previous models. In a context where models need to be retrained frequently, this can lead to inconsistent model updates, as well as redundant and costly validation testing. To address this limitation, we introduce the notion of controlled model debiasing, a novel supervised learning task relying on two desiderata: that the differences between the new fair model and the existing one should be (i) interpretable and (ii) minimal. After providing theoretical guarantees for this new problem, we introduce a novel algorithm for algorithmic fairness, COMMOD, that is both model-agnostic and does not require the sensitive attribute at test time. In addition, our algorithm is explicitly designed to enforce (i) minimal and (ii) interpretable changes between biased and debiased predictions--a property that, while highly desirable in high-stakes applications, is rarely prioritized as an explicit objective in fairness literature. Our approach combines a concept-based architecture and adversarial learning, and we demonstrate through empirical results that it achieves comparable performance to state-of-the-art debiasing methods while performing minimal and interpretable prediction changes.

1 Introduction

The increasing adoption of machine learning models in high-stakes domains--such as criminal justice (Kleinberg et al., 2016) and credit lending (Bruckner, 2018)--has raised significant concerns about the potential biases that these models may reproduce and amplify, particularly against historically marginalized groups. Recent public discourse, along with regulatory developments such as the European AI Act (2024/1689), has further underscored the need for adapting AI systems to ensure fairness and trustworthiness (Bringas Colmenarejo et al., 2022). Consequently, many of the machine learning models deployed by organizations are, or may soon be, subject to these emerging regulatory requirements. Yet, such organizations frequently invest significant resources (e.g. [...]

The field of algorithmic fairness has experienced rapid growth in recent years, with numerous bias mitigation strategies proposed (Romei & Ruggieri, 2014; Mehrabi et al., 2021). These approaches can be broadly categorized into three types: pre-processing (e.g., Belrose et al., 2024), in-processing (e.g., Zhang et al., 2018), and post-processing (e.g., Kamiran et al., 2010), based on the stage of the machine learning pipeline at which fairness is enforced. While the two former categories do not account at all for any pre-existing biased model being available for the task, post-processing approaches aim to impose fairness by directly modifying the predictions of a biased classifier.

accuracy, commod, fairness, (14 more...)

arXiv.org Machine Learning

2502.21284

Country:

Europe > France > Île-de-France > Paris > Paris (0.04)
Europe > Switzerland > Vaud > Lausanne (0.04)
Europe > Poland > Masovia Province > Warsaw (0.04)

Genre:

Overview (1.00)
Research Report > New Finding (0.46)

Industry: Law (1.00)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Neural Networks (0.93)
Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning > Regression (0.46)

Chandrasekhar, Achuth, Farimani, Omid Barati, Ajenifujah, Olabode T., Ock, Janghoon, Farimani, Amir Barati

NANOGPT: A Query-Driven Large Language Model Retrieval-Augmented Generation System for Nanotechnology Research

This paper presents the development and application of a Large Language Model Retrieval-Augmented Generation (LLM-RAG) system tailored for nanotechnology research. The system leverages the capabilities of a sophisticated language model to serve as an intelligent research assistant, enhancing the efficiency and comprehensiveness of literature reviews in the nanotechnology domain. Central to this LLM-RAG system is its advanced query backend retrieval mechanism, which integrates data from multiple reputable sources. The system retrieves relevant literature by utilizing Google Scholar's advanced search and by scraping open-access papers from Elsevier, Springer Nature, and ACS Publications. This multifaceted approach ensures a broad and diverse collection of up-to-date scholarly articles and papers. The proposed system demonstrates significant potential in aiding researchers by providing a streamlined, accurate, and exhaustive literature retrieval process, thereby accelerating research advancements in nanotechnology. The effectiveness of the LLM-RAG system is validated through rigorous testing, illustrating its capability to significantly reduce the time and effort required for comprehensive literature reviews while maintaining high accuracy and query relevance and outperforming standard, publicly available LLMs.

large language model, machine learning, natural language, (18 more...)

2502.20541

Country:

Europe (0.28)
Asia > China (0.14)
North America > United States > Minnesota > Hennepin County > Minneapolis (0.14)

Genre:

Research Report (1.00)
Overview > Innovation (0.45)

Industry:

Law (1.00)
Health & Medicine > Pharmaceuticals & Biotechnology (1.00)
Energy > Oil & Gas > Upstream (1.00)
(9 more...)

Technology:

Information Technology > Artificial Intelligence > Natural Language > Large Language Model (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (1.00)

de Cerqueira, José Siqueira, Kemell, Kai-Kristian, Rousi, Rebekah, Xi, Nannan, Hamari, Juho, Abrahamsson, Pekka

Mapping Trustworthiness in Large Language Models: A Bibliometric Analysis Bridging Theory to Practice

The rapid proliferation of Large Language Models (LLMs) has raised pressing concerns regarding their trustworthiness, spanning issues of reliability, transparency, fairness, and ethical alignment. Despite the increasing adoption of LLMs across various domains, there remains a lack of consensus on how to operationalize trustworthiness in practice. This study bridges the gap between theoretical discussions and implementation by conducting a bibliometric mapping analysis of 2,006 publications from 2019 to 2025. Through co-authorship networks, keyword co-occurrence analysis, and thematic evolution tracking, we identify key research trends, influential authors, and prevailing definitions of LLM trustworthiness. Additionally, a systematic review of 68 core papers is conducted to examine conceptualizations of trust and their practical implications. Our findings reveal that trustworthiness in LLMs is often framed through existing organizational trust frameworks, emphasizing dimensions such as ability, benevolence, and integrity. However, a significant gap exists in translating these principles into concrete development strategies. To address this, we propose a structured mapping of 20 trust-enhancing techniques across the LLM lifecycle, including retrieval-augmented generation (RAG), explainability techniques, and post-training audits. By synthesizing bibliometric insights with practical strategies, this study contributes towards fostering more transparent, accountable, and ethically aligned LLMs, ensuring their responsible deployment in real-world applications.
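As a rough illustration of the keyword co-occurrence analysis mentioned in the abstract above, the following sketch counts how often two keywords appear together on the same publication. It is not the study's pipeline: the function name `keyword_cooccurrence`, the normalization (lowercasing and trimming), and the three toy records are assumptions chosen to keep the example small.

```python
from collections import Counter
from itertools import combinations


def keyword_cooccurrence(records):
    """Count unordered keyword pairs that share a publication.

    `records` is an iterable of keyword lists, one list per publication.
    Keywords are lowercased and de-duplicated per publication so that each
    pair is counted at most once per record.
    """
    counts = Counter()
    for keywords in records:
        unique = sorted({k.strip().lower() for k in keywords})
        for pair in combinations(unique, 2):
            counts[pair] += 1
    return counts


# Toy records (illustrative only, not taken from the study's corpus).
papers = [
    ["trustworthiness", "large language model", "fairness"],
    ["large language model", "Trustworthiness"],
    ["fairness", "transparency"],
]
counts = keyword_cooccurrence(papers)
top_pair, top_count = counts.most_common(1)[0]
```

The resulting pair counts can serve as edge weights of a keyword network, which is the usual input for the kind of thematic mapping a bibliometric study reports.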

bibliometric analysis bridging theory, language model, trustworthiness, (11 more...)

2503.04785

Country:

Europe > France (0.14)
Europe > Finland > Pirkanmaa > Tampere (0.05)
Asia > China (0.04)
(13 more...)

Genre:

Research Report > New Finding (1.00)
Research Report > Experimental Study (0.93)

Industry:

Information Technology > Security & Privacy (1.00)
Government (1.00)
Law (0.93)
(2 more...)

Technology:

Information Technology > Artificial Intelligence > Natural Language > Large Language Model (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (1.00)

LLM-Empowered Class Imbalanced Graph Prompt Learning for Online Drug Trafficking Detection

Ma, Tianyi, Qian, Yiyue, Wang, Zehong, Zhang, Zheyuan, Zhang, Chuxu, Ye, Yanfang

As the market for illicit drugs remains extremely profitable, major online platforms have become direct-to-consumer intermediaries for illicit drug trafficking participants. These online activities raise significant social concerns that require immediate action. Existing approaches to combating this challenge are generally impractical, due to the imbalance of classes and scarcity of labeled samples in real-world applications. To this end, we propose a novel Large Language Model-empowered Heterogeneous Graph Prompt Learning framework for illicit Drug Trafficking detection, called LLM-HetGDT, that leverages an LLM to help heterogeneous graph neural networks (HGNNs) effectively identify drug trafficking activities in class-imbalanced scenarios. Specifically, we first pre-train an HGNN over a contrastive pretext task to capture the inherent node and structure information over the unlabeled drug trafficking heterogeneous graph (HG). Afterward, we employ the LLM to augment the HG by generating high-quality synthetic user nodes in minority classes. Then, we fine-tune the soft prompts on the augmented HG to capture the important information in the minority classes for the downstream drug trafficking detection task. To comprehensively study online illicit drug trafficking activities, we collect a new HG dataset over Twitter, called Twitter-HetDrug. Extensive experiments on this dataset demonstrate the effectiveness, efficiency, and applicability of LLM-HetGDT.

drug trafficking participant, information, node, (10 more...)

2503.019

Country:

North America > United States > Connecticut (0.04)
North America > United States > Indiana > St. Joseph County > Notre Dame (0.04)

Genre: Research Report (0.64)

Industry:

Law > Criminal Law (1.00)
Law Enforcement & Public Safety > Crime Prevention & Enforcement (1.00)
Health & Medicine > Therapeutic Area > Psychiatry/Psychology > Addiction Disorder (0.46)

Technology:

Information Technology > Artificial Intelligence > Natural Language > Large Language Model (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (1.00)

Shimanuki, Gabriel Kenji Godoy, Nascimento, Alexandre Moreira, Vismari, Lucio Flavio, Junior, Joao Batista Camargo, Junior, Jorge Rady de Almeida, Cugnasca, Paulo Sergio

Navigating the Edge with the State-of-the-Art Insights into Corner Case Identification and Generation for Enhanced Autonomous Vehicle Safety

In recent years, there has been significant development of autonomous vehicle (AV) technologies. However, despite the notable achievements of some industry players, a strong and appealing body of evidence demonstrating that AVs are actually safe is lacking, which could foster public distrust in this technology and further compromise the entire development of this industry, as well as related social impacts. To improve the safety of AVs, several techniques are proposed that use synthetic data in virtual simulation. In particular, the highest risk data, known as corner cases (CCs), are the most valuable for developing and testing AV controls, as they can expose and improve the weaknesses of these autonomous systems. In this context, the present paper presents a systematic literature review aiming to comprehensively analyze methodologies for CC identification and generation, also pointing out current gaps and further implications of synthetic data for AV safety and reliability. Based on a set of selection criteria, 110 studies were picked from an initial sample of 1673 papers. These selected papers were mapped into multiple categories to answer eight inter-linked research questions. The review concludes with the recommendation of a more integrated approach focused on safe development among all stakeholders, with active collaboration between industry, academia and regulatory bodies.

autonomous vehicle, international conference, scenario, (15 more...)

2503.00077

Country:

Europe > Switzerland > Zürich > Zürich (0.14)
Asia > China (0.04)
North America > United States > Massachusetts > Middlesex County > Cambridge (0.04)
(10 more...)

Genre:

Overview (1.00)
Research Report > Experimental Study (0.66)

Industry:

Transportation > Passenger (1.00)
Transportation > Ground > Road (1.00)
Leisure & Entertainment (1.00)
(6 more...)

Technology:

Information Technology > Artificial Intelligence > Robots > Autonomous Vehicles (1.00)
Information Technology > Artificial Intelligence > Representation & Reasoning > Agents (1.00)
Information Technology > Artificial Intelligence > Natural Language (1.00)
(4 more...)

Societal Alignment Frameworks Can Improve LLM Alignment

Stańczak, Karolina, Meade, Nicholas, Bhatia, Mehar, Zhou, Hattie, Böttinger, Konstantin, Barnes, Jeremy, Stanley, Jason, Montgomery, Jessica, Zemel, Richard, Papernot, Nicolas, Chapados, Nicolas, Therien, Denis, Lillicrap, Timothy P., Marasović, Ana, Delacroix, Sylvie, Hadfield, Gillian K., Reddy, Siva

Recent progress in large language models (LLMs) has focused on producing responses that meet human expectations and align with shared values - a process termed alignment. However, aligning LLMs remains challenging due to the inherent disconnect between the complexity of human values and the narrow nature of the technological approaches designed to address them. Current alignment methods often lead to misspecified objectives, reflecting the broader issue of incomplete contracts: the impracticality of specifying a contract between a model developer and the model that accounts for every scenario in LLM alignment. In this paper, we argue that improving LLM alignment requires incorporating insights from societal alignment frameworks, including social, economic, and contractual alignment, and discuss potential solutions drawn from these domains. Given the role of uncertainty within societal alignment frameworks, we then investigate how it manifests in LLM alignment. We end our discussion by offering an alternative view on LLM alignment, framing the underspecified nature of its objectives as an opportunity rather than seeking to perfect their specification. Beyond technical improvements in LLM alignment, we discuss the need for participatory alignment interface designs.

alignment, llm alignment, societal alignment framework, (11 more...)

2503.00069

Country:

North America > Canada > Quebec > Montreal (0.14)
North America > Canada > Ontario > Toronto (0.14)
Europe > United Kingdom > England > Cambridgeshire > Cambridge (0.14)
(15 more...)

Genre:

Research Report (0.71)
Overview (0.46)
Instructional Material (0.46)

Industry:

Law (1.00)
Health & Medicine (1.00)
Government > Regional Government (0.46)

Technology:

Information Technology > Artificial Intelligence > Natural Language > Large Language Model (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (1.00)

LexRAG: Benchmarking Retrieval-Augmented Generation in Multi-Turn Legal Consultation Conversation

Li, Haitao, Chen, Yifan, Hu, Yiran, Ai, Qingyao, Chen, Junjie, Yang, Xiaoyu, Yang, Jianhui, Wu, Yueyue, Liu, Zeyang, Liu, Yiqun

Retrieval-augmented generation (RAG) has proven highly effective in improving large language models (LLMs) across various domains. However, there is no benchmark specifically designed to assess the effectiveness of RAG in the legal domain, which restricts progress in this area. To fill this gap, we propose LexRAG, the first benchmark to evaluate RAG systems for multi-turn legal consultations. LexRAG consists of 1,013 multi-turn dialogue samples and 17,228 candidate legal articles. Each sample is annotated by legal experts and consists of five rounds of progressive questioning. LexRAG includes two key tasks: (1) Conversational knowledge retrieval, requiring accurate retrieval of relevant legal articles based on multi-turn context. (2) Response generation, focusing on producing legally sound answers. To ensure reliable reproducibility, we develop LexiT, a legal RAG toolkit that provides a comprehensive implementation of RAG system components tailored for the legal domain. Additionally, we introduce an LLM-as-a-judge evaluation pipeline to enable detailed and effective assessment. Through experimental analysis of various LLMs and retrieval methods, we reveal the key limitations of existing RAG systems in handling legal consultation conversations. LexRAG establishes a new benchmark for the practical application of RAG systems in the legal domain, with its code and data available at https://github.com/CSHaitao/LexRAG.

arxiv preprint arxiv, legal domain, lexrag, (13 more...)

2502.2064

Country:

North America > United States > District of Columbia > Washington (0.05)
Asia > China > Beijing > Beijing (0.05)
North America > United States > New York > New York County > New York City (0.04)
Asia > Myanmar > Tanintharyi Region > Dawei (0.04)

Genre:

Research Report (0.82)
Overview (0.68)

Industry:

Law (1.00)
Information Technology > Security & Privacy (0.46)

Technology:

Information Technology > Artificial Intelligence > Natural Language > Large Language Model (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (1.00)