AITopics

2507.05284

Country:

North America > United States (0.28)
Asia (0.28)

Genre:

Research Report > Promising Solution (0.48)
Overview > Innovation (0.34)

Technology:

Information Technology > Data Science > Data Mining (1.00)
Information Technology > Artificial Intelligence > Natural Language (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.89)
Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (0.82)

arXiv.org Artificial IntelligenceJul-9-2025

A Survey on Model Extraction Attacks and Defenses for Large Language Models

Zhao, Kaixiang, Li, Lincan, Ding, Kaize, Gong, Neil Zhenqiang, Zhao, Yue, Dong, Yushun

Model extraction attacks pose significant security threats to deployed language models, potentially compromising intellectual property and user privacy. This survey provides a comprehensive taxonomy of LLM-specific extraction attacks and defenses, categorizing attacks into functionality extraction, training data extraction, and prompt-targeted attacks. We analyze various attack methodologies including API-based knowledge distillation, direct querying, parameter recovery, and prompt stealing techniques that exploit transformer architectures. We then examine defense mechanisms organized into model protection, data privacy protection, and prompt-targeted strategies, evaluating their effectiveness across different deployment scenarios. We propose specialized metrics for evaluating both attack effectiveness and defense performance, addressing the specific challenges of generative language models. Through our analysis, we identify critical limitations in current approaches and propose promising research directions, including integrated attack methodologies and adaptive defense mechanisms that balance security with model utility. This work serves NLP researchers, ML engineers, and security professionals seeking to protect language models in production environments.

arxiv preprint arxiv, large language model, machine learning, (15 more...)

2506.22521

Country: North America > United States > California > Los Angeles County > Los Angeles (0.28)

Genre:

Overview (1.00)
Research Report > New Finding (0.46)

Industry: Information Technology > Security & Privacy (1.00)

Technology:

Information Technology > Security & Privacy (1.00)
Information Technology > Artificial Intelligence > Natural Language > Large Language Model (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.88)
Information Technology > Artificial Intelligence > Machine Learning > Performance Analysis > Accuracy (0.68)

Tran, April, Bortz, David

Weak Form Scientific Machine Learning: Test Function Construction for System Identification

Weak form Scientific Machine Learning (WSciML) is a recently developed framework for data-driven modeling and scientific discovery. It leverages the weak form of equation error residuals to provide enhanced noise robustness in system identification via convolving model equations with test functions, reformulating the problem to avoid direct differentiation of data. The performance, however, relies on wisely choosing a set of compactly supported test functions. In this work, we mathematically motivate a novel data-driven method for constructing Single-scale-Local reference functions for creating the set of test functions. Our approach numerically approximates the integration error introduced by the quadrature and identifies the support size for which the error is minimal, without requiring access to the model parameter values. Through numerical experiments across various models, noise levels, and temporal resolutions, we demonstrate that the selected supports consistently align with regions of minimal parameter estimation error. We also compare the proposed method against the strategy for constructing Multi-scale-Global (and orthogonal) test functions introduced in our prior work, demonstrating the improved computational efficiency.

artificial intelligence, machine learning, test function, (17 more...)

2507.03206

Country:

North America > United States > Colorado > Boulder County > Boulder (0.14)
North America > United States > Virginia > Hampton (0.04)
North America > United States > New York > New York County > New York City (0.04)
(4 more...)

Genre:

Overview (0.67)
Research Report (0.50)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (0.88)
Information Technology > Artificial Intelligence > Representation & Reasoning > Model-Based Reasoning (0.60)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.60)

López-Montero, Daniel, Álvarez-Aldana, José L., Morales-Martínez, Alicia, Gil-López, Marta, García, Juan M. Auñón

Reinforcement Learning for Automated Cybersecurity Penetration Testing

This paper aims to provide an innovative machine learning-based solution to automate security testing tasks for web applications, ensuring the correct functioning of all components while reducing project maintenance costs. Reinforcement Learning is proposed to select and prioritize tools and optimize the testing path. The presented approach utilizes a simulated webpage along with its network topology to train the agent. Additionally, the model leverages Geometric Deep Learning to create priors that reduce the search space and improve learning convergence. The validation and testing process was conducted on real-world vulnerable web pages commonly used by human hackers for learning. As a result of this study, a reinforcement learning algorithm was developed that maximizes the number of vulnerabilities found while minimizing the number of steps required

machine learning, reinforcement learning, vulnerability, (16 more...)

2507.02969

Country:

Europe > Spain > Galicia > Madrid (0.05)
Asia > Middle East > UAE > Dubai Emirate > Dubai (0.04)
Asia > Japan > Honshū > Tōhoku > Fukushima Prefecture > Fukushima (0.04)

Genre:

Research Report (1.00)
Overview (0.68)

Industry: Information Technology > Security & Privacy (1.00)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Reinforcement Learning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Learning Graphical Models > Undirected Networks > Markov Models (0.46)

Mortezapour, Alireza, Vitiello, Giuliana

Human-centered AI with focus on Human-robot interaction (Book chapter)

Modern social robots can be considered the descendants of steam engines from the First Industrial Revolution (IR 1.0) and industrial robotic arms from the Third Industrial Revolution (IR 3.0). As some time has passed since the introduction of these robots during the Fourth Industrial Revolution (IR 4.0), challenges and issues in their interaction with humans have emerged, leading researchers to conclude that, like any other AI-based technology, these robots must also be human-centered to meet the needs of their users. This chapter aims to introduce humans and their needs in interactions with robots, ranging from short-term, one-on-one interactions (micro-level) to long-term, macro-level needs at the societal scale. Building upon the principles of human-centered AI, this chapter presents, for the first time, a new framework of human needs called the Dual Pyramid. This framework encompasses a comprehensive list of human needs in robot interactions, from the most fundamental, robot effectiveness to macro level requirements, such as the collaboration with robots in achieving the United Nations 17 Sustainable Development Goals.

artificial intelligence, machine learning, robot, (20 more...)

2507.04095

Country:

Europe > Italy (0.04)
Europe > Greece (0.04)
North America > United States > New Jersey (0.04)
(16 more...)

Genre:

Research Report > New Finding (1.00)
Overview (0.93)
Instructional Material (0.92)
Research Report > Experimental Study (0.92)

Industry:

Information Technology > Security & Privacy (1.00)
Information Technology > Robotics & Automation (1.00)
Health & Medicine > Consumer Health (1.00)
(6 more...)

Technology:

Information Technology > Artificial Intelligence > Robots (1.00)
Information Technology > Artificial Intelligence > Issues > Social & Ethical Issues (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.46)

Srivastava, Saksham Sahai, Aggarwal, Vaneet

A Technical Survey of Reinforcement Learning Techniques for Large Language Models

Reinforcement Learning (RL) has emerged as a transformative approach for aligning and enhancing Large Language Models (LLMs), addressing critical challenges in instruction following, ethical alignment, and reasoning capabilities. This survey offers a comprehensive foundation on the integration of RL with language models, highlighting prominent algorithms such as Proximal Policy Optimization (PPO), Q-Learning, and Actor-Critic methods. Additionally, it provides an extensive technical overview of RL techniques specifically tailored for LLMs, including foundational methods like Reinforcement Learning from Human Feedback (RLHF) and AI Feedback (RLAIF), as well as advanced strategies such as Direct Preference Optimization (DPO) and Group Relative Policy Optimization (GRPO). We systematically analyze their applications across domains, i.e., from code generation to tool-augmented reasoning. We also present a comparative taxonomy based on reward modeling, feedback mechanisms, and optimization strategies. Our evaluation highlights key trends. RLHF remains dominant for alignment, and outcome-based RL such as RLVR significantly improves stepwise reasoning. However, persistent challenges such as reward hacking, computational costs, and scalable feedback collection underscore the need for continued innovation. We further discuss emerging directions, including hybrid RL algorithms, verifier-guided training, and multi-objective alignment frameworks. This survey serves as a roadmap for researchers advancing RL-driven LLM development, balancing capability enhancement with safety and scalability.

large language model, machine learning, reinforcement learning, (16 more...)

2507.04136

Country:

Asia > Middle East > UAE > Abu Dhabi Emirate > Abu Dhabi (0.14)
North America > United States > Colorado > Boulder County > Boulder (0.14)
South America > Colombia > Meta Department > Villavicencio (0.04)
(5 more...)

Genre:

Overview (1.00)
Research Report > Promising Solution (0.34)

Industry:

Education (1.00)
Health & Medicine (0.92)

Technology:

Information Technology > Artificial Intelligence > Natural Language > Large Language Model (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Reinforcement Learning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (1.00)

Beddar-Wiesing, Silvia, Moallemy-Oureh, Alice, Kempkes, Marie, Thomas, Josephine M.

Absolute Evaluation Measures for Machine Learning: A Survey

Machine Learning is a diverse field applied across various domains such as computer science, social sciences, medicine, chemistry, and finance. This diversity results in varied evaluation approaches, making it difficult to compare models effectively. Absolute evaluation measures offer a practical solution by assessing a model's performance on a fixed scale, independent of reference models and data ranges, enabling explicit comparisons. However, many commonly used measures are not universally applicable, leading to a lack of comprehensive guidance on their appropriate use. This survey addresses this gap by providing an overview of absolute evaluation metrics in ML, organized by the type of learning problem. While classification metrics have been extensively studied, this work also covers clustering, regression, and ranking metrics. By grouping these measures according to the specific ML challenges they address, this survey aims to equip practitioners with the tools necessary to select appropriate metrics for their models. The provided overview thus improves individual model evaluation and facilitates meaningful comparisons across different models and applications.

artificial intelligence, evaluation measure, machine learning, (17 more...)

2507.03392

Country:

Europe > Germany > Baden-Württemberg > Karlsruhe Region > Karlsruhe (0.04)
North America > United States > Massachusetts > Suffolk County > Boston (0.04)
Europe > United Kingdom > England > Cambridgeshire > Cambridge (0.04)

Genre:

Research Report (1.00)
Overview (1.00)

Industry: Health & Medicine (1.00)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Performance Analysis > Accuracy (1.00)

Samimi, Reza, Bhattacharya, Aditya, Gosak, Lucija, Stiglic, Gregor, Verbert, Katrien

Visual-Conversational Interface for Evidence-Based Explanation of Diabetes Risk Prediction

Healthcare professionals need effective ways to use, understand, and validate AI-driven clinical decision support systems. Existing systems face two key limitations: complex visualizations and a lack of grounding in scientific evidence. We present an integrated decision support system that combines interactive visualizations with a conversational agent to explain diabetes risk assessments. We propose a hybrid prompt handling approach combining fine-tuned language models for analytical queries with general Large Language Models (LLMs) for broader medical questions, a methodology for grounding AI explanations in scientific evidence, and a feature range analysis technique to support deeper understanding of feature contributions. We conducted a mixed-methods study with 30 healthcare professionals and found that the conversational interactions helped healthcare professionals build a clear understanding of model assessments, while the integration of scientific evidence calibrated trust in the system's decisions. Most participants reported that the system supported both patient risk evaluation and recommendation.

explanation, large language model, natural language, (18 more...)

2507.0292

Country:

North America > Canada > Ontario > Waterloo Region > Waterloo (0.05)
Europe > Belgium > Flanders > Flemish Brabant > Leuven (0.04)
Europe > Slovenia > Drava > Municipality of Maribor > Maribor (0.04)
(4 more...)

Genre:

Research Report > New Finding (1.00)
Research Report > Experimental Study (1.00)
Overview (0.93)

Industry: Health & Medicine > Therapeutic Area > Endocrinology > Diabetes (1.00)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning > Personal Assistant Systems (1.00)
Information Technology > Artificial Intelligence > Natural Language > Large Language Model (1.00)
Information Technology > Artificial Intelligence > Natural Language > Chatbot (1.00)
Information Technology > Artificial Intelligence > Issues > Social & Ethical Issues (1.00)

Lessons from a Chimp: AI "Scheming" and the Quest for Ape Language

Summerfield, Christopher, Luettgau, Lennart, Dubois, Magda, Kirk, Hannah Rose, Hackenburg, Kobi, Fist, Catherine, Slama, Katarina, Ding, Nicola, Anselmetti, Rebecca, Strait, Andrew, Giulianelli, Mario, Ududec, Cozmin

We examine recent research that asks whether current AI systems may be developing a capacity for "scheming" (covertly and strategically pursuing misaligned goals). We compare current research practices in this field to those adopted in the 1970s to test whether non-human primates could master natural language. We argue that there are lessons to be learned from that historical research endeavour, which was characterised by an overattribution of human traits to other agents, an excessive reliance on anecdote and descriptive analysis, and a failure to articulate a strong theoretical framework for the research. We recommend that research into AI scheming actively seeks to avoid these pitfalls. We outline some concrete steps that can be taken for this research programme to advance in a productive and scientifically rigorous fashion.

large language model, machine learning, natural language, (21 more...)

2507.03409

Country:

Europe > United Kingdom > England > Oxfordshire > Oxford (0.14)
North America > United States > New York > New York County > New York City (0.04)
North America > United States > Massachusetts > Middlesex County > Cambridge (0.04)
(2 more...)

Genre:

Overview (1.00)
Research Report > New Finding (0.46)

Industry: Leisure & Entertainment (0.68)

Technology:

Information Technology > Artificial Intelligence > Natural Language > Large Language Model (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (1.00)
Information Technology > Artificial Intelligence > Cognitive Science (1.00)
(2 more...)

Towards a Comparative Framework for Compositional AI Models

Duneau, Tiffany

The DisCoCirc framework for natural language processing allows the construction of compositional models of text, by combining units for individual words together according to the grammatical structure of the text. The compositional nature of a model can give rise to two things: compositional generalisation -- the ability of a model to generalise outside its training distribution by learning compositional rules underpinning the entire data distribution -- and compositional interpretability -- making sense of how the model works by inspecting its modular components in isolation, as well as the processes through which these components are combined. We present these notions in a framework-agnostic way using the language of category theory, and adapt a series of tests for compositional generalisation to this setting. Applying this to the DisCoCirc framework, we consider how well a selection of models can learn to compositionally generalise. We compare both quantum circuit based models, as well as classical neural networks, on a dataset derived from one of the bAbI tasks, extended to test a series of aspects of compositionality. Both architectures score within 5% of one another on the productivity and substitutivity tasks, but differ by at least 10% for the systematicity task, and exhibit different trends on the overgeneralisation tasks. Overall, we find the neural models are more prone to overfitting the Train data. Additionally, we demonstrate how to interpret a compositional model on one of the trained models. By considering how the model components interact with one another, we explain how the model behaves.

artificial intelligence, machine learning, natural language, (20 more...)

2507.0294

Country:

Europe > United Kingdom > England > Oxfordshire > Oxford (0.14)
North America > United States > California > San Diego County > San Diego (0.04)
Europe > United Kingdom > England > Cambridgeshire > Cambridge (0.04)
(4 more...)

Genre:

Research Report (0.64)
Instructional Material (0.46)
Overview (0.45)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning (1.00)
Information Technology > Artificial Intelligence > Natural Language > Grammars & Parsing (0.93)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.46)