AITopics

2503.13657

Country:

North America > United States > New Jersey > Mercer County > Princeton (0.04)
North America > Canada > Alberta > Census Division No. 6 > Calgary Metropolitan Region > Calgary (0.04)

Genre: Research Report > New Finding (1.00)

Industry:

Health & Medicine (0.67)
Information Technology (0.67)
Leisure & Entertainment > Games (0.46)
Media > Music (0.46)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning > Agents (1.00)
Information Technology > Artificial Intelligence > Natural Language > Large Language Model (1.00)

Siedler, Philipp D., Gemp, Ian

LLM-Mediated Guidance of MARL Systems

In complex multi-agent environments, achieving efficient learning and desirable behaviours is a significant challenge for Multi-Agent Reinforcement Learning (MARL) systems. This work explores the potential of combining MARL with Large Language Model (LLM)-mediated interventions to guide agents toward more desirable behaviours. Specifically, we investigate how LLMs can be used to interpret and facilitate interventions that shape the learning trajectories of multiple agents. We experimented with two types of interventions, referred to as controllers: a Natural Language (NL) Controller and a Rule-Based (RB) Controller. The NL Controller, which uses an LLM to simulate human-like interventions, showed a stronger impact than the RB Controller. Our findings indicate that agents particularly benefit from early interventions, leading to more efficient training and higher performance. Both intervention types outperform the baseline without interventions, highlighting the potential of LLM-mediated guidance to accelerate training and enhance MARL performance in challenging environments.

large language model, machine learning, natural language

2503.13553

Country:

North America > United States > California > Los Angeles County > Long Beach (0.04)
North America > Canada > Ontario > Toronto (0.04)
Europe > United Kingdom (0.04)

Genre: Research Report > New Finding (1.00)

Industry: Leisure & Entertainment (0.46)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning > Agents (1.00)
Information Technology > Artificial Intelligence > Natural Language > Large Language Model (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (1.00)

Shah, Shraddha Pradipbhai, Deshpande, Aditya Vilas

Enforcing Cybersecurity Constraints for LLM-driven Robot Agents for Online Transactions

The integration of Large Language Models (LLMs) into autonomous robotic agents for conducting online transactions poses significant cybersecurity challenges. This study aims to enforce robust cybersecurity constraints to mitigate the risks associated with data breaches, transaction fraud, and system manipulation. The background focuses on the rise of LLM-driven robotic systems in e-commerce, finance, and service industries, alongside the vulnerabilities they introduce. A novel security architecture combining blockchain technology with multi-factor authentication (MFA) and real-time anomaly detection was implemented to safeguard transactions. Key performance metrics such as transaction integrity, response time, and breach detection accuracy were evaluated, showing improved security and system performance. The results highlight that the proposed architecture reduced fraudulent transactions by 90%, improved breach detection accuracy to 98%, and ensured secure transaction validation within a latency of 0.05 seconds. These findings emphasize the importance of cybersecurity in the deployment of LLM-driven robotic systems and suggest a framework adaptable to various online platforms.

artificial intelligence, large language model, natural language

2503.15546

Country: Europe > Switzerland (0.04)

Genre: Research Report (0.64)

Industry:

Information Technology > Security & Privacy (1.00)
Government > Military > Cyberwarfare (1.00)

Technology:

Information Technology > Security & Privacy (1.00)
Information Technology > Artificial Intelligence > Robots (1.00)
Information Technology > Artificial Intelligence > Representation & Reasoning > Agents (1.00)
Information Technology > Artificial Intelligence > Natural Language > Large Language Model (1.00)

Ong, Kenneth J. K., Jun, Lye Jia, Nguyen, Hieu Minh "Jord", Cho, Seong Hah, Antolín, Natalia Pérez-Campanero

Identifying Cooperative Personalities in Multi-agent Contexts through Personality Steering with Representation Engineering

As Large Language Models (LLMs) gain autonomous capabilities, their coordination in multi-agent settings becomes increasingly important. However, they often struggle with cooperation, leading to suboptimal outcomes. Inspired by Axelrod's Iterated Prisoner's Dilemma (IPD) tournaments, we explore how personality traits influence LLM cooperation. Using representation engineering, we steer Big Five traits (e.g., Agreeableness, Conscientiousness) in LLMs and analyze their impact on IPD decision-making. Our results show that higher Agreeableness and Conscientiousness improve cooperation but increase susceptibility to exploitation, highlighting both the potential and limitations of personality-based steering for aligning AI agents.
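
The cooperation-versus-exploitation trade-off described in this abstract can be illustrated with a minimal Iterated Prisoner's Dilemma sketch. It is not the authors' code: the payoff values are the conventional ones from Axelrod's tournaments (5, 3, 1, 0), and the three hand-written strategies and the `play` helper are hypothetical stand-ins for steered LLM agents.

```python
from typing import Callable, Dict, List, Tuple

# Conventional Axelrod payoffs (assumed here, not taken from the paper):
# key is (my_move, their_move), value is my score. "C" = cooperate, "D" = defect.
PAYOFF: Dict[Tuple[str, str], int] = {
    ("C", "C"): 3,  # mutual cooperation
    ("C", "D"): 0,  # I am exploited
    ("D", "C"): 5,  # I exploit
    ("D", "D"): 1,  # mutual defection
}

# A strategy sees only the opponent's past moves and returns "C" or "D".
Strategy = Callable[[List[str]], str]


def always_cooperate(opponent_history: List[str]) -> str:
    """Hypothetical stand-in for a maximally agreeable agent."""
    return "C"


def always_defect(opponent_history: List[str]) -> str:
    """Hypothetical stand-in for an exploitative agent."""
    return "D"


def tit_for_tat(opponent_history: List[str]) -> str:
    """Cooperate first, then copy the opponent's previous move."""
    return opponent_history[-1] if opponent_history else "C"


def play(a: Strategy, b: Strategy, rounds: int = 10) -> Tuple[int, int]:
    """Play `rounds` rounds and return the total scores of a and b."""
    history_a: List[str] = []
    history_b: List[str] = []
    score_a = score_b = 0
    for _ in range(rounds):
        move_a = a(history_b)
        move_b = b(history_a)
        score_a += PAYOFF[(move_a, move_b)]
        score_b += PAYOFF[(move_b, move_a)]
        history_a.append(move_a)
        history_b.append(move_b)
    return score_a, score_b


if __name__ == "__main__":
    print(play(always_cooperate, always_cooperate))  # (30, 30)
    print(play(always_cooperate, always_defect))     # (0, 50)
    print(play(tit_for_tat, always_defect))          # (9, 14)
```

Two unconditional cooperators do well together, but the same cooperator scores nothing against a defector, while a retaliating strategy limits the loss after the first round. This is one simple way to picture the susceptibility to exploitation that the abstract reports.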

artificial intelligence, large language model, natural language

2503.12722

Country:

North America > Mexico > Mexico City > Mexico City (0.04)
Asia > Singapore (0.04)

Genre: Research Report > New Finding (1.00)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning > Agents (1.00)
Information Technology > Artificial Intelligence > Natural Language > Large Language Model (1.00)

Advancing Human-Machine Teaming: Concepts, Challenges, and Applications

Chen, Dian, Yoon, Han Jun, Wan, Zelin, Alluru, Nithin, Lee, Sang Won, He, Richard, Moore, Terrence J., Nelson, Frederica F., Yoon, Sunghyun, Lim, Hyuk, Kim, Dan Dongseong, Cho, Jin-Hee

Human-Machine Teaming (HMT) is revolutionizing collaboration across domains such as defense, healthcare, and autonomous systems by integrating AI-driven decision-making, trust calibration, and adaptive teaming. This survey presents a comprehensive taxonomy of HMT, analyzing theoretical models, including reinforcement learning, instance-based learning, and interdependence theory, alongside interdisciplinary methodologies. Unlike prior reviews, we examine team cognition, ethical AI, multi-modal interactions, and real-world evaluation frameworks. Key challenges include explainability, role allocation, and scalable benchmarking. We propose future research in cross-domain adaptation, trust-aware AI, and standardized testbeds. By bridging computational and social sciences, this work lays a foundation for resilient, ethical, and scalable HMT systems.

data mining, machine learning, real time system

2503.16518

Country:

North America > United States > California > Los Angeles County > Los Angeles (0.14)
North America > United States > Virginia > Arlington County > Arlington (0.04)
Oceania > Australia > Queensland > Brisbane (0.04)

Genre:

Research Report (1.00)
Overview (1.00)

Industry:

Transportation (1.00)
Information Technology > Security & Privacy (1.00)
Health & Medicine > Consumer Health (1.00)

Technology:

Information Technology > Human Computer Interaction > Interfaces (1.00)
Information Technology > Communications > Social Media (1.00)
Information Technology > Communications > Collaboration (1.00)

VeriLA: A Human-Centered Evaluation Framework for Interpretable Verification of LLM Agent Failures

Sung, Yoo Yeon, Kim, Hannah, Zhang, Dan

AI practitioners increasingly use large language model (LLM) agents in compound AI systems to solve complex reasoning tasks, but these agent executions often fail to meet human standards, leading to errors that compromise the system's overall performance. Addressing these failures through human intervention is challenging due to the agents' opaque reasoning processes, misalignment with human expectations, the complexity of agent dependencies, and the high cost of manual inspection. This paper thus introduces a human-centered evaluation framework for Verifying LLM Agent failures (VeriLA), which systematically assesses agent failures to reduce human effort and make these agent failures interpretable to humans. The framework first defines clear expectations of each agent by curating human-designed agent criteria. Then, it develops a human-aligned agent verifier module, trained with human gold standards, to assess each agent's execution output. This approach enables granular evaluation of each agent's performance by revealing failures from a human standard, offering clear guidelines for revision, and reducing human cognitive load. Our case study results show that VeriLA is both interpretable and efficient in helping practitioners interact more effectively with the system. By upholding accountability in human-agent collaboration, VeriLA paves the way for more trustworthy and human-aligned compound AI systems.

artificial intelligence, large language model, natural language

2503.12651

Country:

North America > United States > New York > New York County > New York City (0.04)
North America > United States > Hawaii > Honolulu County > Honolulu (0.04)
North America > United States > Maryland (0.04)

Genre: Research Report > New Finding (0.48)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning > Agents (1.00)
Information Technology > Artificial Intelligence > Natural Language > Large Language Model (1.00)

Synthetic Data for Robust AI Model Development in Regulated Enterprises

Godbole, Aditi

In today's business landscape, organizations need to find the right balance between using their customers' data ethically to power AI solutions and complying with data privacy and data usage regulations. In this paper, we discuss synthetic data as a possible solution to this dilemma. Synthetic data is simulated data that mimics the real data. We explore how organizations in heavily regulated industries, such as financial institutions or healthcare organizations, can leverage synthetic data to build robust AI solutions while staying compliant. We demonstrate that synthetic data offers two significant advantages: it allows AI models to learn from more diverse data, and it helps organizations stay compliant with data privacy laws by using synthetic data instead of customer information. We discuss case studies to show how synthetic data can be effectively used in the finance and healthcare sectors while discussing the challenges of using synthetic data and some ethical questions it raises. Our research finds that synthetic data could be a game-changer for AI in regulated industries. The potential can be realized when industry, academia, and regulators collaborate to build solutions. We aim to initiate discussions on the use of synthetic data to build ethical, responsible, and effective AI systems in regulated enterprise industries.

artificial intelligence, machine learning, natural language

2503.12353

Country:

North America > United States > Washington > King County > Bellevue (0.04)
Asia > Middle East > Oman > Muscat Governorate > Muscat (0.04)

Genre: Research Report > Promising Solution (0.66)

Industry:

Law (1.00)
Information Technology > Security & Privacy (1.00)
Health & Medicine (1.00)
Banking & Finance (1.00)

Technology:

Information Technology > Security & Privacy (1.00)
Information Technology > Artificial Intelligence > Natural Language (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks (0.94)

AI Agents: Evolution, Architecture, and Real-World Applications

Krishnan, Naveen

Artificial Intelligence (AI) has evolved dramatically over the past decade, transitioning from specialized systems designed for narrow tasks to increasingly sophisticated architectures capable of autonomous operation across diverse domains. Among these advancements, AI agents represent a particularly significant development, embodying a paradigm shift in how intelligent systems interact with their environments, make decisions, and achieve complex goals. Unlike traditional AI systems that execute predefined algorithms within constraints, AI agents possess the capacity to autonomously perceive, reason, and act, often adapting their behavior based on environmental feedback and accumulated experience. The concept of an AI agent refers to a system or program that is capable of autonomously performing tasks on behalf of a user or another system by designing its workflow and utilizing available tools. These agents can encompass a wide range of functionalities beyond natural language processing, including decision making, problem solving, interacting with external environments, and executing actions. As Kapoor et al. (2024) note in their analysis of agent benchmarks, the development of AI agents represents an exciting new research direction with significant implications for real-world applications across numerous industries. The evolution of AI agents has been accelerated by recent breakthroughs in large language models (LLMs), which have provided a foundation for more sophisticated reasoning capabilities. Modern AI agents leverage these advanced language models as core components, augmenting them with specialized modules for memory, planning, tool use, and environmental interaction. This integration enables agents to perform complex tasks that would be challenging or impossible for traditional AI systems, from reconciling financial statements to providing step-by-step instructions for field technicians based on contextual understanding of product information.

artificial intelligence, machine learning, natural language

2503.12687

Genre: Instructional Material (1.00)

Industry:

Transportation (1.00)
Law (1.00)
Information Technology > Security & Privacy (1.00)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning > Agents (1.00)
Information Technology > Artificial Intelligence > Natural Language (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (1.00)

STEVE: A Step Verification Pipeline for Computer-use Agent Training

Lu, Fanbin, Zhong, Zhisheng, Wei, Ziqin, Liu, Shu, Fu, Chi-Wing, Jia, Jiaya

Developing AI agents to autonomously manipulate graphical user interfaces is a long-standing and challenging task. Recent advances in data scaling laws inspire us to train computer-use agents with a scaled instruction set, yet using behavior cloning to train agents still requires an immense number of high-quality trajectories. To meet the scalability need, we designed STEVE, a step verification pipeline for computer-use agent training. First, we establish a large instruction set for computer-use agents and collect trajectory data with some suboptimal agents. Next, GPT-4o is used to verify the correctness of each step in the trajectories based on the screens before and after the action execution, assigning each step a binary label. Last, we adopt Kahneman and Tversky Optimization to optimize the agent from the binary stepwise labels. Extensive experiments demonstrate that our agent outperforms supervised finetuning by leveraging both positive and negative actions within a trajectory. Also, STEVE enables us to train a 7B vision-language model as a computer-use agent, achieving leading performance in the challenging live desktop environment WinAgentArena with great efficiency at a reduced cost. Code and data: https://github.com/FanbinLu/STEVE.

large language model, machine learning, natural language

2503.12532

Country:

North America > United States > New York (0.04)
Asia > China > Shanghai > Shanghai (0.04)
Asia > China > Beijing > Beijing (0.04)

Genre:

Research Report (1.00)
Workflow (0.70)

Industry: Information Technology (0.46)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning > Agents (0.88)
Information Technology > Artificial Intelligence > Natural Language > Large Language Model (0.71)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.52)

CDKFormer: Contextual Deviation Knowledge-Based Transformer for Long-Tail Trajectory Prediction

Lian, Yuansheng, Zhang, Ke, Li, Meng

Predicting the future movements of surrounding vehicles is essential for ensuring the safe operation and efficient navigation of autonomous vehicles (AVs) in urban traffic environments. Existing vehicle trajectory prediction methods primarily focus on improving overall performance, yet they struggle to address long-tail scenarios effectively. This limitation often leads to poor predictions in rare cases, significantly increasing the risk of safety incidents. Taking the Argoverse 2 motion forecasting dataset as an example, we first investigate the long-tail characteristics in trajectory samples from two perspectives, individual motion and group interaction, and derive deviation features to distinguish abnormal from regular scenarios. On this basis, we propose CDKFormer, a Contextual Deviation Knowledge-based Transformer model for long-tail trajectory prediction. CDKFormer integrates an attention-based scene context fusion module to encode spatiotemporal interaction and road topology. An additional deviation feature fusion module is proposed to capture the dynamic deviations in the target vehicle status. We further introduce a dual query-based decoder, supported by a multi-stream decoder block, to sequentially decode heterogeneous scene deviation features and generate multimodal trajectory predictions. Extensive experiments demonstrate that CDKFormer achieves state-of-the-art performance, significantly enhancing prediction accuracy and robustness for long-tailed trajectories compared to existing methods, thus advancing the reliability of AVs in complex real-world environments.

artificial intelligence, machine learning, prediction

2503.12695

Country: Asia > China > Beijing > Beijing (0.04)

Genre: Research Report (1.00)

Industry:

Transportation > Infrastructure & Services (0.46)
Transportation > Ground > Road (0.46)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (1.00)
Information Technology > Artificial Intelligence > Representation & Reasoning > Agents (0.93)
Information Technology > Artificial Intelligence > Robots > Autonomous Vehicles (0.88)