Agents
AI for Better UX in Computer-Aided Engineering: Is Academia Catching Up with Industry Demands? A Multivocal Literature Review
Uulu, Choro Ulan, Kulyabin, Mikhail, Etaiwi, Layan, Pacheco, Nuno Miguel Martins, Joosten, Jan, Röse, Kerstin, Petridis, Filippos, Bosch, Jan, Olsson, Helena Holmström
Computer-Aided Engineering (CAE) enables simulation experts to optimize complex models, but faces challenges in user experience (UX) that limit efficiency and accessibility. While artificial intelligence (AI) has demonstrated potential to enhance CAE processes, research integrating these fields with a focus on UX remains fragmented. This paper presents a multivocal literature review (MLR) examining how AI enhances UX in CAE software across both academic research and industry implementations. Our analysis reveals significant gaps between academic explorations and industry applications, with companies actively implementing LLMs, adaptive UIs, and recommender systems while academic research focuses primarily on technical capabilities without UX validation. Key findings demonstrate opportunities in AI-powered guidance, adaptive interfaces, and workflow automation that remain underexplored in current research. By mapping the intersection of these domains, this study provides a foundation for future work to address the identified research gaps and advance the integration of AI to improve CAE user experience.
SVAgent: AI Agent for Hardware Security Verification Assertion
Guo, Rui, Ayalasomayajula, Avinash, Li, Henian, Zhou, Jingbo, Saha, Sujan Kumar, Farahmandi, Farimah
Verification using SystemVerilog assertions (SVA) is one of the most popular methods for detecting circuit design vulnerabilities. However, with the globalization of integrated circuit design and the continuous upgrading of security requirements, the SVA development model has exposed major limitations. It is not only inefficient in development, but also unable to effectively deal with the increasing number of security vulnerabilities in modern complex integrated circuits. In response to these challenges, this paper proposes an innovative SVA automatic generation framework SVAgent. SVAgent introduces a requirement decomposition mechanism to transform the original complex requirements into a structured, gradually solvable fine-grained problem-solving chain. Experiments have shown that SVAgent can effectively suppress the influence of hallucinations and random answers, and the key evaluation indicators such as the accuracy and consistency of the SVA are significantly better than existing frameworks. More importantly, we successfully integrated SVAgent into the most mainstream integrated circuit vulnerability assessment framework and verified its practicality and reliability in a real engineering design environment.
ADEPTS: A Capability Framework for Human-Centered Agent Design
D'Oro, Pierluca, Drooff, Caley, Chen, Joy, Tighe, Joseph
Large language models have paved the way to powerful and flexible AI agents, assisting humans by increasingly integrating into their daily life. This flexibility, potential, and growing adoption demands a holistic and cross-disciplinary approach to developing, monitoring and discussing the capabilities required for agent-driven user experiences. However, current guidance on human-centered AI agent development is scattered: UX heuristics focus on interface behaviors, engineering taxonomies describe internal pipelines, and ethics checklists address high-level governance. There is no concise, user-facing vocabulary that tells teams what an agent should fundamentally be able to do. We introduce ADEPTS, a capability framework defining a set of core user-facing capabilities to provide unified guidance around the development of AI agents. ADEPTS is based on six principles for human-centered agent design, that express the minimal, user-facing capabilities an AI agent should demonstrate to be understandable, controllable and trustworthy in everyday use. ADEPTS complements existing frameworks and taxonomies; differently from them, it sits at the interface between technical and experience development. By presenting ADEPTS, we aim to condense complex AI-UX requirements into a compact framework that is actionable guidance for AI researchers, designers, engineers, and policy reviewers alike. We believe ADEPTS has the potential of accelerating the improvement of user-relevant agent capabilities, of easing the design of experiences that take advantage of those capabilities, and of providing a shared language to track and discuss progress around the development of AI agents.
Empowering Virtual Agents With Intelligent Systems
While embodied AI is commonly understood as general-purpose intelligence that empowers various forms of robotics,9 we believe that its scope extends significantly beyond robotic platforms alone. Embodied AI, as we define it, refers to intelligent systems capable of learning from and actively interacting with their environments, continuously adapting based on real-time sensor feedback and context-driven decision-making. Specifically, we define Environmental Embodied AI as an intelligent virtual agent capable of real-time perception, learning, and interaction with its surrounding environment through sensor inputs, enabling it to actuate environmental elements, e.g. Distinct from traditional embodied AI systems primarily associated with robotic platforms, Environmental Embodied AI specifically emphasizes non-robotic applications, employing virtual agents to directly influence physical or operational states within environments. These intelligent systems autonomously analyze environmental data, dynamically adapting behaviors to optimize outcomes and significantly reduce ecological footprints, inherently supporting environmentally sustainable practices.
Semi-Supervised Federated Learning via Dual Contrastive Learning and Soft Labeling for Intelligent Fault Diagnosis
Dai, Yajiao, Li, Jun, Mei, Zhen, Ni, Yiyang, Jin, Shi, Li, Zengxiang, Guo, Sheng, Xiang, Wei
--Intelligent fault diagnosis (IFD) plays a crucial role in ensuring the safe operation of industrial machinery and improving production efficiency. However, traditional supervised deep learning methods require a large amount of training data and labels, which are often located in different clients. Additionally, the cost of data labeling is high, making labels difficult to acquire. Meanwhile, differences in data distribution among clients may also hinder the model's performance. T o tackle these challenges, this paper proposes a semi-supervised federated learning framework, SSFL-DCSL, which integrates dual contrastive loss and soft labeling to address data and label scarcity for distributed clients with few labeled samples while safeguarding user privacy. It enables representation learning using unlabeled data on the client side and facilitates joint learning among clients through prototypes, thereby achieving mutual knowledge sharing and preventing local model divergence. Specifically, first, a sample weighting function based on the Laplace distribution is designed to alleviate bias caused by low confidence in pseudo labels during the semi-supervised training process. Second, a dual contrastive loss is introduced to mitigate model divergence caused by different data distributions, comprising local contrastive loss and global contrastive loss. Third, local prototypes are aggregated on the server with weighted averaging and updated with momentum to share knowledge among clients. T o evaluate the proposed SSFL-DCSL framework, experiments are conducted on two publicly available datasets and a dataset collected on motors from the factory. In the most challenging task, where only 10% of the data are labeled, the proposed SSFL-DCSL can improve accuracy by 1.15% to 7.85% over state-of-the-art methods. Dai and Z. Mei are with the School of Electronic and Optical Engineering, Nanjing University of Science and Technology, Nanjing 210094, China (e-mail: { yajiao.dai, J. Li and S. Jin are with the School of Information Science and Engineering, Southeast University, Nanjing, 210096, China (e-mail: jun.li, jinshi@seu.edu.cn).
The Emergence of Deep Reinforcement Learning for Path Planning
Nguyen, Thanh Thi, Nahavandi, Saeid, Razzak, Imran, Nguyen, Dung, Pham, Nhat Truong, Nguyen, Quoc Viet Hung
The increasing demand for autonomous systems in complex and dynamic environments has driven significant research into intelligent path planning methodologies. For decades, graph-based search algorithms, linear programming techniques, and evolutionary computation methods have served as foundational approaches in this domain. Recently, deep reinforcement learning (DRL) has emerged as a powerful method for enabling autonomous agents to learn optimal navigation strategies through interaction with their environments. This survey provides a comprehensive overview of traditional approaches as well as the recent advancements in DRL applied to path planning tasks, focusing on autonomous vehicles, drones, and robotic platforms. Key algorithms across both conventional and learning-based paradigms are categorized, with their innovations and practical implementations highlighted. This is followed by a thorough discussion of their respective strengths and limitations in terms of computational efficiency, scalability, adaptability, and robustness. The survey concludes by identifying key open challenges and outlining promising avenues for future research. Special attention is given to hybrid approaches that integrate DRL with classical planning techniques to leverage the benefits of both learning-based adaptability and deterministic reliability, offering promising directions for robust and resilient autonomous navigation.
LLM Economist: Large Population Models and Mechanism Design in Multi-Agent Generative Simulacra
Karten, Seth, Li, Wenzhe, Ding, Zihan, Kleiner, Samuel, Bai, Yu, Jin, Chi
We present the LLM Economist, a novel framework that uses agent-based modeling to design and assess economic policies in strategic environments with hierarchical decision-making. At the lower level, bounded rational worker agents -- instantiated as persona-conditioned prompts sampled from U.S. Census-calibrated income and demographic statistics -- choose labor supply to maximize text-based utility functions learned in-context. At the upper level, a planner agent employs in-context reinforcement learning to propose piecewise-linear marginal tax schedules anchored to the current U.S. federal brackets. This construction endows economic simulacra with three capabilities requisite for credible fiscal experimentation: (i) optimization of heterogeneous utilities, (ii) principled generation of large, demographically realistic agent populations, and (iii) mechanism design -- the ultimate nudging problem -- expressed entirely in natural language. Experiments with populations of up to one hundred interacting agents show that the planner converges near Stackelberg equilibria that improve aggregate social welfare relative to Saez solutions, while a periodic, persona-level voting procedure furthers these gains under decentralized governance. These results demonstrate that large language model-based agents can jointly model, simulate, and govern complex economic systems, providing a tractable test bed for policy evaluation at the societal scale to help build better civilizations.
A Framework for Analyzing Abnormal Emergence in Service Ecosystems Through LLM-based Agent Intention Mining
Shen, Yifan, Zhao, Zihan, Xue, Xiao, Guo, Yuwei, Ma, Qun, Zhou, Deyu, Zhang, Ming
With the rise of service computing, cloud computing, and IoT, service ecosystems are becoming increasingly complex. The intricate interactions among intelligent agents make abnormal emergence analysis challenging, as traditional causal methods focus on individual trajectories. Large language models offer new possibilities for Agent-Based Modeling (ABM) through Chain-of-Thought (CoT) reasoning to reveal agent intentions. However, existing approaches remain limited to microscopic and static analysis. This paper introduces a framework: Emergence Analysis based on Multi-Agent Intention (EAMI), which enables dynamic and interpretable emergence analysis. EAMI first employs a dual-perspective thought track mechanism, where an Inspector Agent and an Analysis Agent extract agent intentions under bounded and perfect rationality. Then, k-means clustering identifies phase transition points in group intentions, followed by a Intention Temporal Emergence diagram for dynamic analysis. The experiments validate EAMI in complex online-to-offline (O2O) service system and the Stanford AI Town experiment, with ablation studies confirming its effectiveness, generalizability, and efficiency. This framework provides a novel paradigm for abnormal emergence and causal analysis in service ecosystems. The code is available at https://anonymous.4open.science/r/EAMI-B085.
GasAgent: A Multi-Agent Framework for Automated Gas Optimization in Smart Contracts
Zheng, Jingyi, Peng, Zifan, Liu, Yule, Wang, Junfeng, Liao, Yifan, Dong, Wenhan, He, Xinlei
Smart contracts are trustworthy, immutable, and automatically executed programs on the blockchain. Their execution requires the Gas mechanism to ensure efficiency and fairness. However, due to non-optimal coding practices, many contracts contain Gas waste patterns that need to be optimized. Existing solutions mostly rely on manual discovery, which is inefficient, costly to maintain, and difficult to scale. Recent research uses large language models (LLMs) to explore new Gas waste patterns. However, it struggles to remain compatible with existing patterns, often produces redundant patterns, and requires manual validation/rewriting. To address this gap, we present GasAgent, the first multi-agent system for smart contract Gas optimization that combines compatibility with existing patterns and automated discovery/validation of new patterns, enabling end-to-end optimization. GasAgent consists of four specialized agents, Seeker, Innovator, Executor, and Manager, that collaborate in a closed loop to identify, validate, and apply Gas-saving improvements. Experiments on 100 verified real-world contracts demonstrate that GasAgent successfully optimizes 82 contracts, achieving an average deployment Gas savings of 9.97%. In addition, our evaluation confirms its compatibility with existing tools and validates the effectiveness of each module through ablation studies. To assess broader usability, we further evaluate 500 contracts generated by five representative LLMs across 10 categories and find that GasAgent optimizes 79.8% of them, with deployment Gas savings ranging from 4.79% to 13.93%, showing its usability as the optimization layer for LLM-assisted smart contract development.
Towards physician-centered oversight of conversational diagnostic AI
Vedadi, Elahe, Barrett, David, Harris, Natalie, Wulczyn, Ellery, Reddy, Shashir, Ruparel, Roma, Schaekermann, Mike, Strother, Tim, Tanno, Ryutaro, Sharma, Yash, Lee, Jihyeon, Hughes, Cían, Slack, Dylan, Palepu, Anil, Freyberg, Jan, Saab, Khaled, Liévin, Valentin, Weng, Wei-Hung, Tu, Tao, Liu, Yun, Tomasev, Nenad, Kulkarni, Kavita, Mahdavi, S. Sara, Guu, Kelvin, Barral, Joëlle, Webster, Dale R., Manyika, James, Hassidim, Avinatan, Chou, Katherine, Matias, Yossi, Kohli, Pushmeet, Rodman, Adam, Natarajan, Vivek, Karthikesalingam, Alan, Stutz, David
Recent work has demonstrated the promise of conversational AI systems for diagnostic dialogue. However, real-world assurance of patient safety means that providing individual diagnoses and treatment plans is considered a regulated activity by licensed professionals. Furthermore, physicians commonly oversee other team members in such activities, including nurse practitioners (NPs) or physician assistants/associates (PAs). Inspired by this, we propose a framework for effective, asynchronous oversight of the Articulate Medical Intelligence Explorer (AMIE) AI system. We propose guardrailed-AMIE (g-AMIE), a multi-agent system that performs history taking within guardrails, abstaining from individualized medical advice. Afterwards, g-AMIE conveys assessments to an overseeing primary care physician (PCP) in a clinician cockpit interface. The PCP provides oversight and retains accountability of the clinical decision. This effectively decouples oversight from intake and can thus happen asynchronously. In a randomized, blinded virtual Objective Structured Clinical Examination (OSCE) of text consultations with asynchronous oversight, we compared g-AMIE to NPs/PAs or a group of PCPs under the same guardrails. Across 60 scenarios, g-AMIE outperformed both groups in performing high-quality intake, summarizing cases, and proposing diagnoses and management plans for the overseeing PCP to review. This resulted in higher quality composite decisions. PCP oversight of g-AMIE was also more time-efficient than standalone PCP consultations in prior work. While our study does not replicate existing clinical practices and likely underestimates clinicians' capabilities, our results demonstrate the promise of asynchronous oversight as a feasible paradigm for diagnostic AI systems to operate under expert human oversight for enhancing real-world care.