agent framework
Large Language Models as Urban Residents: An LLM Agent Framework for Personal Mobility Generation
This paper introduces a novel approach using Large Language Models (LLMs) integrated into an agent framework for flexible and effective personal mobility generation. LLMs overcome the limitations of previous models by effectively processing semantic data and offering versatility in modeling various tasks.
Nex-N1: Agentic Models Trained via a Unified Ecosystem for Large-Scale Environment Construction
AGI Team, null, Cai, Yuxuan, Chen, Lu, Chen, Qiaoling, Ding, Yuyang, Fan, Liwen, Fu, Wenjie, Gao, Yufei, Guo, Honglin, Guo, Pinxue, Han, Zhenhua, He, Zhengfu, Hu, Hanglei, Hu, Kai, Hua, Shengjia, Huai, Tianyu, Huang, Baodai, Ji, Li, Jiang, Zhen, Lei, Zhikai, Li, Bufan, Lin, Jiahang, Lin, Lizhi, Liu, Jinxiu, Liu, Shichun, Liu, Ziming, Ni, Yuchen, Qian, Pengfang, Shen, Yujiong, Shi, Qingyun, Shu, Wentao, Sun, Peng, Suo, Yiran, Tang, Tian, Tian, Boyu, Wang, Guoteng, Wang, Junzhe, Wang, Peixin, Xi, Zhiheng, Yan, Hang, Yang, Jie, Yang, Zhixiong, Yao, Tianchu, Ye, Guangze, Yu, Qianxi, Zhang, Shuo, Zhang, Xinyue, Zhang, Yiqi, Zhao, Jiarong, Zheng, Miao, Zheng, Rui, Zhou, Enyu, Zhou, Jiazheng, Zhou, Maosen, Zhou, Yuhao, Gui, Tao, Zheng, Yining, Chen, Xinchi, Zhou, Jie, Feng, Siyuan, Chen, Qin, He, Liang, Zhang, Qi, Huang, Xuanjing, Qiu, Xipeng
The evolution of Large Language Models (LLMs) from passive responders to autonomous agents necessitates a fundamental shift in learning paradigms -- from static imitation to incentive-driven decision making. However, this transition is significantly impeded by the lack of scalable infrastructure capable of constructing high-quality interaction signals for effective policy learning. To address this, we introduce a comprehensive method designed to systematically scale the diversity and complexity of interactive environments. Our method realizes this scaling by addressing three orthogonal dimensions: (1) Complexity: NexAU, a flexible agent framework that supports building complex agent hierarchies via simple configurations; (2) Diversity: NexA4A automatically generates diverse agent hierarchies from natural language to cover infinite domains; and (3) Fidelity: NexGAP bridges the simulation-reality gap by integrating dynamic real-world environment for grounded trajectories synthesis. We train Nex-N1 upon the diverse and complex interactive environments established by our infrastructure. Empirical results on benchmarks such as SWE-bench and tau2 demonstrate that Nex-N1 consistently outperforms SOTA open-source models and achieves competitive performance against frontier proprietary models on complex agentic tasks. We open-source the Nex ecosystem and model weights to facilitate further research.
An Empirical Study of Agent Developer Practices in AI Agent Frameworks
Wang, Yanlin, Xu, Xinyi, Chen, Jiachi, Bi, Tingting, Gu, Wenchao, Zheng, Zibin
The rise of large language models (LLMs) has sparked a surge of interest in agents, leading to the rapid growth of agent frameworks. Agent frameworks are software toolkits and libraries that provide standardized components, abstractions, and orchestration mechanisms to simplify agent development. Despite widespread use of agent frameworks, their practical applications and how they influence the agent development process remain underexplored. Different agent frameworks encounter similar problems during use, indicating that these recurring issues deserve greater attention and call for further improvements in agent framework design. Meanwhile, as the number of agent frameworks continues to grow and evolve, more than 80% of developers report difficulties in identifying the frameworks that best meet their specific development requirements. In this paper, we conduct the first empirical study of LLM-based agent frameworks, exploring real-world experiences of developers in building AI agents. To compare how well the agent frameworks meet developer needs, we further collect developer discussions for the ten previously identified agent frameworks, resulting in a total of 11,910 discussions. Finally, by analyzing these discussions, we compare the frameworks across five dimensions: development efficiency, functional abstraction, learning cost, performance optimization, and maintainability, which refers to how easily developers can update and extend both the framework itself and the agents built upon it over time. Our comparative analysis reveals significant differences among frameworks in how they meet the needs of agent developers. Overall, we provide a set of findings and implications for the LLM-driven AI agent framework ecosystem and offer insights for the design of future LLM-based agent frameworks and agent developers.
- Workflow (1.00)
- Research Report > New Finding (1.00)
- Overview (1.00)
- Research Report > Experimental Study (0.67)
- Information Technology > Security & Privacy (1.00)
- Health & Medicine (1.00)
- Banking & Finance (0.67)
LiteCUA: Computer as MCP Server for Computer-Use Agent on AIOS
Mei, Kai, Zhu, Xi, Gao, Hang, Lin, Shuhang, Zhang, Yongfeng
We present AIOS 1.0, a novel platform designed to advance computer-use agent (CUA) capabilities through environmental contextualization. While existing approaches primarily focus on building more powerful agent frameworks or enhancing agent models, we identify a fundamental limitation: the semantic disconnect between how language models understand the world and how computer interfaces are structured. AIOS 1.0 addresses this challenge by transforming computers into contextual environments that language models can natively comprehend, implementing a Model Context Protocol (MCP) server architecture to abstract computer states and actions. This approach effectively decouples interface complexity from decision complexity, enabling agents to reason more effectively about computing environments. To demonstrate our platform's effectiveness, we introduce LiteCUA, a lightweight computer-use agent built on AIOS 1.0 that achieves a 14.66% success rate on the OSWorld benchmark, outperforming several specialized agent frameworks despite its simple architecture. Our results suggest that contextualizing computer environments for language models represents a promising direction for developing more capable computer-use agents and advancing toward AI that can interact with digital systems.
- Information Technology > Artificial Intelligence > Representation & Reasoning > Agents (1.00)
- Information Technology > Artificial Intelligence > Natural Language > Large Language Model (1.00)
- Information Technology > Artificial Intelligence > Natural Language > Chatbot (0.69)
- Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.69)
RoboMemory: A Brain-inspired Multi-memory Agentic Framework for Interactive Environmental Learning in Physical Embodied Systems
Lei, Mingcong, Cai, Honghao, Cui, Zezhou, Tan, Liangchen, Hong, Junkun, Hu, Gehan, Zhu, Shuangyu, Wu, Yimou, Jiang, Shaohan, Wang, Ge, Yang, Yuyuan, Tan, Junyuan, Wan, Zhenglin, Li, Zhen, Cui, Shuguang, Zhao, Yiming, Han, Yatong
Embodied agents face persistent challenges in real-world environments, including partial observability, limited spatial reasoning, and high-latency multi-memory integration. We present RoboMemory, a brain-inspired framework that unifies Spatial, Temporal, Episodic, and Semantic memory under a parallelized architecture for efficient long-horizon planning and interactive environmental learning. A dynamic spatial knowledge graph (KG) ensures scalable and consistent memory updates, while a closed-loop planner with a critic module supports adaptive decision-making in dynamic settings. Experiments on EmbodiedBench show that RoboMemory, built on Qwen2.5-VL-72B-Ins, improves average success rates by 25% over its baseline and exceeds the closed-source state-of-the-art (SOTA) Gemini-1.5-Pro by 3%. Real-world trials further confirm its capacity for cumulative learning, with performance improving across repeated tasks. These results highlight RoboMemory as a scalable foundation for memory-augmented embodied intelligence, bridging the gap between cognitive neuroscience and robotic autonomy.
- Asia > China > Guangdong Province > Shenzhen (0.04)
- Asia > China > Hong Kong (0.04)
- Asia > China > Heilongjiang Province > Harbin (0.04)
- (2 more...)
Presenting a Paper is an Art: Self-Improvement Aesthetic Agents for Academic Presentations
Liu, Chengzhi, Yang, Yuzhe, Zhou, Kaiwen, Zhang, Zhen, Fan, Yue, Xie, Yanan, Qi, Peng, Wang, Xin Eric
The promotion of academic papers has become an important means of enhancing research visibility. However, existing automated methods struggle limited storytelling, insufficient aesthetic quality, and constrained self-adjustment, making it difficult to achieve efficient and engaging dissemination. At the heart of those challenges is a simple principle: \emph{there is no way to improve it when you cannot evaluate it right}. To address this, we introduce \textbf{EvoPresent}, a self-improvement agent framework that unifies coherent narratives, aesthetic-aware designs, and realistic presentation delivery via virtual characters. Central to EvoPresent is \textbf{PresAesth}, a multi-task reinforcement learning (RL) aesthetic model that provides reliable aesthetic scoring, defect adjustment, and comparative feedback, enabling iterative self-improvement even under limited aesthetic training data. To systematically evaluate the methods, we introduce \textbf{EvoPresent Benchmark}, a comprehensive benchmark comprising: \textit{Presentation Generation Quality}, built on 650 top-tier AI conference papers with multimodal resources (slides, videos and scripts) to assess both content and design; and \textit{Aesthetic Awareness}, consisting of 2,000 slide pairs with varying aesthetic levels, supporting joint training and evaluation on scoring, defect adjustment, and comparison. Our findings highlight that (i) High-quality feedback is essential for agent self-improvement, while initial capability alone does not guarantee effective self-correction. (ii) Automated generation pipelines exhibit a trade-off between visual design and content construction. (iii) Multi-task RL training shows stronger generalization in aesthetic awareness tasks.
- North America > United States > California > Santa Cruz County > Santa Cruz (0.04)
- North America > United States > California > Santa Barbara County > Santa Barbara (0.04)
- Information Technology > Artificial Intelligence > Natural Language > Large Language Model (1.00)
- Information Technology > Artificial Intelligence > Representation & Reasoning (0.93)
- Information Technology > Artificial Intelligence > Vision (0.93)
- Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.49)
Exploring Autonomous Agents: A Closer Look at Why They Fail When Completing Tasks
Lu, Ruofan, Li, Yichen, Huo, Yintong
Autonomous agent systems powered by Large Language Models (LLMs) have demonstrated promising capabilities in automating complex tasks. However, current evaluations largely rely on success rates without systematically analyzing the interactions, communication mechanisms, and failure causes within these systems. To bridge this gap, we present a benchmark of 34 representative programmable tasks designed to rigorously assess autonomous agents. Using this benchmark, we evaluate three popular open-source agent frameworks combined with two LLM backbones, observing a task completion rate of approximately 50%. Through in-depth failure analysis, we develop a three-tier taxonomy of failure causes aligned with task phases, highlighting planning errors, task execution issues, and incorrect response generation. Based on these insights, we propose actionable improvements to enhance agent planning and self-diagnosis capabilities. Our failure taxonomy, together with mitigation advice, provides an empirical foundation for developing more robust and effective autonomous agent systems in the future.
Securing Agentic AI: Threat Modeling and Risk Analysis for Network Monitoring Agentic AI System
Zambare, Pallavi, Thanikella, Venkata Nikhil, Liu, Ying
When combining Large Language Models (LLMs) with autonomous agents, used in network monitoring and decision-making systems, this will create serious security issues. In this research, the MAESTRO framework consisting of the seven layers threat modeling architecture in the system was used to expose, evaluate, and eliminate vulnerabilities of agentic AI. The prototype agent system was constructed and implemented, using Python, LangChain, and telemetry in WebSockets, and deployed with inference, memory, parameter tuning, and anomaly detection modules. Two practical threat cases were confirmed as follows: (i) resource denial of service by traffic replay denial-of-service, and (ii) memory poisoning by tampering with the historical log file maintained by the agent. These situations resulted in measurable levels of performance degradation, i.e. telemetry updates were delayed, and computational loads were increased, as a result of poor system adaptations. It was suggested to use a multilayered defense-in-depth approach with memory isolation, validation of planners and anomaly response systems in real-time. These findings verify that MAESTRO is viable in operational threat mapping, prospective risk scoring, and the basis of the resilient system design. The authors bring attention to the importance of the enforcement of memory integrity, paying attention to the adaptation logic monitoring, and cross-layer communication protection that guarantee the agentic AI reliability in adversarial settings.
- North America > United States > North Carolina (0.04)
- Europe > Switzerland (0.04)
- Asia > Singapore (0.04)
- (2 more...)
Cognitive Kernel-Pro: A Framework for Deep Research Agents and Agent Foundation Models Training
Fang, Tianqing, Zhang, Zhisong, Wang, Xiaoyang, Wang, Rui, Qin, Can, Wan, Yuxuan, Ma, Jun-Yu, Zhang, Ce, Chen, Jiaqi, Li, Xiyun, Zhang, Hongming, Mi, Haitao, Yu, Dong
General AI Agents are increasingly recognized as foundational frameworks for the next generation of artificial intelligence, enabling complex reasoning, web interaction, coding, and autonomous research capabilities. However, current agent systems are either closed-source or heavily reliant on a variety of paid APIs and proprietary tools, limiting accessibility and reproducibility for the research community. In this work, we present \textbf{Cognitive Kernel-Pro}, a fully open-source and (to the maximum extent) free multi-module agent framework designed to democratize the development and evaluation of advanced AI agents. Within Cognitive Kernel-Pro, we systematically investigate the curation of high-quality training data for Agent Foundation Models, focusing on the construction of queries, trajectories, and verifiable answers across four key domains: web, file, code, and general reasoning. Furthermore, we explore novel strategies for agent test-time reflection and voting to enhance agent robustness and performance. We evaluate Cognitive Kernel-Pro on GAIA, achieving state-of-the-art results among open-source and free agents. Notably, our 8B-parameter open-source model surpasses previous leading systems such as WebDancer and WebSailor, establishing a new performance standard for accessible, high-capability AI agents. Code is available at https://github.com/Tencent/CognitiveKernel-Pro
- North America > United States (0.14)
- Europe > Austria > Vienna (0.14)
- Asia > Thailand > Bangkok > Bangkok (0.04)
- (3 more...)
- Research Report (1.00)
- Workflow (0.68)
Efficient Agents: Building Effective Agents While Reducing Cost
Wang, Ningning, Hu, Xavier, Liu, Pai, Zhu, He, Hou, Yue, Huang, Heyuan, Zhang, Shengyu, Yang, Jian, Liu, Jiaheng, Zhang, Ge, Zhang, Changwang, Wang, Jun, Jiang, Yuchen Eleanor, Zhou, Wangchunshu
The remarkable capabilities of Large Language Model (LLM)-driven agents have enabled sophisticated systems to tackle complex, multi-step tasks, but their escalating costs threaten scalability and accessibility. This work presents the first systematic study of the efficiency-effectiveness trade-off in modern agent systems, addressing the critical need for cost-effective designs without sacrificing performance. We investigate three key questions: (1) How much complexity do agentic tasks inherently require? (2) When do additional modules yield diminishing returns? (3) How much efficiency can be gained through the design of efficient agent frameworks? Through an empirical analysis on the GAIA benchmark, we evaluate the impact of LLM backbone selection, agent framework designs, and test-time scaling strategies. Using the cost-of-pass metric, we quantify the efficiency-performance trade-off across these dimensions. Our findings inform the development of Efficient Agents , a novel agent framework that has an optimal complexity to task requirements. Efficient Agents retains 96.7% of the performance of OWL, one leading open-source agent framework, while reducing operational costs from $0.398 to $0.228, resulting in a 28.4% improvement in cost-of-pass. Our work provides actionable insights for designing efficient, high-performing agent systems, advancing the accessibility and sustainability of AI-driven solutions.
- Asia > Middle East > Jordan (0.04)
- North America > Canada > Ontario > Toronto (0.04)
- South America > Colombia > Meta Department > Villavicencio (0.04)
- (3 more...)