AITopics | Hoang, Thai

Hoang, Thai

ActionStudio: A Lightweight Framework for Data and Training of Large Action Models

Zhang, Jianguo, Hoang, Thai, Zhu, Ming, Liu, Zuxin, Wang, Shiyu, Awalgaonkar, Tulika, Prabhakar, Akshara, Chen, Haolin, Yao, Weiran, Liu, Zhiwei, Tan, Juntao, Niebles, Juan Carlos, Heinecke, Shelby, Wang, Huan, Savarese, Silvio, Xiong, Caiming

arXiv.org Artificial Intelligence, Mar-31-2025

Action models are essential for enabling autonomous agents to perform complex tasks. However, training large action models remains challenging due to the diversity of agent environments and the complexity of agentic data. Despite growing interest, existing infrastructure provides limited support for scalable, agent-specific fine-tuning. We present ActionStudio, a lightweight and extensible data and training framework designed for large action models. ActionStudio unifies heterogeneous agent trajectories through a standardized format, supports diverse training paradigms including LoRA, full fine-tuning, and distributed setups, and integrates robust preprocessing and verification tools. We validate its effectiveness across both public and realistic industry benchmarks, demonstrating strong performance and practical scalability. We open-sourced code and data at https://github.com/SalesforceAIResearch/xLAM to facilitate research in the community.

arxiv preprint arxiv, large language model, machine learning, (17 more...)

arXiv.org Artificial Intelligence

2503.22673

Genre: Research Report (0.64)

Industry: Information Technology (0.47)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning > Agents (1.00)
Information Technology > Artificial Intelligence > Natural Language > Large Language Model (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (1.00)

SpecTool: A Benchmark for Characterizing Errors in Tool-Use LLMs

Kokane, Shirley, Zhu, Ming, Awalgaonkar, Tulika, Zhang, Jianguo, Hoang, Thai, Prabhakar, Akshara, Liu, Zuxin, Lan, Tian, Yang, Liangwei, Tan, Juntao, Murthy, Rithesh, Yao, Weiran, Liu, Zhiwei, Niebles, Juan Carlos, Wang, Huan, Heinecke, Shelby, Xiong, Caiming, Savarese, Silvio

arXiv.org Artificial Intelligence, Nov-20-2024

Evaluating the output of Large Language Models (LLMs) is one of the most critical aspects of building a performant compound AI system. Since the output from LLMs propagates to downstream steps, identifying LLM errors is crucial to system performance. A common task for LLMs in AI systems is tool use. While there are several benchmark environments for evaluating LLMs on this task, they typically give only a success rate without any explanation of the failure cases. To solve this problem, we introduce SpecTool, a new benchmark to identify error patterns in LLM output on tool-use tasks. Our benchmark data set comprises queries from diverse environments that can be used to test for the presence of seven newly characterized error patterns. Using SpecTool, we show that even the most prominent LLMs exhibit these error patterns in their outputs. Researchers can use the analysis and insights from SpecTool to guide their error mitigation strategies.

large language model, machine learning, natural language, (19 more...)

arXiv.org Artificial Intelligence

2411.13547

Country: North America > United States (0.28)

Genre: Research Report > New Finding (0.93)

Industry:

Media > Film (0.68)
Leisure & Entertainment > Sports (0.46)

Technology:

Information Technology > Artificial Intelligence > Natural Language > Large Language Model (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.94)

PRAct: Optimizing Principled Reasoning and Acting of LLM Agent

Liu, Zhiwei, Yao, Weiran, Zhang, Jianguo, Murthy, Rithesh, Yang, Liangwei, Liu, Zuxin, Lan, Tian, Zhu, Ming, Tan, Juntao, Kokane, Shirley, Hoang, Thai, Niebles, Juan Carlos, Heinecke, Shelby, Wang, Huan, Savarese, Silvio, Xiong, Caiming

arXiv.org Artificial Intelligence, Oct-24-2024

We introduce the Principled Reasoning and Acting (PRAct) framework, a novel method for learning and enforcing action principles from trajectory data. Central to our approach is the use of text gradients from a reflection and optimization engine to derive these action principles. To adapt action principles to specific task requirements, we propose a new optimization framework, Reflective Principle Optimization (RPO). After execution, RPO employs a reflector to critique current action principles and an optimizer to update them accordingly. We develop the RPO framework under two scenarios: Reward-RPO, which uses environmental rewards for reflection, and Self-RPO, which conducts self-reflection without external rewards. Additionally, two RPO methods, RPO-Traj and RPO-Batch, are introduced to adapt to different settings. Experimental results across four environments demonstrate that the PRAct agent, leveraging the RPO framework, effectively learns and applies action principles to enhance performance.
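
The execute, reflect, optimize cycle described in this abstract can be illustrated with a toy loop. This is only a sketch under stated assumptions, not the paper's implementation: the function names (`toy_execute`, `toy_reflect`, `toy_optimize`, `rpo_loop`), the example principle strings, and the 0/1 reward are all hypothetical, and in a real system the reflector and optimizer would presumably be LLM calls rather than string rules. The reflector here is driven by the environment reward, loosely in the spirit of Reward-RPO.

```python
from typing import Callable, List, Tuple

Principles = List[str]

# Hypothetical principle used by the toy environment below.
TARGET = "verify tool arguments before calling"


def toy_execute(principles: Principles) -> Tuple[str, float]:
    """Stand-in for running the agent: succeeds only if TARGET is a principle."""
    ok = TARGET in principles
    trajectory = "called tool with checked args" if ok else "called tool with bad args"
    return trajectory, 1.0 if ok else 0.0


def toy_reflect(principles: Principles, trajectory: str, reward: float) -> str:
    """Stand-in reflector: critiques the principles when the reward is low."""
    return TARGET if reward < 1.0 else ""


def toy_optimize(principles: Principles, critique: str) -> Principles:
    """Stand-in optimizer: folds a non-empty critique into the principle list."""
    if critique and critique not in principles:
        return principles + [critique]
    return principles


def rpo_loop(
    principles: Principles,
    execute: Callable[[Principles], Tuple[str, float]],
    reflect: Callable[[Principles, str, float], str],
    optimize: Callable[[Principles, str], Principles],
    steps: int,
) -> Tuple[Principles, List[float]]:
    """Run execute -> reflect -> optimize for a fixed number of steps."""
    rewards: List[float] = []
    for _ in range(steps):
        trajectory, reward = execute(principles)
        rewards.append(reward)
        critique = reflect(principles, trajectory, reward)
        principles = optimize(principles, critique)
    return principles, rewards


final_principles, rewards = rpo_loop(
    ["be concise"], toy_execute, toy_reflect, toy_optimize, steps=3
)
print(final_principles)
print(rewards)
```

In this toy run the first episode fails, the critique adds the missing principle, and later episodes succeed. A self-reflection variant would replace the reward test in `toy_reflect` with a judgment made from the trajectory alone.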

large language model, machine learning, natural language, (14 more...)

arXiv.org Artificial Intelligence

2410.18528

Genre: Research Report (0.70)

Technology:

Information Technology > Artificial Intelligence > Machine Learning (0.97)
Information Technology > Artificial Intelligence > Natural Language > Large Language Model (0.93)
Information Technology > Artificial Intelligence > Representation & Reasoning > Optimization (0.87)

APIGen: Automated Pipeline for Generating Verifiable and Diverse Function-Calling Datasets

Liu, Zuxin, Hoang, Thai, Zhang, Jianguo, Zhu, Ming, Lan, Tian, Kokane, Shirley, Tan, Juntao, Yao, Weiran, Liu, Zhiwei, Feng, Yihao, Murthy, Rithesh, Yang, Liangwei, Savarese, Silvio, Niebles, Juan Carlos, Wang, Huan, Heinecke, Shelby, Xiong, Caiming

arXiv.org Artificial Intelligence, Jun-26-2024

The advancement of function-calling agent models requires diverse, reliable, and high-quality datasets. This paper presents APIGen, an automated data generation pipeline designed to synthesize verifiable, high-quality datasets for function-calling applications. We leverage APIGen and collect 3,673 executable APIs across 21 different categories to generate diverse function-calling datasets in a scalable and structured manner. Each data point in our dataset is verified through three hierarchical stages: format checking, actual function executions, and semantic verification, ensuring its reliability and correctness. We demonstrate that models trained with our curated datasets, even with only 7B parameters, can achieve state-of-the-art performance on the Berkeley Function-Calling Benchmark, outperforming multiple GPT-4 models. Moreover, our 1B model achieves exceptional performance, surpassing GPT-3.5-Turbo and Claude-3 Haiku. We release a dataset containing 60,000 high-quality entries, aiming to advance the field of function-calling agents.
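
The three hierarchical stages named in this abstract (format checking, function execution, semantic verification) might be sketched as a filter that rejects a sample at the first failing stage. This is an illustrative sketch only, not APIGen's code: the `get_weather` API, the `API_REGISTRY` table, the JSON call shape, and the keyword-based semantic check are all assumptions made for the example, and the actual pipeline's checkers are not described on this page.

```python
import json
from typing import Any, Callable, Dict, Optional


def get_weather(city: str) -> dict:
    """Hypothetical executable API used only for this example."""
    return {"city": city, "temp_c": 21}


# Hypothetical registry mapping function names to callables.
API_REGISTRY: Dict[str, Callable[..., Any]] = {"get_weather": get_weather}


def check_format(raw: str) -> Optional[dict]:
    """Stage 1: require valid JSON with a known function name and dict arguments."""
    try:
        call = json.loads(raw)
    except json.JSONDecodeError:
        return None
    if not isinstance(call, dict):
        return None
    name, args = call.get("name"), call.get("arguments")
    if not isinstance(name, str) or name not in API_REGISTRY:
        return None
    if not isinstance(args, dict):
        return None
    return call


def check_execution(call: dict) -> Optional[Any]:
    """Stage 2: actually run the call; any exception rejects the sample."""
    try:
        return API_REGISTRY[call["name"]](**call["arguments"])
    except Exception:
        return None


def check_semantics(query: str, result: Any) -> bool:
    """Stage 3: placeholder check that the result relates to the query."""
    if not isinstance(result, dict):
        return False
    city = str(result.get("city", ""))
    return bool(city) and city.lower() in query.lower()


def verify(query: str, raw: str) -> str:
    """Apply the stages in order and report where a sample is rejected."""
    call = check_format(raw)
    if call is None:
        return "rejected:format"
    result = check_execution(call)
    if result is None:
        return "rejected:execution"
    if not check_semantics(query, result):
        return "rejected:semantic"
    return "accepted"


print(verify("What is the weather in Paris?",
             '{"name": "get_weather", "arguments": {"city": "Paris"}}'))
```

Ordering the stages from cheapest to most expensive means malformed samples never reach execution, and only samples that execute cleanly reach the semantic check.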

large language model, machine learning, natural language, (19 more...)

arXiv.org Artificial Intelligence

2406.18518

Country: North America > United States (0.14)

Genre: Research Report (1.00)

Industry: Information Technology (0.68)

Technology:

Information Technology > Artificial Intelligence > Natural Language > Large Language Model (1.00)
Information Technology > Artificial Intelligence > Natural Language > Chatbot (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (1.00)
Information Technology > Artificial Intelligence > Representation & Reasoning > Agents (0.93)

AgentOhana: Design Unified Data and Training Pipeline for Effective Agent Learning

Zhang, Jianguo, Lan, Tian, Murthy, Rithesh, Liu, Zhiwei, Yao, Weiran, Tan, Juntao, Hoang, Thai, Yang, Liangwei, Feng, Yihao, Liu, Zuxin, Awalgaonkar, Tulika, Niebles, Juan Carlos, Savarese, Silvio, Heinecke, Shelby, Wang, Huan, Xiong, Caiming

arXiv.org Artificial Intelligence, Mar-20-2024

Autonomous agents powered by large language models (LLMs) have garnered significant research attention. However, fully harnessing the potential of LLMs for agent-based tasks presents inherent challenges due to the heterogeneous nature of diverse data sources featuring multi-turn trajectories. In this paper, we introduce AgentOhana as a comprehensive solution to address these challenges. Leveraging the data unification, our training pipeline maintains equilibrium across different data sources and preserves independent randomness across devices during dataset partitioning and model training. Additionally, we present xLAM-v0.1, a large action model tailored for AI agents, which demonstrates exceptional performance across various benchmarks. Large language models (LLMs) have shown strong abilities in code generation, mathematical reasoning, conversational AI, and AI agents (OpenAI, 2023; Jiang et al., 2023; Zhang et al., 2023; Liu et al., 2023a; Nijkamp et al., 2023). Among these, LLM-powered autonomous agents are gaining increasing attention.

large language model, machine learning, trajectory, (19 more...)

arXiv.org Artificial Intelligence

2402.15506

Country: North America > United States (0.28)

Genre:

Research Report (0.82)
Workflow (0.71)

Industry: Information Technology (0.68)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning > Agents (1.00)
Information Technology > Artificial Intelligence > Natural Language > Large Language Model (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.69)
