AITopics | task domain

Collaborating Authors

task domain

Information about AI from the News, Publications, and Conferences

Automatic Classification – Tagging and Summarization – Customizable Filtering and Analysis

If you are looking for an answer to the question What is Artificial Intelligence? and you only have a minute, then here's the definition the Association for the Advancement of Artificial Intelligence offers on its home page: "the scientific understanding of the mechanisms underlying thought and intelligent behavior and their embodiment in machines."

However, if you are fortunate enough to have more than a minute, then please get ready to embark upon an exciting journey exploring AI (but beware, it could last a lifetime) …

Learning Massively Multitask World Models for Continuous Control

Hansen, Nicklas, Su, Hao, Wang, Xiaolong

arXiv.org Artificial IntelligenceDec-3-2025

General-purpose control demands agents that act across many tasks and embodiments, yet research on reinforcement learning (RL) for continuous control remains dominated by single-task or offline regimes, reinforcing a view that online RL does not scale. Inspired by the foundation model recipe (large-scale pretraining followed by light RL) we ask whether a single agent can be trained on hundreds of tasks with online interaction. To accelerate research in this direction, we introduce a new benchmark with 200 diverse tasks spanning many domains and embodiments, each with language instructions, demonstrations, and optionally image observations. We then present \emph{Newt}, a language-conditioned multitask world model that is first pretrained on demonstrations to acquire task-aware representations and action priors, and then jointly optimized with online interaction across all tasks. Experiments show that Newt yields better multitask performance and data-efficiency than a set of strong baselines, exhibits strong open-loop control, and enables rapid adaptation to unseen tasks. We release our environments, demonstrations, code for training and evaluation, as well as 200+ checkpoints.

demonstration, machine learning, reinforcement learning, (15 more...)

arXiv.org Artificial Intelligence

2511.19584

Country:

North America > United States (0.28)
Europe (0.28)

Genre: Research Report > New Finding (0.67)

Industry: Leisure & Entertainment > Games > Computer Games (0.93)

Technology:

Information Technology > Artificial Intelligence > Robots (1.00)
Information Technology > Artificial Intelligence > Representation & Reasoning (1.00)
Information Technology > Artificial Intelligence > Natural Language (1.00)
(3 more...)

Add feedback

Help or Hurdle? Rethinking Model Context Protocol-Augmented Large Language Models

Song, Wei, Zhong, Haonan, Ding, Ziqi, Xue, Jingling, Li, Yuekang

arXiv.org Artificial IntelligenceAug-19-2025

The Model Context Protocol (MCP) enables large language models (LLMs) to access external resources on demand. While commonly assumed to enhance performance, how LLMs actually leverage this capability remains poorly understood. We introduce MCPGAUGE, the first comprehensive evaluation framework for probing LLM-MCP interactions along four key dimensions: proactivity (self-initiated tool use), compliance (adherence to tool-use instructions), effectiveness (task performance post-integration), and overhead (computational cost incurred). MCPGAUGE comprises a 160-prompt suite and 25 datasets spanning knowledge comprehension, general reasoning, and code generation. Our large-scale evaluation, spanning six commercial LLMs, 30 MCP tool suites, and both one- and two-turn interaction settings, comprises around 20,000 API calls and over USD 6,000 in computational cost. This comprehensive study reveals four key findings that challenge prevailing assumptions about the effectiveness of MCP integration. These insights highlight critical limitations in current AI-tool integration and position MCPGAUGE as a principled benchmark for advancing controllable, tool-augmented LLMs.

large language model, machine learning, natural language, (17 more...)

arXiv.org Artificial Intelligence

2508.12566

Genre: Research Report (1.00)

Industry: Information Technology > Security & Privacy (0.68)

Technology:

Information Technology > Artificial Intelligence > Natural Language > Large Language Model (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.51)

Add feedback

LLM+MAP: Bimanual Robot Task Planning using Large Language Models and Planning Domain Definition Language

Chu, Kun, Zhao, Xufeng, Weber, Cornelius, Wermter, Stefan

arXiv.org Artificial IntelligenceMar-21-2025

Bimanual robotic manipulation provides significant versatility, but also presents an inherent challenge due to the complexity involved in the spatial and temporal coordination between two hands. Existing works predominantly focus on attaining human-level manipulation skills for robotic hands, yet little attention has been paid to task planning on long-horizon timescales. With their outstanding in-context learning and zero-shot generation abilities, Large Language Models (LLMs) have been applied and grounded in diverse robotic embodiments to facilitate task planning. However, LLMs still suffer from errors in long-horizon reasoning and from hallucinations in complex robotic tasks, lacking a guarantee of logical correctness when generating the plan. Previous works, such as LLM+P, extended LLMs with symbolic planners. However, none have been successfully applied to bimanual robots. New challenges inevitably arise in bimanual manipulation, necessitating not only effective task decomposition but also efficient task allocation. To address these challenges, this paper introduces LLM+MAP, a bimanual planning framework that integrates LLM reasoning and multi-agent planning, automating effective and efficient bimanual task planning. We conduct simulated experiments on various long-horizon manipulation tasks of differing complexity. Our method is built using GPT-4o as the backend, and we compare its performance against plans generated directly by LLMs, including GPT-4o, V3 and also recent strong reasoning models o1 and R1. By analyzing metrics such as planning time, success rate, group debits, and planning-step reduction rate, we demonstrate the superior performance of LLM+MAP, while also providing insights into robotic reasoning. Code is available at https://github.com/Kchu/LLM-MAP.

large language model, machine learning, natural language, (20 more...)

arXiv.org Artificial Intelligence

2503.17309

Country:

North America > United States (0.04)
Europe > Germany > Hamburg (0.04)
Europe > France (0.04)

Genre: Research Report (1.00)

Technology:

Information Technology > Artificial Intelligence > Robots (1.00)
Information Technology > Artificial Intelligence > Natural Language > Large Language Model (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (1.00)

Add feedback

Review for NeurIPS paper: Meta-Gradient Reinforcement Learning with an Objective Discovered Online

Neural Information Processing SystemsJan-27-2025, 14:31:58 GMT

Strengths: The idea of formulating the inner loss for meta RL as learning from the objective discovered by its own is interesting and novel. Generally, defining the algorithm to self-discover its objective makes the learning algorithm moves one step closer towards developing automated machine intelligence compared to the conventional meta RL methods which greatly rely on expert's design choice such as the hyperparameter to perform learning-to-learn. The authors present extensive experiment results to evaluate the proposed method. The proposed method has been evaluated on three task domains: a catch game to demonstrate the method could effectively learn bootstrapping, a 5-state random walk to demonstrate the method works in non-stationary environments, and ALE which is a large-scale RL testbed. In all the task domains, the proposed method achieves noticeable performance improvement over the compared baselines.

meta-gradient reinforcement learning, neurips paper, objective discovered online, (3 more...)

Neural Information Processing Systems

Technology: Information Technology > Artificial Intelligence > Machine Learning > Reinforcement Learning (0.40)

Add feedback

Vision-Language Navigation with Continual Learning

Li, Zhiyuan, Lv, Yanfeng, Tu, Ziqin, Shang, Di, Qiao, Hong

arXiv.org Artificial IntelligenceSep-4-2024

Vision-language navigation (VLN) is a critical domain within embedded intelligence, requiring agents to navigate 3D environments based on natural language instructions. Traditional VLN research has focused on improving environmental understanding and decision accuracy. However, these approaches often exhibit a significant performance gap when agents are deployed in novel environments, mainly due to the limited diversity of training data. Expanding datasets to cover a broader range of environments is impractical and costly. We propose the Vision-Language Navigation with Continual Learning (VLNCL) paradigm to address this challenge. In this paradigm, agents incrementally learn new environments while retaining previously acquired knowledge. VLNCL enables agents to maintain an environmental memory and extract relevant knowledge, allowing rapid adaptation to new environments while preserving existing information. We introduce a novel dual-loop scenario replay method (Dual-SR) inspired by brain memory replay mechanisms integrated with VLN agents. This method facilitates consolidating past experiences and enhances generalization across new tasks. By utilizing a multi-scenario memory buffer, the agent efficiently organizes and replays task memories, thereby bolstering its ability to adapt quickly to new environments and mitigating catastrophic forgetting. Our work pioneers continual learning in VLN agents, introducing a novel experimental setup and evaluation metrics. We demonstrate the effectiveness of our approach through extensive evaluations and establish a benchmark for the VLNCL paradigm. Comparative experiments with existing continual learning and VLN methods show significant improvements, achieving state-of-the-art performance in continual learning ability and highlighting the potential of our approach in enabling rapid adaptation while preserving prior knowledge.

agent, continual learning, task domain, (12 more...)

arXiv.org Artificial Intelligence

2409.02561

Country:

Asia > China > Beijing > Beijing (0.04)
North America > United States > New York (0.04)
North America > United States > Indiana > Madison County > Anderson (0.04)

Genre: Research Report (1.00)

Industry: Health & Medicine (0.68)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Neural Networks (0.93)
Information Technology > Artificial Intelligence > Natural Language (0.89)

Add feedback

PIPE: Process Informed Parameter Estimation, a learning based approach to task generalized system identification

Schempp, Constantin, Friedrich, Christian

arXiv.org Artificial IntelligenceJun-25-2024

We address the problem of robot guided assembly tasks, by using a learning-based approach to identify contact model parameters for known and novel parts. First, a Variational Autoencoder (VAE) is used to extract geometric features of assembly parts. Then, we combine the extracted features with physical knowledge to derive the parameters of a contact model using our newly proposed neural network structure. The measured force from real experiments is used to supervise the predicted forces, thus avoiding the need for ground truth model parameters. Although trained only on a small set of assembly parts, good contact model estimation for unknown objects were achieved. Our main contribution is the network structure that allows us to estimate contact models of assembly tasks depending on the geometry of the part to be joined. Where current system identification processes have to record new data for a new assembly process, our method only requires the 3D model of the assembly part. We evaluate our method by estimating contact models for robot-guided assembly tasks of pin connectors as well as electronic plugs and compare the results with real experiments.

assembly task, geometry, process information, (13 more...)

arXiv.org Artificial Intelligence

2405.06991

Country:

Europe > Austria > Vienna (0.14)
North America > United States > Florida > Miami-Dade County > Miami Beach (0.04)
North America > Mexico > Quintana Roo > Cancún (0.04)
(4 more...)

Genre: Research Report (0.50)

Technology:

Information Technology > Artificial Intelligence > Robots (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.46)

Add feedback

Online DPO: Online Direct Preference Optimization with Fast-Slow Chasing

Qi, Biqing, Li, Pengfei, Li, Fangyuan, Gao, Junqi, Zhang, Kaiyan, Zhou, Bowen

arXiv.org Artificial IntelligenceJun-8-2024

Direct Preference Optimization (DPO) improves the alignment of large language models (LLMs) with human values by training directly on human preference datasets, eliminating the need for reward models. However, due to the presence of cross-domain human preferences, direct continual training can lead to catastrophic forgetting, limiting DPO's performance and efficiency. Inspired by intraspecific competition driving species evolution, we propose a Online Fast-Slow chasing DPO (OFS-DPO) for preference alignment, simulating competition through fast and slow chasing among models to facilitate rapid adaptation. Specifically, we first derive the regret upper bound for online learning, validating our motivation with a min-max optimization pattern. Based on this, we introduce two identical modules using Low-rank Adaptive (LoRA) with different optimization speeds to simulate intraspecific competition, and propose a new regularization term to guide their learning. To further mitigate catastrophic forgetting in cross-domain scenarios, we extend the OFS-DPO with LoRA modules combination strategy, resulting in the Cross domain Online Fast-Slow chasing DPO (COFS-DPO). This method leverages linear combinations of fast modules parameters from different task domains, fully utilizing historical information to achive continual value alignment. Experimental results show that OFS-DPO outperforms DPO in in-domain alignment, while COFS-DPO excels in cross-domain continual learning scenarios.

learning, module, ofs-dpo, (14 more...)

arXiv.org Artificial Intelligence

2406.05534

Country: Asia > China > Heilongjiang Province > Harbin (0.04)

Genre: Research Report > New Finding (0.34)

Industry: Education > Educational Setting > Online (0.34)

Technology:

Information Technology > Artificial Intelligence > Natural Language > Large Language Model (1.00)
Information Technology > Artificial Intelligence > Natural Language > Chatbot (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (1.00)

Add feedback

Overcoming Catastrophic Forgetting by Exemplar Selection in Task-oriented Dialogue System

Chen, Chen, Li, Ruizhe, Hu, Yuchen, Chen, Yuanyuan, Qin, Chengwei, Zhang, Qiang

arXiv.org Artificial IntelligenceMay-16-2024

Intelligent task-oriented dialogue systems (ToDs) are expected to continuously acquire new knowledge, also known as Continual Learning (CL), which is crucial to fit ever-changing user needs. However, catastrophic forgetting dramatically degrades the model performance in face of a long streamed curriculum. In this paper, we aim to overcome the forgetting problem in ToDs and propose a method (HESIT) with hyper-gradient-based exemplar strategy, which samples influential exemplars for periodic retraining. Instead of unilaterally observing data or models, HESIT adopts a profound exemplar selection strategy that considers the general performance of the trained model when selecting exemplars for each task domain. Specifically, HESIT analyzes the training data influence by tracing their hyper-gradient in the optimization process. Furthermore, HESIT avoids estimating Hessian to make it compatible for ToDs with a large pre-trained model. Experimental results show that HESIT effectively alleviates catastrophic forgetting by exemplar selection, and achieves state-of-the-art performance on the largest CL benchmark of ToDs in terms of all metrics.

exemplar, hesit, preprint arxiv, (15 more...)

arXiv.org Artificial Intelligence

2405.10992

Country:

Asia > Singapore (0.04)
Europe > Ireland > Leinster > County Dublin > Dublin (0.04)
Asia > Middle East > Jordan (0.04)
Asia > China (0.04)

Genre: Research Report > New Finding (0.48)

Industry:

Education (0.68)
Information Technology > Security & Privacy (0.46)
Transportation > Ground (0.46)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.93)
Information Technology > Artificial Intelligence > Natural Language > Discourse & Dialogue (0.68)

Add feedback

Filters

Collaborating Authors

task domain

Information about AI from the News, Publications, and Conferences

Automatic Classification – Tagging and Summarization – Customizable Filtering and Analysis

ef0d17b3bdb4ee2aa741ba28c7255c53-Paper.pdf

Learning Massively Multitask World Models for Continuous Control

Help or Hurdle? Rethinking Model Context Protocol-Augmented Large Language Models

ef0d17b3bdb4ee2aa741ba28c7255c53-Paper.pdf

LLM+MAP: Bimanual Robot Task Planning using Large Language Models and Planning Domain Definition Language

Review for NeurIPS paper: Meta-Gradient Reinforcement Learning with an Objective Discovered Online

Vision-Language Navigation with Continual Learning

PIPE: Process Informed Parameter Estimation, a learning based approach to task generalized system identification

Online DPO: Online Direct Preference Optimization with Fast-Slow Chasing

Overcoming Catastrophic Forgetting by Exemplar Selection in Task-oriented Dialogue System