AITopics

Country: Asia (0.28)

Genre:

Research Report > New Finding (0.87)
Research Report > Experimental Study (0.67)

Industry: Health & Medicine (0.68)

Technology:

Information Technology > Artificial Intelligence > Vision (1.00)
Information Technology > Artificial Intelligence > Representation & Reasoning (1.00)
Information Technology > Artificial Intelligence > Natural Language > Large Language Model (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.48)

Neural Information Processing Systems - Jun-16-2026, 19:43:55 GMT

Group-in-Group Policy Optimization for LLM Agent Training

large language model, machine learning, natural language, (17 more...)

Country: North America > United States > California (0.28)

Genre:

Workflow (1.00)
Research Report > Experimental Study (1.00)

Industry:

Information Technology (0.92)
Media (0.69)
Education (0.67)
Leisure & Entertainment > Games (0.46)

Technology:

Information Technology > Artificial Intelligence > Natural Language > Large Language Model (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (1.00)

Neural Information Processing Systems - Feb-18-2026, 00:02:53 GMT

Appendix for Image Textualization: An Automatic Framework for Creating Accurate and Detailed Image Descriptions

In Table 3, we show the prompt for textualized recaptioning.

artificial intelligence, natural language, original description, (14 more...)

Technology: Information Technology > Artificial Intelligence > Natural Language (0.71)

Neural Information Processing Systems - Feb-8-2026, 20:52:56 GMT

Supplementary Material for HandMeThat: Human-Robot Communication in Physical and Social Environments

In Section B, we summarize the statistics of the dataset. A.1 Object Space: Recall that HandMeThat uses an object-centric representation for states. Object hierarchy. HandMeThat classifies all categories into 5 classes: location, receptacle, food, tool, and thing. Each class (except for "location") is composed of multiple subclasses, and each subclass contains several object categories. In total, there are 155 object categories.

artificial intelligence, loc-location action, refrigerator, (15 more...)

Country: North America > United States > Massachusetts > Middlesex County > Cambridge (0.04)

Technology: Information Technology > Artificial Intelligence > Representation & Reasoning (0.69)

arXiv.org Artificial Intelligence - Nov-19-2025

SkillGen: Learning Domain Skills for In-Context Sequential Decision Making

Ding, Ruomeng, Cheng, Wei, Shao, Minglai, Zhao, Chen

Large language models (LLMs) are increasingly applied to sequential decision-making through in-context learning (ICL), yet their effectiveness is highly sensitive to prompt quality. Effective prompts should meet three principles: focus on decision-critical information, provide step-level granularity, and minimize reliance on expert annotations through label efficiency. However, existing ICL methods often fail to satisfy all three criteria simultaneously. Motivated by these challenges, we introduce SkillGen, a skill-based ICL framework for structured sequential reasoning. It constructs an action-centric, domain-level graph from sampled trajectories, identifies high-utility actions via temporal-difference credit assignment, and retrieves step-wise skills to generate fine-grained, context-aware prompts. We further present a theoretical analysis showing that focusing on high-utility segments supports task identifiability and informs more effective ICL prompt design. Experiments on ALFWorld, BabyAI, and ScienceWorld, using both open-source and proprietary LLMs, show that SkillGen achieves consistent gains, improving progress rate by 5.9%-16.5% on average across models.

large language model, machine learning, reinforcement learning, (21 more...)

2511.1467

Country: Europe > Austria (0.27)

Genre: Research Report > New Finding (1.00)

Industry: Education (0.45)

Technology:

Information Technology > Artificial Intelligence > Natural Language > Large Language Model (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Reinforcement Learning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.69)
Information Technology > Artificial Intelligence > Machine Learning > Learning Graphical Models > Undirected Networks > Markov Models (0.45)

arXiv.org Artificial Intelligence - Oct-29-2025

Group-in-Group Policy Optimization for LLM Agent Training

Feng, Lang, Xue, Zhenghai, Liu, Tingcong, An, Bo

Recent advances in group-based reinforcement learning (RL) have driven frontier large language models (LLMs) in single-turn tasks like mathematical reasoning. However, their scalability to multi-turn LLM agent training remains limited. Unlike static tasks, agent-environment interactions unfold over many steps and often yield sparse or delayed rewards, making credit assignment across individual steps significantly more challenging. In this work, we propose Group-in-Group Policy Optimization (GiGPO), a novel RL algorithm that achieves fine-grained credit assignment for LLM agents while preserving the appealing properties of group-based RL: critic-free, low memory, and stable convergence. GiGPO introduces a two-level structure for estimating relative advantage: (i) at the episode level, GiGPO computes macro relative advantages based on groups of complete trajectories; (ii) at the step level, GiGPO introduces an anchor state grouping mechanism that retroactively constructs step-level groups by identifying repeated environment states across trajectories. Actions stemming from the same state are grouped together, enabling micro relative advantage estimation. This hierarchical structure effectively captures both global trajectory quality and local step effectiveness without relying on auxiliary models or additional rollouts. We evaluate GiGPO on challenging agent benchmarks, including ALFWorld and WebShop, as well as tool-integrated reasoning on search-augmented QA tasks, using Qwen2.5-1.5B/3B/7B-Instruct. Crucially, GiGPO delivers fine-grained per-step credit signals, achieves performance gains of > 12% on ALFWorld and > 9% on WebShop over GRPO, and obtains superior performance on QA tasks (42.1% on 3B and 47.2% on 7B), all while maintaining the same GPU memory overhead and identical LLM rollout, and incurring little to no additional time cost.
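The two-level grouping described in this abstract can be illustrated with a small sketch. This is not the authors' implementation: the function names `normalize` and `two_level_advantages`, the use of discounted return-to-go as the step-level score, the mean/standard-deviation normalization, and the `gamma` and `omega` parameters are all assumptions made for illustration. The abstract only states that complete trajectories form episode-level groups and that actions taken from the same repeated state form step-level groups.

```python
from collections import defaultdict
from statistics import mean, pstdev


def normalize(values, eps=1e-6):
    """Group-relative score: (value - group mean) / (group std + eps)."""
    mu = mean(values)
    sigma = pstdev(values)
    return [(v - mu) / (sigma + eps) for v in values]


def two_level_advantages(trajectories, gamma=0.95, omega=1.0):
    """Combine episode-level and step-level relative advantages.

    trajectories: list of trajectories for the same task, each a list of
    (state, action, reward) tuples. States must be hashable so that
    repeated states can be matched across trajectories.
    gamma and omega are illustrative assumptions, not values from the source.
    """
    # Episode level: one relative advantage per complete trajectory.
    totals = [sum(r for _, _, r in traj) for traj in trajectories]
    episode_adv = normalize(totals)

    # Step level: bucket every step by the state it started from.
    groups = defaultdict(list)  # state -> [(traj_idx, step_idx, return_to_go)]
    for i, traj in enumerate(trajectories):
        running = 0.0
        returns = [0.0] * len(traj)
        for t in range(len(traj) - 1, -1, -1):
            running = traj[t][2] + gamma * running
            returns[t] = running
        for t, (state, _, _) in enumerate(traj):
            groups[state].append((i, t, returns[t]))

    step_adv = [[0.0] * len(traj) for traj in trajectories]
    for members in groups.values():
        if len(members) < 2:
            continue  # a state seen once carries no relative signal
        scores = normalize([ret for _, _, ret in members])
        for (i, t, _), score in zip(members, scores):
            step_adv[i][t] = score

    # Per-step signal: trajectory-wide term plus weighted local term.
    return [
        [episode_adv[i] + omega * step_adv[i][t] for t in range(len(traj))]
        for i, traj in enumerate(trajectories)
    ]
```

Under these assumptions, two trajectories that share a starting state but diverge afterwards receive different signals at that shared step, while steps from states visited only once fall back to the episode-level term alone. No extra rollouts or learned critic are involved; the groups are built after the fact from trajectories that were already collected.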

large language model, machine learning, natural language, (16 more...)

2505.10978

Country: North America > United States > California (0.28)

Genre: Workflow (1.00)

Industry: Leisure & Entertainment > Games (0.46)

Technology:

Information Technology > Artificial Intelligence > Natural Language > Large Language Model (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (1.00)

Neural Information Processing Systems - Oct-10-2025, 15:54:41 GMT

Appendix for Image Textualization: An Automatic Framework for Creating Accurate and Detailed Image Descriptions

In Table 3, we show the prompt for textualized recaptioning.

original description, relative size proportion, relative spatial positioning, (12 more...)

Technology: Information Technology > Artificial Intelligence > Natural Language (0.71)

arXiv.org Artificial Intelligence - Feb-4-2025

AdaptBot: Combining LLM with Knowledge Graphs and Human Input for Generic-to-Specific Task Decomposition and Knowledge Refinement

Singh, Shivam, Swaminathan, Karthik, Dash, Nabanita, Singh, Ramandeep, Banerjee, Snehasis, Sridharan, Mohan, Krishna, Madhava

Embodied agents assisting humans are often asked to complete a new task in a new scenario. An agent preparing a particular dish in the kitchen based on a known recipe may be asked to prepare a new dish or to perform cleaning tasks in the storeroom. There may not be sufficient resources, e.g., time or labeled examples, to train the agent for these new situations. Large Language Models (LLMs) trained on considerable knowledge across many domains are able to predict a sequence of abstract actions for such new tasks and scenarios, although it may not be possible for the agent to execute this action sequence due to task-, agent-, or domain-specific constraints. Our framework addresses these challenges by leveraging the generic predictions provided by an LLM and the prior domain-specific knowledge encoded in a Knowledge Graph (KG), enabling an agent to quickly adapt to new tasks and scenarios. The agent also solicits and uses human input as needed to refine its existing knowledge. Based on experimental evaluation over cooking and cleaning tasks in simulation domains, we demonstrate that the interplay between the LLM, the KG, and human input leads to substantial performance gains compared with just using the LLM output.

artificial intelligence, large language model, natural language, (19 more...)

2502.02067

Country:

North America > United States > New York > New York County > New York City (0.04)
Asia > India > Telangana > Hyderabad (0.04)

Genre: Research Report (0.52)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning (1.00)
Information Technology > Artificial Intelligence > Natural Language > Large Language Model (1.00)

Engadget - Oct-17-2024, 14:00:50 GMT

A $105,000 robot arm nobody needs cooked me a delicious lunch

London's W1 is somewhere to go if you've got too much money to spend on something. Within minutes of each other, you can visit the city's priciest private doctor, buy a Steinway and a pair of designer glasses that cost more than my mortgage. Wigmore Street is also where the ultra rich go to buy a kitchen that Thorstein Veblen would weep at the sight of. It's also the new home of Moley Robotics, a company selling luxury kitchens and the robot arm that'll kinda/sorta do all of the cooking for you, too. Moley is the brainchild of Dr. Mark Oleynik and is one part kitchen showroom and one part robot lab. It's a spartan space with three demo kitchens, a wide dining table and some display units showing you the different types of artisan marble you can have for your countertop.

artificial intelligence, oleynik, robot, (15 more...)

Technology: Information Technology > Artificial Intelligence > Robots > Robot Planning & Action (0.61)

arXiv.org Artificial Intelligence - Oct-8-2024

ConceptAgent: LLM-Driven Precondition Grounding and Tree Search for Robust Task Planning and Execution

Rivera, Corban, Byrd, Grayson, Paul, William, Feldman, Tyler, Booker, Meghan, Holmes, Emma, Handelman, David, Kemp, Bethany, Badger, Andrew, Schmidt, Aurora, Jatavallabhula, Krishna Murthy, de Melo, Celso M, Seenivasan, Lalithkumar, Unberath, Mathias, Chellappa, Rama

Robotic planning and execution in open-world environments is a complex problem due to the vast state spaces and high variability of task embodiment. Recent advances in perception algorithms, combined with Large Language Models (LLMs) for planning, offer promising solutions to these challenges, as the common-sense reasoning capabilities of LLMs provide a strong heuristic for efficiently searching the action space. However, prior work fails to address the possibility of hallucinations from LLMs, which results in failures to execute the planned actions, largely due to logical fallacies at high or low levels. To contend with automation failure due to such hallucinations, we introduce ConceptAgent, a natural language-driven robotic platform designed for task execution in unstructured environments. With a focus on scalability and reliability of LLM-based planning in complex state and action spaces, we present innovations designed to limit these shortcomings, including 1) Predicate Grounding to prevent and recover from infeasible actions, and 2) an embodied version of LLM-guided Monte Carlo Tree Search with self-reflection. In simulation experiments, ConceptAgent achieved a 19% task completion rate across three room layouts and 30 easy-level embodied tasks, outperforming other state-of-the-art LLM-driven reasoning baselines that scored 10.26% and 8.11% on the same benchmark. Additionally, ablation studies on moderate to hard embodied tasks revealed a 20% increase in task completion from the baseline agent to the fully enhanced ConceptAgent, highlighting the individual and combined contributions of Predicate Grounding and LLM-guided Tree Search to enable more robust automation in complex state and action spaces.

action input, countertop, object action input, (13 more...)

2410.06108

Genre: Research Report > Promising Solution (0.34)

Industry:

Energy (0.46)
Government (0.46)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning (1.00)
Information Technology > Artificial Intelligence > Natural Language > Large Language Model (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.68)