AITopics | goal distribution

Collaborating Authors

goal distribution

Information about AI from the News, Publications, and Conferences

Automatic Classification – Tagging and Summarization – Customizable Filtering and Analysis

If you are looking for an answer to the question What is Artificial Intelligence? and you only have a minute, then here's the definition the Association for the Advancement of Artificial Intelligence offers on its home page: "the scientific understanding of the mechanisms underlying thought and intelligent behavior and their embodiment in machines."

However, if you are fortunate enough to have more than a minute, then please get ready to embark upon an exciting journey exploring AI (but beware, it could last a lifetime) …

Goal-Conditioned On-Policy Reinforcement Learning Xudong Gong

Neural Information Processing SystemsNov-18-2025, 01:32:37 GMT

This limitation prevents HER from densifying the reward.

demonstration, gcpo, learning, (14 more...)

Neural Information Processing Systems

Country:

Asia > Myanmar > Tanintharyi Region > Dawei (0.04)
Asia > China > Hunan Province (0.04)

Genre:

Research Report > Experimental Study (1.00)
Research Report > New Finding (0.93)

Industry: Education (1.00)

Technology:

Information Technology > Artificial Intelligence > Robots (1.00)
Information Technology > Artificial Intelligence > Representation & Reasoning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Reinforcement Learning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks (0.93)

Add feedback

51c6e143b5da2bd6e4a618d8a5d7f38b-Paper-Conference.pdf

Neural Information Processing SystemsOct-10-2025, 02:35:20 GMT

demonstration, gcpo, learning, (13 more...)

Neural Information Processing Systems

Country:

Asia > Myanmar > Tanintharyi Region > Dawei (0.04)
Asia > China > Hunan Province (0.04)

Genre:

Research Report > Experimental Study (1.00)
Research Report > New Finding (0.93)

Industry: Education (1.00)

Technology:

Information Technology > Artificial Intelligence > Robots (1.00)
Information Technology > Artificial Intelligence > Representation & Reasoning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks (0.93)
Information Technology > Artificial Intelligence > Machine Learning > Reinforcement Learning (0.70)

Add feedback

f-Policy Gradients: A General Framework for Goal Conditioned RL using f-Divergences

Neural Information Processing SystemsOct-8-2025, 07:59:56 GMT

We derive gradients for various f-divergences to optimize this objective.

agent, gradient, state-visitation distribution, (13 more...)

Neural Information Processing Systems

Country:

North America > United States > Texas > Travis County > Austin (0.04)
North America > United States > Massachusetts > Middlesex County > Cambridge (0.04)
Europe > United Kingdom > England > Cambridgeshire > Cambridge (0.04)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Reinforcement Learning (1.00)
Information Technology > Artificial Intelligence > Robots (0.93)
Information Technology > Artificial Intelligence > Representation & Reasoning (0.93)

Add feedback

DefFusionNet: Learning Multimodal Goal Shapes for Deformable Object Manipulation via a Diffusion-based Probabilistic Model

Thach, Bao, Kim, Siyeon, Jordan, Britton, Shanthi, Mohanraj, Watts, Tanner, Ho, Shing-Hei, Ferguson, James M., Hermans, Tucker, Kuntz, Alan

arXiv.org Artificial IntelligenceJun-24-2025

Deformable object manipulation is critical to many real-world robotic applications, ranging from surgical robotics and soft material handling in manufacturing to household tasks like laundry folding. At the core of this important robotic field is shape servoing, a task focused on controlling deformable objects into desired shapes. The shape servoing formulation requires the specification of a goal shape. However, most prior works in shape servoing rely on impractical goal shape acquisition methods, such as laborious domain-knowledge engineering or manual manipulation. DefGoalNet previously posed the current state-of-the-art solution to this problem, which learns deformable object goal shapes directly from a small number of human demonstrations. However, it significantly struggles in multi-modal settings, where multiple distinct goal shapes can all lead to successful task completion. As a deterministic model, DefGoalNet collapses these possibilities into a single averaged solution, often resulting in an unusable goal. In this paper, we address this problem by developing DefFusionNet, a novel neural network that leverages the diffusion probabilistic model to learn a distribution over all valid goal shapes rather than predicting a single deterministic outcome. This enables the generation of diverse goal shapes and avoids the averaging artifacts. We demonstrate our method's effectiveness on robotic tasks inspired by both manufacturing and surgical applications, both in simulation and on a physical robot. Our work is the first generative model capable of producing a diverse, multi-modal set of deformable object goals for real-world robotic applications.

artificial intelligence, machine learning, point cloud, (19 more...)

arXiv.org Artificial Intelligence

2506.18779

Country: North America > United States (0.28)

Genre: Research Report (1.00)

Industry:

Health & Medicine > Surgery (0.94)
Health & Medicine > Health Care Technology (0.68)

Technology:

Information Technology > Artificial Intelligence > Robots (1.00)
Information Technology > Artificial Intelligence > Representation & Reasoning > Uncertainty (0.71)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.46)

Add feedback

SDP Synthesis of Maximum Coverage Trees for Probabilistic Planning under Control Constraints

Aggarwal, Naman, How, Jonathan P.

arXiv.org Artificial IntelligenceMar-21-2024

The paper presents Maximal Covariance Backward Reachable Trees (MAXCOVAR BRT), which is a multi-query algorithm for planning of dynamic systems under stochastic motion uncertainty and constraints on the control input with explicit coverage guarantees. In contrast to existing roadmap-based probabilistic planning methods that sample belief nodes randomly and draw edges between them \cite{csbrm_tro2024}, under control constraints, the reachability of belief nodes needs to be explicitly established and is determined by checking the feasibility of a non-convex program. Moreover, there is no explicit consideration of coverage of the roadmap while adding nodes and edges during the construction procedure for the existing methods. Our contribution is a novel optimization formulation to add nodes and construct the corresponding edge controllers such that the generated roadmap results in provably maximal coverage under control constraints as compared to any other method of adding nodes and edges. We characterize formally the notion of coverage of a roadmap in this stochastic domain via introduction of the h-$\operatorname{BRS}$ (Backward Reachable Set of Distributions) of a tree of distributions under control constraints, and also support our method with extensive simulations on a 6 DoF model.

constraint, control sequence, node, (16 more...)

arXiv.org Artificial Intelligence

2403.14605

Country: North America > United States > Massachusetts (0.04)

Genre: Research Report (0.50)

Technology: Information Technology > Artificial Intelligence > Representation & Reasoning > Planning & Scheduling (0.46)

Add feedback

Goal Exploration Augmentation via Pre-trained Skills for Sparse-Reward Long-Horizon Goal-Conditioned Reinforcement Learning

Wu, Lisheng, Chen, Ke

arXiv.org Artificial IntelligenceDec-19-2023

Reinforcement learning (RL) often struggles to accomplish a sparse-reward long-horizon task in a complex environment. Goal-conditioned reinforcement learning (GCRL) has been employed to tackle this difficult problem via a curriculum of easy-to-reach sub-goals. In GCRL, exploring novel sub-goals is essential for the agent to ultimately find the pathway to the desired goal. How to explore novel sub-goals efficiently is one of the most challenging issues in GCRL. Several goal exploration methods have been proposed to address this issue but still struggle to find the desired goals efficiently. In this paper, we propose a novel learning objective by optimizing the entropy of both achieved and new goals to be explored for more efficient goal exploration in sub-goal selection based GCRL. To optimize this objective, we first explore and exploit the frequently occurring goal-transition patterns mined in the environments similar to the current task to compose skills via skill learning. Then, the pretrained skills are applied in goal exploration. Evaluation on a variety of spare-reward long-horizon benchmark tasks suggests that incorporating our method into several state-of-the-art GCRL baselines significantly boosts their exploration efficiency while improving or maintaining their performance. The source code is available at: https://github.com/GEAPS/GEAPS.

agent, exploration, goal exploration augmentation, (13 more...)

arXiv.org Artificial Intelligence

2210.16058

Country:

Europe > United Kingdom (0.14)
Asia > Middle East > Jordan (0.04)

Genre: Research Report (1.00)

Industry: Education (1.00)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Reinforcement Learning (1.00)

Add feedback

Denoising Heat-inspired Diffusion with Insulators for Collision Free Motion Planning

Chang, Junwoo, Ryu, Hyunwoo, Kim, Jiwoo, Yoo, Soochul, Choi, Jongeun, Seo, Joohwan, Prakash, Nikhil, Horowitz, Roberto

arXiv.org Artificial IntelligenceDec-9-2023

Diffusion models have risen as a powerful tool in robotics due to their flexibility and multi-modality. While some of these methods effectively address complex problems, they often depend heavily on inference-time obstacle detection and require additional equipment. Addressing these challenges, we present a method that, during inference time, simultaneously generates only reachable goals and plans motions that avoid obstacles, all from a single visual input. Central to our approach is the novel use of a collision-avoiding diffusion kernel for training. Through evaluations against behavior-cloning and classical diffusion models, our framework has proven its robustness. It is particularly effective in multi-modal environments, navigating toward goals and avoiding unreachable ones blocked by obstacles, while ensuring collision avoidance.

arxiv preprint arxiv, diffusion kernel, diffusion model, (10 more...)

arXiv.org Artificial Intelligence

2310.12609

Country:

North America > United States > California > Alameda County > Berkeley (0.04)
Asia > India (0.04)

Genre: Research Report (1.00)

Technology:

Information Technology > Artificial Intelligence > Machine Learning (1.00)
Information Technology > Artificial Intelligence > Representation & Reasoning > Planning & Scheduling (0.94)
Information Technology > Artificial Intelligence > Robots > Robot Planning & Action (0.88)

Add feedback

Variational Curriculum Reinforcement Learning for Unsupervised Discovery of Skills

Kim, Seongun, Lee, Kyowoon, Choi, Jaesik

arXiv.org Artificial IntelligenceOct-30-2023

Mutual information-based reinforcement learning (RL) has been proposed as a promising framework for retrieving complex skills autonomously without a task-oriented reward function through mutual information (MI) maximization or variational empowerment. However, learning complex skills is still challenging, due to the fact that the order of training skills can largely affect sample efficiency. Inspired by this, we recast variational empowerment as curriculum learning in goal-conditioned RL with an intrinsic reward function, which we name Variational Curriculum RL (VCRL). From this perspective, we propose a novel approach to unsupervised skill discovery based on information theory, called Value Uncertainty Variational Curriculum (VUVC). We prove that, under regularity conditions, VUVC accelerates the increase of entropy in the visited states compared to the uniform curriculum. We validate the effectiveness of our approach on complex navigation and robotic manipulation tasks in terms of sample efficiency and state coverage speed. We also demonstrate that the skills discovered by our method successfully complete a real-world robot navigation task in a zero-shot setup and that incorporating these skills with a global planner further increases the performance.

goal distribution, unsupervised discovery, variational curriculum reinforcement learning, (10 more...)

arXiv.org Artificial Intelligence

2310.19424

Country: North America > United States > Hawaii > Honolulu County > Honolulu (0.04)

Genre: Research Report (1.00)

Industry: Education (0.93)

Technology:

Information Technology > Artificial Intelligence > Robots (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Reinforcement Learning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks (0.67)

Add feedback

$f$-Policy Gradients: A General Framework for Goal Conditioned RL using $f$-Divergences

Agarwal, Siddhant, Durugkar, Ishan, Stone, Peter, Zhang, Amy

arXiv.org Artificial IntelligenceOct-10-2023

Goal-Conditioned Reinforcement Learning (RL) problems often have access to sparse rewards where the agent receives a reward signal only when it has achieved the goal, making policy optimization a difficult problem. Several works augment this sparse reward with a learned dense reward function, but this can lead to sub-optimal policies if the reward is misaligned. Moreover, recent works have demonstrated that effective shaping rewards for a particular problem can depend on the underlying learning algorithm. This paper introduces a novel way to encourage exploration called $f$-Policy Gradients, or $f$-PG. $f$-PG minimizes the f-divergence between the agent's state visitation distribution and the goal, which we show can lead to an optimal policy. We derive gradients for various f-divergences to optimize this objective. Our learning paradigm provides dense learning signals for exploration in sparse reward settings. We further introduce an entropy-regularized policy optimization objective, that we call $state$-MaxEnt RL (or $s$-MaxEnt RL) as a special case of our objective. We show that several metric-based shaping rewards like L2 can be used with $s$-MaxEnt RL, providing a common ground to study such metric-based shaping rewards with efficient exploration. We find that $f$-PG has better performance compared to standard policy gradient methods on a challenging gridworld as well as the Point Maze and FetchReach environments. More information on our website https://agarwalsiddhant10.github.io/projects/fpg.html.

agent, gradient, state-visitation distribution, (13 more...)

arXiv.org Artificial Intelligence

2310.06794

Country:

North America > United States > Texas > Travis County > Austin (0.04)
North America > United States > Massachusetts > Middlesex County > Cambridge (0.04)
Europe > United Kingdom > England > Cambridgeshire > Cambridge (0.04)

Genre: Research Report (0.66)

Technology: Information Technology > Artificial Intelligence > Machine Learning > Reinforcement Learning (1.00)

Add feedback

Filters

Collaborating Authors

goal distribution

Information about AI from the News, Publications, and Conferences

Automatic Classification – Tagging and Summarization – Customizable Filtering and Analysis

27f4d95417bb722201597bf4d67cbacc-Paper-Conference.pdf

Goal-Conditioned On-Policy Reinforcement Learning Xudong Gong

51c6e143b5da2bd6e4a618d8a5d7f38b-Paper-Conference.pdf

f-Policy Gradients: A General Framework for Goal Conditioned RL using f-Divergences

DefFusionNet: Learning Multimodal Goal Shapes for Deformable Object Manipulation via a Diffusion-based Probabilistic Model

SDP Synthesis of Maximum Coverage Trees for Probabilistic Planning under Control Constraints

Goal Exploration Augmentation via Pre-trained Skills for Sparse-Reward Long-Horizon Goal-Conditioned Reinforcement Learning

Denoising Heat-inspired Diffusion with Insulators for Collision Free Motion Planning

Variational Curriculum Reinforcement Learning for Unsupervised Discovery of Skills

$f$-Policy Gradients: A General Framework for Goal Conditioned RL using $f$-Divergences