AITopics | universal goal

The paper presents a semi-parametric model for long-term planning in a general space of problems. It works by training parametric goal-conditioned policies that are accurate only over local distances (i.e. when the current state and the goal are within some distance threshold) and leveraging the replay buffer to non-parametrically sample a graph of landmarks between which the local goal-conditioned policy can accurately produce paths. Moving to any goal state is then accomplished by (1) moving to the closest landmark using the goal-conditioned policy, (2) planning a path to the landmark closest to the goal using value iteration on the (low-dimensional) graph of landmarks, and (3) using the goal-conditioned policy to get from that landmark to the goal state. The paper essentially tackles the problem that goal-conditioned policies, or Universal Value Function Approximators (UVFA), degrade substantially in performance as the planning horizon increases. By leveraging the replay buffer to provide way-points for the algorithm to plan locally along, accuracy over longer ranges is maintained.
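Step (2) of the procedure described above can be sketched in Python. This is a minimal illustration under stated assumptions, not the authors' implementation: the function name `plan_over_landmarks`, the matrix `dist` of pairwise local distance estimates, and the threshold `max_edge` are all hypothetical, and distances between distinct landmarks are assumed to be strictly positive.

```python
import math

def plan_over_landmarks(dist, start, goal, max_edge):
    """Return a list of landmark indices from start to goal, or None.

    dist[i][j] is a hypothetical local distance estimate between landmarks
    i and j. Estimates above max_edge are treated as unreliable and dropped.
    """
    n = len(dist)
    # Keep only edges short enough for the local policy to be trusted.
    edge = [[dist[i][j] if i != j and dist[i][j] <= max_edge else math.inf
             for j in range(n)] for i in range(n)]
    # Value iteration on cost-to-go: cost[i] = min_j (edge[i][j] + cost[j]).
    cost = [math.inf] * n
    cost[goal] = 0.0
    for _ in range(n - 1):
        updated = [min([cost[i]] + [edge[i][j] + cost[j] for j in range(n)])
                   for i in range(n)]
        if updated == cost:
            break
        cost = updated
    if math.isinf(cost[start]):
        return None  # goal landmark is not reachable through trusted edges
    # Greedy rollout: step to the neighbour minimising edge + cost-to-go.
    path = [start]
    while path[-1] != goal:
        i = path[-1]
        path.append(min(range(n), key=lambda j: edge[i][j] + cost[j]))
    return path
```

With four landmarks at positions 0, 1, 2 and 3 on a line and `max_edge=1.5`, only neighbouring landmarks are connected, so the planner has to chain the short hops rather than jump straight to the goal.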

goal-conditioned policy, navigation, replay buffer, (11 more...)

Neural Information Processing Systems

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning (0.51)
Information Technology > Artificial Intelligence > Machine Learning (0.42)

Reviews: Mapping State Space using Landmarks for Universal Goal Reaching

Neural Information Processing Systems, Jan-23-2025, 01:47:04 GMT

Reviewers liked the approach of combining local navigation using a UVFA trained by HER with global planning based on shortest paths in a graph constructed from a buffer of landmarks. At the same time, reviewers had some concerns regarding clarity of presentation, the similarity of the proposed approach to existing work (namely "Semi-parametric topological memory for navigation" by Savinov et al.), as well as how specific the proposed algorithm is to navigation problems. The rebuttal provided additional experiments on several new domains and included additional discussion of related work, leading one reviewer to raise the score. In the end the ACs found this work to be sufficiently new and promising to warrant acceptance, but we ask the authors to 1) address the concerns regarding related work including Savinov et al. and 2) include a clearer statement of the full approach (an algorithm block would be great) in the camera ready version.

artificial intelligence, mapping state space, universal goal, (3 more...)

Neural Information Processing Systems

Technology: Information Technology > Artificial Intelligence > Representation & Reasoning (0.40)

Mapping State Space using Landmarks for Universal Goal Reaching

Neural Information Processing Systems, Oct-9-2024, 20:59:38 GMT

An agent that has well understood the environment should be able to apply its skills for any given goals, leading to the fundamental problem of learning the Universal Value Function Approximator (UVFA). A UVFA learns to predict the cumulative rewards between all state-goal pairs. However, empirically, the value function for long-range goals is always hard to estimate and may consequently result in failed policy. This has presented challenges to the learning process and the capability of neural networks. We propose a method to address this issue in large MDPs with sparse rewards, in which exploration and routing across remote states are both extremely challenging.

landmark, mapping state space, universal goal, (3 more...)

Neural Information Processing Systems

Technology: Information Technology > Artificial Intelligence > Machine Learning (1.00)

Semantic-guided Prompt Organization for Universal Goal Hijacking against LLMs

Huang, Yihao, Wang, Chong, Jia, Xiaojun, Guo, Qing, Juefei-Xu, Felix, Zhang, Jian, Pu, Geguang, Liu, Yang

arXiv.org Artificial Intelligence, May-23-2024

With the rising popularity of Large Language Models (LLMs), assessing their trustworthiness through security tasks has gained critical importance. Regarding the new task of universal goal hijacking, previous efforts have concentrated solely on optimization algorithms, overlooking the crucial role of the prompt. To fill this gap, we propose a universal goal hijacking method called POUGH that incorporates semantic-guided prompt processing strategies. Specifically, the method starts with a sampling strategy to select representative prompts from a candidate pool, followed by a ranking strategy that prioritizes the prompts. Once the prompts are organized sequentially, the method employs an iterative optimization algorithm to generate the universal fixed suffix for the prompts. Experiments conducted on four popular LLMs and ten types of target responses verified the effectiveness of our method.

dataset, target response, universal goal, (15 more...)

arXiv.org Artificial Intelligence

2405.14189

Country:

Asia > Singapore (0.04)
South America > Colombia > Meta Department > Villavicencio (0.04)
North America > United States > New York (0.04)
(6 more...)

Genre: Research Report (0.82)

Industry:

Information Technology > Security & Privacy (1.00)
Health & Medicine > Consumer Health (1.00)
Government (1.00)
Law Enforcement & Public Safety > Terrorism (0.85)

Technology:

Information Technology > Artificial Intelligence > Natural Language > Large Language Model (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (1.00)

Example-Driven Intent Prediction with Observers

Mehri, Shikib, Eric, Mihail, Hakkani-Tur, Dilek

arXiv.org Artificial Intelligence, Oct-16-2020

A key challenge of dialog systems research is to effectively and efficiently adapt to new domains. A scalable paradigm for adaptation necessitates the development of generalizable models that perform well in few-shot settings. In this paper, we focus on the intent classification problem which aims to identify user intents given utterances addressed to the dialog system. We propose two approaches for improving the generalizability of utterance classification models: (1) example-driven training and (2) observers. Example-driven training learns to classify utterances by comparing to examples, thereby using the underlying encoder as a sentence similarity model. Prior work has shown that BERT-like models tend to attribute a significant amount of attention to the [CLS] token, which we hypothesize results in diluted representations. Observers are tokens that are not attended to, and are an alternative to the [CLS] token. The proposed methods attain state-of-the-art results on three intent prediction datasets (Banking, Clinc, and HWU) in both the full data and few-shot (10 examples per intent) settings. Furthermore, we demonstrate that the proposed approach can transfer to new intents and across datasets without any additional training.
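The idea of classifying an utterance by comparing it to labelled examples can be illustrated with a small Python sketch. It is a simplified nearest-example variant rather than the method from the paper: the names `cosine` and `predict_intent` are hypothetical, and the embeddings are assumed to come from some sentence encoder that is not shown here.

```python
import math

def cosine(u, v):
    """Cosine similarity between two equal-length vectors (0.0 if either is zero)."""
    dot = sum(a * b for a, b in zip(u, v))
    norm = math.sqrt(sum(a * a for a in u)) * math.sqrt(sum(b * b for b in v))
    return dot / norm if norm else 0.0

def predict_intent(query_vec, examples):
    """Return the intent whose best-matching example is most similar to the query.

    examples is a list of (embedding, intent_label) pairs; the embeddings are
    assumed to be produced by a sentence encoder outside this sketch.
    """
    best = {}
    for vec, label in examples:
        score = cosine(query_vec, vec)
        # Track the highest similarity seen so far for each intent.
        best[label] = max(score, best.get(label, -1.0))
    return max(best, key=best.get)
```

Because the prediction depends only on the example set passed in, adding a new intent in this sketch amounts to supplying examples for it, with no change to the scoring code.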

arxiv preprint arxiv, observer, representation, (13 more...)

arXiv.org Artificial Intelligence

2010.08684

Country:

Oceania > Australia > Victoria > Melbourne (0.04)
North America > United States > Pennsylvania > Allegheny County > Pittsburgh (0.04)
Europe > Lithuania > Vilnius County > Vilnius (0.04)
(3 more...)

Genre: Research Report (1.00)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning (1.00)
Information Technology > Artificial Intelligence > Machine Learning (1.00)
Information Technology > Artificial Intelligence > Natural Language > Discourse & Dialogue (0.55)

Mapping State Space using Landmarks for Universal Goal Reaching

Huang, Zhiao, Liu, Fangchen, Su, Hao

Neural Information Processing Systems, Jan-11-2020, 09:51:37 GMT

An agent that has well understood the environment should be able to apply its skills for any given goals, leading to the fundamental problem of learning the Universal Value Function Approximator (UVFA). A UVFA learns to predict the cumulative rewards between all state-goal pairs. However, empirically, the value function for long-range goals is always hard to estimate and may consequently result in failed policy. This has presented challenges to the learning process and the capability of neural networks. We propose a method to address this issue in large MDPs with sparse rewards, in which exploration and routing across remote states are both extremely challenging.

landmark, mapping state space, universal goal, (3 more...)

Neural Information Processing Systems

Technology: Information Technology > Artificial Intelligence > Machine Learning (1.00)

Mapping State Space using Landmarks for Universal Goal Reaching

Huang, Zhiao, Liu, Fangchen, Su, Hao

Neural Information Processing Systems, Dec-31-2019

An agent that has well understood the environment should be able to apply its skills for any given goals, leading to the fundamental problem of learning the Universal Value Function Approximator (UVFA). A UVFA learns to predict the cumulative rewards between all state-goal pairs. However, empirically, the value function for long-range goals is always hard to estimate and may consequently result in failed policy. This has presented challenges to the learning process and the capability of neural networks. We propose a method to address this issue in large MDPs with sparse rewards, in which exploration and routing across remote states are both extremely challenging. Our method explicitly models the environment in a hierarchical manner, with a high-level dynamic landmark-based map abstracting the visited state space, and a low-level value network to derive precise local decisions. We use farthest point sampling to select landmark states from past experience, which has improved exploration compared with simple uniform sampling. Experimentally we showed that our method enables the agent to reach long-range goals at the early training stage, and achieve better performance than standard RL algorithms for a number of challenging tasks.
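Farthest point sampling, which the abstract names as the landmark selection strategy, can be sketched in Python as below. This is a generic version under stated assumptions: it uses Euclidean distance on raw coordinates for simplicity, whereas the distance the authors actually sample with is not given in this abstract, and the function name and the `first` parameter are hypothetical.

```python
import math

def farthest_point_sampling(points, k, first=0):
    """Pick k indices so each new pick is the point farthest from those chosen."""
    if not 0 < k <= len(points):
        raise ValueError("k must be between 1 and len(points)")
    chosen = [first]
    # nearest[i] = distance from point i to its closest already-chosen point.
    nearest = [math.dist(p, points[first]) for p in points]
    while len(chosen) < k:
        # The next landmark is the point least covered by the current set.
        nxt = max(range(len(points)), key=nearest.__getitem__)
        chosen.append(nxt)
        nearest = [min(d, math.dist(p, points[nxt]))
                   for d, p in zip(nearest, points)]
    return chosen
```

Compared with uniform sampling, this greedy rule tends to spread the selected points across the visited region, including sparsely visited areas near its boundary.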

algorithm, landmark, state space, (12 more...)

Neural Information Processing Systems

Country:

North America > United States > California > San Diego County > San Diego (0.04)
North America > United States > New York (0.04)
North America > Canada > British Columbia > Metro Vancouver Regional District > Vancouver (0.04)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Reinforcement Learning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.46)

Mapping State Space using Landmarks for Universal Goal Reaching

Huang, Zhiao, Liu, Fangchen, Su, Hao

arXiv.org Machine Learning, Aug-15-2019

An agent that has well understood the environment should be able to apply its skills for any given goals, leading to the fundamental problem of learning the Universal Value Function Approximator (UVFA). A UVFA learns to predict the cumulative rewards between all state-goal pairs. However, empirically, the value function for long-range goals is always hard to estimate and may consequently result in failed policy. This has presented challenges to the learning process and the capability of neural networks. We propose a method to address this issue in large MDPs with sparse rewards, in which exploration and routing across remote states are both extremely challenging. Our method explicitly models the environment in a hierarchical manner, with a high-level dynamic landmark-based map abstracting the visited state space, and a low-level value network to derive precise local decisions. We use farthest point sampling to select landmark states from past experience, which has improved exploration compared with simple uniform sampling. Experimentally we showed that our method enables the agent to reach long-range goals at the early training stage, and achieve better performance than standard RL algorithms for a number of challenging tasks.

artificial intelligence, machine learning, reinforcement learning, (13 more...)

arXiv.org Machine Learning

1908.05451

Genre: Research Report (0.40)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Reinforcement Learning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.46)
