AITopics | action sequence

Collaborating Authors

action sequence

Information about AI from the News, Publications, and Conferences

Automatic Classification – Tagging and Summarization – Customizable Filtering and Analysis

If you are looking for an answer to the question What is Artificial Intelligence? and you only have a minute, then here's the definition the Association for the Advancement of Artificial Intelligence offers on its home page: "the scientific understanding of the mechanisms underlying thought and intelligent behavior and their embodiment in machines."

However, if you are fortunate enough to have more than a minute, then please get ready to embark upon an exciting journey exploring AI (but beware, it could last a lifetime) …

Improving planning and MBRL with temporally-extended actions

Neural Information Processing SystemsJun-22-2026, 19:56:34 GMT

Continuous time systems are often modeled using discrete time dynamics but this requires a small simulation step to maintain accuracy. In turn, this requires a large planning horizon which leads to computationally demanding planning problems and reduced performance. Previous work in model-free reinforcement learning has partially addressed this issue using action repeats where a policy is learned to determine a discrete action duration. Instead we propose to control the continuous decision timescale directly by using temporally-extended actions and letting the planner treat the duration of the action as an additional optimization variable along with the standard action variables. This additional structure has multiple advantages. It speeds up simulation time of trajectories and, importantly, it allows for deep horizon search in terms of primitive actions while using a shallow search depth in the planner. In addition, in the model-based reinforcement learning (MBRL) setting, it reduces compounding errors from model learning and improves training time for models. We show that this idea is effective and that the range for action durations can be automatically selected using a multi-armed bandit formulation and integrated into the MBRL framework. An extensive experimental evaluation both in planning and in MBRL, shows that our approach yields faster planning, better solutions, and that it enables solutions to problems that are not solved in the standard formulation.

data mining, machine learning, reinforcement learning, (21 more...)

Neural Information Processing Systems

Genre: Research Report > Experimental Study (1.00)

Industry: Education > Educational Setting > Online (0.46)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Reinforcement Learning (0.87)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks (0.68)
Information Technology > Data Science > Data Mining > Big Data (0.66)

Add feedback

Coarse-to-fine Q-Network with Action Sequence for Data-Efficient Reinforcement Learning

Neural Information Processing SystemsJun-20-2026, 18:59:34 GMT

Predicting a sequence of actions has been crucial in the success of recent behavior cloning algorithms in robotics. Can similar ideas improve reinforcement learning (RL)? We answer affirmatively by observing that incorporating action sequences when predicting ground-truth return-to-go leads to lower validation loss. Motivated by this, we introduce Coarse-to-fine Q-Network with Action Sequence (CQN-AS), a novel value-based RL algorithm that learns a critic network that outputs Q-values over a sequence of actions, i.e., explicitly training the value function to learn the consequence of executing action sequences. Our experiments show that CQN-AS outperforms several baselines on a variety of sparse-reward humanoid control and tabletop manipulation tasks from BiGym and RLBench. Code is available at: https://younggyo.me/cqn-as/

artificial intelligence, machine learning, reinforcement learning, (15 more...)

Neural Information Processing Systems

Genre:

Workflow (1.00)
Research Report > Experimental Study (1.00)
Research Report > New Finding (0.66)

Technology:

Information Technology > Artificial Intelligence > Robots (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Reinforcement Learning (1.00)

Add feedback

Fairness-Regularized Online Optimization with Switching Costs

Neural Information Processing SystemsJun-15-2026, 09:50:52 GMT

Fairness and action smoothness are two crucial considerations in many online optimization problems, but they have yet to be addressed simultaneously. In this paper, we study a new and challenging setting of fairness-regularized smoothed online convex optimization with switching costs. First, to highlight the fundamental challenges introduced by the long-term fairness regularizer evaluated based on the entire sequence of actions, we prove that even without switching costs, no online algorithms can possibly achieve a sublinear regret or finite competitive ratio compared to the offline optimal algorithm as the problem episode length T increases. Then, we propose FairOBD(Fairness-regularized Online Balanced Descent), which reconciles the tension between minimizing the hitting cost, switching cost, and fairness cost.

artificial intelligence, machine learning, natural language, (18 more...)

Neural Information Processing Systems

Country: North America > United States (1.00)

Genre:

Research Report > New Finding (1.00)
Research Report > Experimental Study (1.00)

Industry:

Health & Medicine (1.00)
Energy > Power Industry (1.00)
Information Technology > Services (0.94)
Government > Regional Government > North America Government > United States Government (0.67)

Technology:

Information Technology > Artificial Intelligence > Natural Language (0.93)
Information Technology > Artificial Intelligence > Representation & Reasoning > Optimization (0.66)
Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (0.46)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks (0.45)

Add feedback

Coarse-to-fine Q-Network with Action Sequence for Data-Efficient Reinforcement Learning

Neural Information Processing SystemsJun-13-2026, 13:32:39 GMT

artificial intelligence, machine learning, reinforcement learning, (7 more...)

Neural Information Processing Systems

Technology: Information Technology > Artificial Intelligence > Machine Learning > Reinforcement Learning (0.66)

Add feedback

ASimple Decentralized Cross-Entropy Method

Neural Information Processing SystemsApr-28-2026, 07:27:09 GMT

Cross-Entropy Method (CEM) is commonly used for planning in model-based reinforcement learning (MBRL) where a centralized approach is typically utilized to update the sampling distribution based on only the top-k operations' results on samples. In this paper, we show that such a centralized approach makes CEM vulnerable to local optima, thus impairing its sample efficiency. To tackle this issue, we propose Decentralized CEM (DecentCEM), a simple but effective improvement over classical CEM, by using an ensemble of CEM instances running independently from one another, and each performing a local improvement of its own sampling distribution. We provide both theoretical and empirical analysis to demonstrate the effectiveness of this simple decentralized approach. We empirically show that, compared to the classical centralized approach using either a single or even a mixture of Gaussian distributions, our DecentCEM finds the global optimum much more consistently thus improves the sample efficiency. Furthermore, we plug in our DecentCEM in the planning problem of MBRL, and evaluate our approach in several continuous control environments, with comparison to the stateof-art CEM based MBRL approaches (PETS and POPLIN). Results show sample efficiency improvement by simply replacing the classical CEM module with our DecentCEM module, while only sacrificing a reasonable amount of computational cost. Lastly, we conduct ablation studies for more in-depth analysis.

artificial intelligence, machine learning, policy network, (18 more...)

Neural Information Processing Systems

Country: North America > United States (0.28)

Genre: Research Report > New Finding (0.66)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning > Optimization (1.00)
Information Technology > Artificial Intelligence > Machine Learning (1.00)

Add feedback

09ae6beae5f1ff38f05c05979097ea0f-Paper-Conference.pdf

Neural Information Processing SystemsApr-24-2026, 14:14:41 GMT

artificial intelligence, machine learning, reinforcement learning, (16 more...)

Neural Information Processing Systems

Country:

Europe (0.46)
North America > United States (0.28)

Genre:

Workflow (0.70)
Research Report > New Finding (0.46)
Research Report > Experimental Study (0.46)

Industry:

Health & Medicine > Therapeutic Area > Infections and Infectious Diseases (0.46)
Leisure & Entertainment > Games (0.45)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning > Search (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Reinforcement Learning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks (0.93)

Add feedback

Bridging the Gap Between Value and Policy Based Reinforcement Learning

Neural Information Processing SystemsMar-17-2026, 18:55:49 GMT

We establish a new connection between value and policy based reinforcement learning (RL) based on a relationship between softmax temporal value consistency and policy optimality under entropy regularization. Specifically, we show that softmax consistent action values correspond to optimal entropy regularized policy probabilities along any action sequence, regardless of provenance. From this observation, we develop a new RL algorithm, Path Consistency Learning (PCL), that minimizes a notion of soft consistency error along multi-step action sequences extracted from both on-and off-policy traces. We examine the behavior of PCL in different scenarios and show that PCL can be interpreted as generalizing both actor-critic and Q-learning algorithms. We subsequently deepen the relationship by showing how a single model can be used to represent both a policy and the corresponding softmax state values, eliminating the need for a separate critic. The experimental evaluation demonstrates that PCL significantly outperforms strong actor-critic and Q-learning baselines across several benchmarks.

artificial intelligence, machine learning, reinforcement learning, (6 more...)

Neural Information Processing Systems

Technology: Information Technology > Artificial Intelligence > Machine Learning > Reinforcement Learning (1.00)

Add feedback

Dialog-to-Action: Conversational Question Answering Over a Large-Scale Knowledge Base

Daya Guo, Duyu Tang, Nan Duan, Ming Zhou, Jian Yin

Neural Information Processing SystemsMar-16-2026, 09:41:56 GMT

We present an approach to map utterances in conversation to logical forms, which will be executed on a large-scale knowledge base. To handle enormous ellipsis phenomena in conversation, we introduce dialog memory management to manipulate historical entities, predicates, and logical forms when inferring the logical form of current utterances. Dialog memory management is embodied in a generative model, in which a logical form is interpreted in a top-down manner following a small and flexible grammar. We learn the model from denotations without explicit annotation of logical forms, and evaluate it on a large-scale dataset consisting of 200K dialogs over 12.8M entities. Results verify the benefits of modeling dialog memory, and show that our semantic parsing-based approach outperforms a memory network based encoder-decoder model by a huge margin.

machine learning, natural language, question answering, (18 more...)

Neural Information Processing Systems

Country: North America > United States (1.00)

Industry: Government (0.47)

Technology: