AITopics | mountain car

Finite-Time Performance Bounds and Adaptive Learning Rate Selection for Two Time-Scale Reinforcement Learning

Harsh Gupta, R. Srikant, Lei Ying

Neural Information Processing Systems, Feb-14-2026, 18:11:57 GMT

Neural Information Processing Systems http://nips.cc/

artificial intelligence, machine learning, reinforcement learning, (14 more...)

Country:

North America > United States > Illinois (0.05)
North America > United States > Michigan > Washtenaw County > Ann Arbor (0.04)
North America > United States > Massachusetts > Middlesex County > Belmont (0.04)
North America > Canada (0.04)

Technology: Information Technology > Artificial Intelligence > Machine Learning > Reinforcement Learning (1.00)

Environment Agnostic Goal-Conditioning, A Study of Reward-Free Autonomous Learning

Åström, Hampus, Topp, Elin Anna, Malec, Jacek

arXiv.org Artificial Intelligence, Nov-7-2025

In this paper we study how transforming regular reinforcement learning environments into goal-conditioned environments can let agents learn to solve tasks autonomously and reward-free. We show that an agent can learn to solve tasks by selecting its own goals in an environment-agnostic way, at training times comparable to externally guided reinforcement learning. Our method is independent of the underlying off-policy learning algorithm. Since our method is environment-agnostic, the agent does not value any goals higher than others, leading to instability in performance for individual goals. However, in our experiments, we show that the average goal success rate improves and stabilizes. An agent trained with this method can be instructed to seek any observations made in the environment, enabling generic training of agents prior to specific use cases.
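
The abstract's idea of turning a regular environment into a goal-conditioned one can be sketched as a thin wrapper. This is a minimal illustration under stated assumptions, not the authors' implementation: `LineWorld`, `GoalConditioned`, the exact-match success test, and the 0/1 internal reward are all hypothetical names and choices made for this example.

```python
import random


class LineWorld:
    """Hypothetical toy environment: an agent on integer positions 0..size-1."""

    def __init__(self, size=7):
        self.size = size
        self.pos = 0

    def reset(self):
        self.pos = 0
        return self.pos

    def step(self, action):
        # Action 1 moves right, any other action moves left; walls clamp the position.
        delta = 1 if action == 1 else -1
        self.pos = max(0, min(self.size - 1, self.pos + delta))
        return self.pos


class GoalConditioned:
    """Wraps an environment so that the only reward is reaching a goal
    drawn from the agent's own past observations (no external reward)."""

    def __init__(self, env, rng):
        self.env = env
        self.rng = rng
        self.seen = []   # distinct observations made so far
        self.goal = None

    def _remember(self, obs):
        if obs not in self.seen:
            self.seen.append(obs)

    def reset(self):
        obs = self.env.reset()
        self._remember(obs)
        # Uniform over distinct observations: no goal is valued above another.
        self.goal = self.rng.choice(self.seen)
        return obs, self.goal

    def step(self, action):
        obs = self.env.step(action)
        self._remember(obs)
        reached = obs == self.goal
        return (obs, self.goal), (1.0 if reached else 0.0), reached


env = GoalConditioned(LineWorld(), random.Random(0))
obs, goal = env.reset()
(obs, goal), reward, done = env.step(1)
print(obs, goal, reward, done)
```

Because the wrapper only relabels observations and rewards, any off-policy learner that accepts an (observation, goal) pair could in principle sit on top of it, which is the sense in which such a scheme is independent of the learning algorithm.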

machine learning, reinforcement learning, selection, (15 more...)

2511.04598

Genre: Research Report > New Finding (0.34)

Industry: Education (0.34)

Technology: Information Technology > Artificial Intelligence > Machine Learning > Reinforcement Learning (0.90)

Finite-Time Performance Bounds and Adaptive Learning Rate Selection for Two Time-Scale Reinforcement Learning

Harsh Gupta, R. Srikant, Lei Ying

Neural Information Processing Systems, Aug-20-2025, 06:52:28 GMT

algorithm, approximation, stochastic approximation algorithm, (12 more...)

Country:

North America > United States > Illinois (0.05)
North America > United States > Michigan > Washtenaw County > Ann Arbor (0.04)
North America > United States > Massachusetts > Middlesex County > Belmont (0.04)
North America > Canada > British Columbia > Metro Vancouver Regional District > Vancouver (0.04)

Technology: Information Technology > Artificial Intelligence > Machine Learning > Reinforcement Learning (1.00)

Learning from Less: SINDy Surrogates in RL

Dixit, Aniket, Khan, Muhammad Ibrahim, Ahmed, Faizan, Brusey, James

arXiv.org Artificial Intelligence, Apr-28-2025

This paper introduces an approach for developing surrogate environments in reinforcement learning (RL) using the Sparse Identification of Nonlinear Dynamics (SINDy) algorithm. We demonstrate the effectiveness of our approach through extensive experiments in OpenAI Gym environments, particularly Mountain Car and Lunar Lander. Our results show that SINDy-based surrogate models can accurately capture the underlying dynamics of these environments while reducing computational costs by 20-35%. With only 75 interactions for Mountain Car and 1000 for Lunar Lander, we achieve state-wise correlations exceeding 0.997, with mean squared errors as low as 3.11e-06 for Mountain Car velocity and 1.42e-06 for Lunar Lander position. RL agents trained in these surrogate environments require fewer total steps (65,075 vs. 100,000 for Mountain Car and 801,000 vs. 1,000,000 for Lunar Lander) while achieving comparable performance to those trained in the original environments, exhibiting similar convergence patterns and final performance metrics. This work contributes to the field of model-based RL by providing an efficient method for generating accurate, interpretable surrogate environments.
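
To make the SINDy idea concrete, the sketch below recovers the Mountain Car velocity update by sparse regression over a small library of candidate terms. It is an illustration under assumptions, not the paper's pipeline: the candidate library, the threshold, the use of 75 randomly sampled noise-free transitions (rather than rollouts), and the helper names `solve`, `stlsq`, and `library` are choices made here. The target uses the velocity update of Gym's MountainCar-v0 before clipping.

```python
import math
import random


def solve(A, b):
    """Solve A x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    M = [row[:] + [b[i]] for i, row in enumerate(A)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(M[r][col]))
        M[col], M[pivot] = M[pivot], M[col]
        for r in range(col + 1, n):
            f = M[r][col] / M[col][col]
            for c in range(col, n + 1):
                M[r][c] -= f * M[col][c]
    x = [0.0] * n
    for i in reversed(range(n)):
        s = sum(M[i][j] * x[j] for j in range(i + 1, n))
        x[i] = (M[i][n] - s) / M[i][i]
    return x


def stlsq(rows, targets, threshold, iterations=10):
    """Sequentially thresholded least squares: fit, drop small terms, refit."""
    k = len(rows[0])
    active = list(range(k))
    coef = [0.0] * k
    for _ in range(iterations):
        if not active:
            break
        # Normal equations restricted to the currently active library terms.
        A = [[sum(r[i] * r[j] for r in rows) for j in active] for i in active]
        b = [sum(r[i] * t for r, t in zip(rows, targets)) for i in active]
        coef = [0.0] * k
        for i, c in zip(active, solve(A, b)):
            coef[i] = c
        kept = [i for i in active if abs(coef[i]) >= threshold]
        coef = [c if abs(c) >= threshold else 0.0 for c in coef]
        if kept == active:
            break
        active = kept
    return coef


NAMES = ["1", "x", "v", "(a - 1)", "cos(3x)", "sin(3x)"]


def library(x, v, a):
    """Candidate terms for the velocity change (an assumed library)."""
    return [1.0, x, v, a - 1.0, math.cos(3 * x), math.sin(3 * x)]


def true_velocity_change(x, a):
    # MountainCar-v0 velocity update before clipping: force 0.001, gravity 0.0025.
    return (a - 1) * 0.001 - 0.0025 * math.cos(3 * x)


rng = random.Random(0)
samples = [
    (rng.uniform(-1.2, 0.6), rng.uniform(-0.07, 0.07), rng.choice([0, 1, 2]))
    for _ in range(75)
]
rows = [library(x, v, a) for x, v, a in samples]
targets = [true_velocity_change(x, a) for x, v, a in samples]
coef = stlsq(rows, targets, threshold=1e-4)
model = {name: c for name, c in zip(NAMES, coef) if c != 0.0}
print(model)
```

The surviving terms form a closed-form update that can be stepped instead of the simulator, which is what makes a surrogate of this kind both cheap and interpretable. With noisy data or a library that lacks the right terms, recovery would be less clean than in this idealized setting.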

artificial intelligence, machine learning, reinforcement learning, (17 more...)

2504.18113

Genre: Research Report > New Finding (0.69)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Reinforcement Learning (0.91)

Reviews: Towards Optimal Off-Policy Evaluation for Reinforcement Learning with Marginalized Importance Sampling

Neural Information Processing Systems, Jan-23-2025, 15:22:17 GMT

Originality: The main idea of the paper, avoiding the long-horizon problem by computing IS over state distributions rather than trajectories, was already introduced in (Liu et. However, the approach the authors take to leveraging this idea is original. Additionally, there is not yet enough published work on leveraging this potentially important idea (IS over state distributions), and therefore even being the second paper in this direction is still charting new territory. Quality: To the extent I looked at it, the theoretical work is solid. I did not go over every equality in the proofs to check for algebraic errors, but I did go through every step in the proofs found in the appendix.
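
The long-horizon problem the review refers to can be illustrated with a small calculation. Trajectory-wise importance sampling multiplies one likelihood ratio per step, so the variance of the trajectory weight can grow exponentially with the horizon. The sketch below assumes, purely for illustration, that the per-step ratios are independent and identically distributed with mean 1; the function name and the example ratio values are hypothetical and not taken from the paper.

```python
def trajectory_weight_variance(ratios, probs, horizon):
    """Variance of a product of `horizon` i.i.d. per-step importance ratios.

    For independent factors, Var(prod) = E[r^2]^H - E[r]^(2H).
    """
    mean = sum(p * r for r, p in zip(ratios, probs))
    second = sum(p * r * r for r, p in zip(ratios, probs))
    return second ** horizon - mean ** (2 * horizon)


# Assumed toy case: each step's ratio is 0.5 or 1.5 with equal probability.
ratios = [0.5, 1.5]
probs = [0.5, 0.5]
for horizon in (1, 10, 100):
    print(horizon, trajectory_weight_variance(ratios, probs, horizon))
```

A weight defined on the state distribution is a single ratio per visited state rather than a product over the whole history, which is why marginalizing is a way around this growth.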

marginalized importance sampling, state distribution, trajectory, (11 more...)

Technology: Information Technology > Artificial Intelligence > Machine Learning > Reinforcement Learning (0.40)

Symbolic State Partitioning for Reinforcement Learning

Ghaffari, Mohsen, Varshosaz, Mahsa, Johnsen, Einar Broch, Wąsowski, Andrzej

arXiv.org Artificial Intelligence, Oct-3-2024

Tabular reinforcement learning methods cannot operate directly on continuous state spaces. One solution for this problem is to partition the state space. A good partitioning enables generalization during learning and more efficient exploitation of prior experiences. Consequently, the learning process becomes faster and produces more reliable policies. However, partitioning introduces approximation, which is particularly harmful in the presence of nonlinear relations between state components. An ideal partition should be as coarse as possible, while capturing the key structure of the state space for the given problem. This work extracts partitions from the environment dynamics by symbolic execution. We show that symbolic partitioning improves state space coverage with respect to environmental behavior and allows reinforcement learning to perform better for sparse rewards. We evaluate symbolic state space partitioning with respect to precision, scalability, learning agent performance and state space coverage for the learnt policies.
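
For context, the simplest way to partition a continuous state space is a uniform grid, sketched below for the Mountain Car state (position, velocity). This is the baseline kind of partitioning, not the symbolic-execution method the paper proposes. The bin counts and the helper name `make_partitioner` are assumptions for this example; the bounds are those of Gym's MountainCar-v0.

```python
from collections import defaultdict


def make_partitioner(lows, highs, bins):
    """Return a function mapping a continuous state to a tuple of grid indices."""

    def to_cell(state):
        cell = []
        for value, lo, hi, n in zip(state, lows, highs, bins):
            frac = (value - lo) / (hi - lo)
            # Clamp so that out-of-range values and the upper bound stay in the grid.
            cell.append(min(n - 1, max(0, int(frac * n))))
        return tuple(cell)

    return to_cell


# Mountain Car bounds: position in [-1.2, 0.6], velocity in [-0.07, 0.07].
to_cell = make_partitioner(lows=(-1.2, -0.07), highs=(0.6, 0.07), bins=(18, 14))

# A tabular Q-function indexed by (cell, action); three discrete actions.
q_table = defaultdict(lambda: [0.0, 0.0, 0.0])
cell = to_cell((-0.25, 0.012))
q_table[cell][2] += 1.0
print(cell, q_table[cell])
```

A uniform grid ignores the dynamics entirely, so it can split regions that behave identically and merge regions that do not. That is the kind of approximation error a dynamics-aware partition aims to reduce.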

partition, path condition, state space, (15 more...)

2409.16791

Country:

Europe > United Kingdom > England > Cambridgeshire > Cambridge (0.28)
Europe > Denmark > Capital Region > Copenhagen (0.04)
Europe > Norway > Eastern Norway > Oslo (0.04)

Genre: Research Report (0.50)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Reinforcement Learning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Learning Graphical Models > Undirected Networks > Markov Models (0.46)

Empirical Design in Reinforcement Learning

Patterson, Andrew, Neumann, Samuel, White, Martha, White, Adam

arXiv.org Artificial Intelligence, Apr-3-2023

Empirical design in reinforcement learning is no small task. Running good experiments requires attention to detail and at times significant computational resources. While compute resources available per dollar have continued to grow rapidly, so has the scale of typical experiments in reinforcement learning. It is now common to benchmark agents with millions of parameters against dozens of tasks, each using the equivalent of 30 days of experience. The scale of these experiments often conflicts with the need for proper statistical evidence, especially when comparing algorithms. Recent studies have highlighted how popular algorithms are sensitive to hyper-parameter settings and implementation details, and that common empirical practice leads to weak statistical evidence (Machado et al., 2018; Henderson et al., 2018). Here we take this one step further. This manuscript represents both a call to action and a comprehensive resource for how to do good experiments in reinforcement learning. In particular, we cover: the statistical assumptions underlying common performance measures, how to properly characterize performance variation and stability, hypothesis testing, special considerations for comparing multiple agents, baseline and illustrative example construction, and how to deal with hyper-parameters and experimenter bias. Throughout we highlight common mistakes found in the literature and the statistical consequences of those in example experiments. The objective of this document is to provide answers on how we can use our unprecedented compute to do good science in reinforcement learning, as well as stay alert to potential pitfalls in our empirical design.
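
One common way to characterize performance variation across runs is a bootstrap confidence interval on the mean return. The sketch below is a generic percentile bootstrap, not a procedure taken from the manuscript. The function name `bootstrap_mean_ci` is hypothetical, and the per-seed returns are made-up placeholder numbers, not experimental results.

```python
import random
import statistics


def bootstrap_mean_ci(samples, confidence=0.95, resamples=10_000, seed=0):
    """Percentile-bootstrap confidence interval for the mean of `samples`."""
    rng = random.Random(seed)
    n = len(samples)
    # Resample with replacement and record the mean of each resample.
    means = sorted(
        statistics.fmean(rng.choices(samples, k=n)) for _ in range(resamples)
    )
    tail = (1.0 - confidence) / 2.0
    lower = means[int(tail * resamples)]
    upper = means[min(resamples - 1, int((1.0 - tail) * resamples))]
    return lower, upper


# Placeholder final returns from eight hypothetical seeds of one agent.
returns = [-140.0, -152.0, -118.0, -200.0, -131.0, -160.0, -125.0, -147.0]
low, high = bootstrap_mean_ci(returns)
print(statistics.fmean(returns), low, high)
```

With only a handful of seeds the interval is wide and the bootstrap itself is unreliable, which is one reason small run counts give weak evidence when two agents are compared.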

artificial intelligence, machine learning, reinforcement learning, (15 more...)

2304.01315

Country: North America > Canada > Alberta (0.27)

Genre:

Research Report > New Finding (1.00)
Research Report > Experimental Study (1.00)

Industry:

Education (1.00)
Energy > Oil & Gas > Upstream (0.67)
Health & Medicine (0.65)
Leisure & Entertainment > Games (0.45)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Reinforcement Learning (1.00)

Self-building Neural Networks

Ferigo, Andrea, Iacca, Giovanni

arXiv.org Artificial Intelligence, Apr-3-2023

During the first part of life, the brain develops while it learns through a process called synaptogenesis. The neurons, growing and interacting with each other, create synapses. However, eventually the brain prunes those synapses. While previous work focused on learning and pruning independently, in this work we propose a biologically plausible model that, thanks to a combination of Hebbian learning and pruning, aims to simulate the synaptogenesis process. In this way, while learning how to solve the task, the agent translates its experience into a particular network structure. Namely, the network structure builds itself during the execution of the task. We call this approach Self-building Neural Network (SBNN). We compare our proposed SBNN with traditional neural networks (NNs) over three classical control tasks from OpenAI. The results show that our model performs generally better than traditional NNs. Moreover, we observe that the performance decay while increasing the pruning rate is smaller in our model than with NNs. Finally, we perform a validation test, testing the models over tasks unseen during the learning phase. In this case, the results show that SBNNs can adapt to new tasks better than the traditional NNs, especially when over 80% of the weights are pruned.
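
The two ingredients named in the abstract, Hebbian learning and pruning, can each be written in a few lines. The sketch below uses the plain Hebbian rule and magnitude-based pruning as stand-ins. The paper's actual update rule and pruning criterion may differ, and the function names and learning rate are assumptions for this example.

```python
def hebbian_update(weights, pre, post, lr=0.1):
    """Plain Hebbian rule: w[i][j] += lr * pre[i] * post[j]."""
    return [
        [w + lr * x * y for w, y in zip(row, post)]
        for row, x in zip(weights, pre)
    ]


def prune_smallest(weights, rate):
    """Zero the fraction `rate` of weights with the smallest magnitude.

    Weights tied with the cutoff magnitude are pruned too, so slightly
    more than `rate` may be removed when there are ties.
    """
    flat = sorted(abs(w) for row in weights for w in row)
    k = int(rate * len(flat))
    if k == 0:
        return [row[:] for row in weights]
    cutoff = flat[k - 1]
    return [[0.0 if abs(w) <= cutoff else w for w in row] for row in weights]


# Start from an empty 2x2 layer, strengthen co-active connections, then prune half.
weights = [[0.0, 0.0], [0.0, 0.0]]
weights = hebbian_update(weights, pre=[1.0, 0.5], post=[1.0, -0.2])
pruned = prune_smallest(weights, rate=0.5)
print(pruned)
```

Starting from zero weights, the connections that remain after pruning are those between the most strongly co-active units, which gives a rough picture of how experience could shape network structure.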

artificial intelligence, machine learning, node, (19 more...)

doi: 10.1145/3583133.3590531

2304.01086

Country:

North America > United States > New York > New York County > New York City (0.05)
Europe > United Kingdom > England > Oxfordshire > Oxford (0.04)
Europe > United Kingdom > England > Greater London > London (0.04)
(2 more...)

Genre:

Research Report > New Finding (0.68)
Research Report > Experimental Study > Negative Result (0.46)

Industry: Health & Medicine > Therapeutic Area > Neurology (1.00)

Technology: Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.67)

Efficient Exploration in Resource-Restricted Reinforcement Learning

Wang, Zhihai, Pan, Taoxing, Zhou, Qi, Wang, Jie

arXiv.org Artificial Intelligence, Dec-13-2022

In many real-world applications of reinforcement learning (RL), performing actions requires consuming certain types of resources that are non-replenishable in each episode. Typical applications include robotic control with limited energy and video games with consumable items. In tasks with non-replenishable resources, we observe that popular RL methods such as soft actor critic suffer from poor sample efficiency. The major reason is that they tend to exhaust resources fast, and thus the subsequent exploration is severely restricted due to the absence of resources. To address this challenge, we first formalize the aforementioned problem as resource-restricted reinforcement learning, and then propose a novel resource-aware exploration bonus (RAEB) to make reasonable usage of resources. An appealing feature of RAEB is that it can significantly reduce unnecessary resource-consuming trials while effectively encouraging the agent to explore unvisited states. Experiments demonstrate that the proposed RAEB significantly outperforms state-of-the-art exploration strategies in resource-restricted reinforcement learning environments, improving the sample efficiency by up to an order of magnitude.

artificial intelligence, machine learning, reinforcement learning, (15 more...)

2212.06988

Country: North America > United States (0.67)

Genre: Research Report (0.65)

Industry:

Leisure & Entertainment > Games (0.34)
Energy > Oil & Gas > Upstream (0.34)

Technology: Information Technology > Artificial Intelligence > Machine Learning > Reinforcement Learning (1.00)

Model-Based Reinforcement Learning with SINDy

Arora, Rushiv, da Silva, Bruno Castro, Moss, Eliot

arXiv.org Artificial Intelligence, Aug-30-2022

We draw on the latest advancements in the physics community to propose a novel method for discovering the governing non-linear dynamics of physical systems in reinforcement learning (RL). We establish that this method is capable of discovering the underlying dynamics using significantly fewer trajectories (as few as one rollout with ≤ 30 time steps) than state-of-the-art model learning algorithms. Further, the technique learns a model that is accurate enough to induce near-optimal policies given significantly fewer trajectories than those required by model-free algorithms. It brings the benefits of model-based RL without requiring a model to be developed in advance, for systems that have physics-based dynamics. To establish the validity and applicability of this algorithm, we conduct experiments on four classic control tasks. We found that an optimal policy trained on the discovered dynamics of the underlying system can generalize well. Further, the learned policy performs well when deployed on the actual physical system, thus bridging the model-to-real-system gap. We further compare our method to state-of-the-art model-based and model-free approaches, and show that our method requires fewer trajectories sampled on the true physical system compared to other methods. Additionally, we explored approximate dynamics models and found that they also can perform well.

equation, experiment, sindy, (15 more...)

2208.14501

Country: North America > United States > Massachusetts > Hampshire County > Amherst (0.14)

Genre: Research Report > Promising Solution (0.76)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Reinforcement Learning (0.71)
Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (0.68)
