model-free method

Reviews: Planning with Goal-Conditioned Policies

Neural Information Processing Systems

Post rebuttal: My suggestions/comments were not addressed in the rebuttal, so I keep my score as is. Others have proposed this type of two-step optimization, where one first learns a compact representation with a VAE on randomly collected samples and then uses various RL or planning methods on that representation. However, this does not work well in high-dimensional spaces, where random data collection for learning the representation space does not yield enough samples -- especially samples from the optimal policy. This work does not address that issue: it evaluates only on environments with very small state spaces, where random sampling to train the VAE is feasible. Originality: The idea of planning using TDMs over a latent representation is novel and a promising direction for goal-directed planning in high-dimensional observation spaces.
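The two-step recipe the review describes (first learn a compact representation from randomly collected samples, then plan in that space) can be sketched as follows. This is an illustrative assumption, not the paper's method: PCA (a linear autoencoder) stands in for the VAE, and a greedy latent-distance search stands in for TDM-based planning.

```python
import numpy as np

rng = np.random.default_rng(0)

# Step 1: learn a compact representation from randomly collected observations.
# PCA stands in here for the VAE; `obs` is a stand-in for random-policy rollouts.
obs = rng.normal(size=(500, 32))
mean = obs.mean(axis=0)
_, _, Vt = np.linalg.svd(obs - mean, full_matrices=False)
W = Vt[:4]                                  # project to a 4-dim latent space

def encode(x):
    return (x - mean) @ W.T

# Step 2: plan in the latent space. As a stand-in for TDM planning, greedily
# pick at each step the candidate action whose imagined next latent state is
# closest to the goal's latent code.
def greedy_latent_plan(z_start, z_goal, candidate_deltas, horizon=10):
    z, plan = z_start.copy(), []
    for _ in range(horizon):
        nxt = z + candidate_deltas          # imagined next latent states
        best = int(np.argmin(np.linalg.norm(nxt - z_goal, axis=1)))
        plan.append(best)
        z = nxt[best]
    return plan, z

z0 = encode(rng.normal(size=32))
zg = encode(rng.normal(size=32))
deltas = rng.normal(scale=0.5, size=(8, 4))  # 8 hypothetical action effects in latent space
plan, z_final = greedy_latent_plan(z0, zg, deltas)
```

The reviewer's concern maps directly onto Step 1: if the random rollouts in `obs` never visit the states an optimal policy would reach, the learned projection `W` cannot represent them, and no amount of planning in Step 2 recovers that coverage.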


Review for NeurIPS paper: Model-based Adversarial Meta-Reinforcement Learning

Neural Information Processing Systems

Additional Feedback: After reading the other reviews and the authors' rebuttal, I have increased my score to 7. The additional experiments are greatly appreciated, but more details should be provided for them: e.g., I feel that if the policy has all the necessary information and is trained with a model-free approach, it should be able to obtain comparable or better results than a model-based approach (with much worse sample complexity, of course). That said, the comparison between model-based and model-free methods is not the focus of the work, and the experiments with model-based baselines do show good results. I think the paper presents an interesting idea for improving the robustness of model-based RL methods to different reward functions. I have a few questions regarding the details of the algorithm, listed below.


On Model-Free Re-ranking for Visual Place Recognition with Deep Learned Local Features

Pivoňka, Tomáš, Přeučil, Libor

arXiv.org Artificial Intelligence

Re-ranking is the second stage of a visual place recognition task, in which the system chooses the best-matching images from a pre-selected subset of candidates. Model-free approaches compute the image-pair similarity from a spatial comparison of corresponding local visual features, eliminating the need for computationally expensive estimation of a model describing the transformation between images. The article focuses on model-free re-ranking based on standard local visual features and its applicability in long-term autonomy systems. It introduces three new model-free re-ranking methods designed primarily for deep-learned local visual features. These features exhibit high robustness to various appearance changes, a crucial property for long-term autonomy systems. All the introduced methods were employed in a new visual place recognition system together with the D2-net feature detector (Dusmanu, 2019) and experimentally tested on diverse, challenging public datasets. The obtained results are on par with current state-of-the-art methods, affirming that model-free approaches are a viable and worthwhile path for long-term visual place recognition.
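The core idea of model-free re-ranking (score an image pair from matched local features without estimating a transformation model) can be sketched as below. This is a generic illustration, not one of the paper's three methods: mutual nearest-neighbor descriptor matching, plus a crude median-displacement consistency check with an assumed pixel tolerance.

```python
import numpy as np

def mutual_nn_matches(desc_a, desc_b):
    # pairwise distances between local descriptors of two images
    d = np.linalg.norm(desc_a[:, None, :] - desc_b[None, :, :], axis=-1)
    a2b = d.argmin(axis=1)              # best match in B for each feature in A
    b2a = d.argmin(axis=0)              # best match in A for each feature in B
    return [(i, j) for i, j in enumerate(a2b) if b2a[j] == i]

def rerank_score(kpts_a, kpts_b, desc_a, desc_b):
    """Model-free similarity: count mutual matches whose keypoint displacement
    agrees with the dominant (median) displacement. No transformation between
    the images is estimated, only this cheap spatial-consistency check."""
    matches = mutual_nn_matches(desc_a, desc_b)
    if not matches:
        return 0
    disp = np.array([kpts_b[j] - kpts_a[i] for i, j in matches])
    med = np.median(disp, axis=0)
    consistent = np.linalg.norm(disp - med, axis=1) < 20.0   # tolerance is assumed
    return int(consistent.sum())

# toy usage: the same four features shifted by 5 px should all match consistently
kpts_a = np.array([[0., 0.], [10., 0.], [0., 10.], [10., 10.]])
kpts_b = kpts_a + np.array([5., 0.])
desc = np.eye(4)                        # four distinct descriptors
score = rerank_score(kpts_a, kpts_b, desc, desc)   # -> 4
```

A candidate list from the first retrieval stage would then be re-ordered by this score; the appeal, as the abstract notes, is that the per-pair cost is dominated by matching rather than by robust model fitting.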


Reviews: Deep Reinforcement Learning in a Handful of Trials using Probabilistic Dynamics Models

Neural Information Processing Systems

This paper describes a model-based reinforcement learning approach applied to four of the continuous-control MuJoCo tasks. The approach incorporates uncertainty in the forward dynamics model in two ways: by predicting a Gaussian distribution over future states rather than a single point, and by training an ensemble of models on different subsets of the agent's experience. As a controller, the authors use the cross-entropy method (CEM) to generate action sequences, which are then used to generate state trajectories with the stochastic forward dynamics model. Reward sums are computed for each action-conditional trajectory, and the first action of the sequence with the highest predicted reward is executed; this is thus a form of model-predictive control. In their experiments, the authors show that their method matches the performance of SOTA model-free approaches with many fewer environment interactions, i.e. with improved sample complexity, on 3 out of 4 tasks.
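The control loop the review summarizes (CEM over action sequences, rolled out through a stochastic ensemble, best first action executed) can be sketched on a toy problem. Everything here is an assumption for illustration: a 1-D point mass with hand-made "ensemble members" stands in for the learned probabilistic networks, and the reward simply penalizes distance from the origin.

```python
import numpy as np

rng = np.random.default_rng(0)

# Toy "ensemble" of probabilistic linear dynamics models. Members differ
# slightly, mimicking bootstrapped training on different experience subsets;
# each predicts a Gaussian over the next state (mean A*s + B*a, std sigma).
ensemble = [dict(A=1.0 + 0.01 * rng.normal(),
                 B=0.1 + 0.005 * rng.normal(),
                 sigma=0.02)
            for _ in range(5)]

def rollout_return(state, actions):
    # propagate one trajectory through a randomly chosen ensemble member,
    # sampling each next state from the predicted Gaussian
    m = ensemble[rng.integers(len(ensemble))]
    s, ret = state, 0.0
    for a in actions:
        s = m["A"] * s + m["B"] * a + m["sigma"] * rng.normal()
        ret += -s**2                    # reward: keep the state near zero
    return ret

def cem_plan(state, horizon=10, pop=64, elites=8, iters=4):
    # CEM: sample action sequences, keep the elites, refit the sampling
    # distribution, repeat; MPC executes mu[0] and then replans.
    mu, std = np.zeros(horizon), np.ones(horizon)
    for _ in range(iters):
        acts = rng.normal(mu, std, size=(pop, horizon)).clip(-1, 1)
        rets = np.array([rollout_return(state, a) for a in acts])
        elite = acts[np.argsort(rets)[-elites:]]
        mu, std = elite.mean(axis=0), elite.std(axis=0) + 1e-3
    return mu

plan = cem_plan(state=1.0)
```

Sampling a fresh ensemble member per imagined trajectory is one simple way to mix the two uncertainty sources the review describes; the paper's trajectory-sampling schemes are more refined than this sketch.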


ProSpec RL: Plan Ahead, then Execute

Liu, Liangliang, Guan, Yi, Wang, BoRan, Shen, Rujia, Lin, Yi, Kong, Chaoran, Yan, Lian, Jiang, Jingchi

arXiv.org Artificial Intelligence

Imagining the potential outcomes of actions before execution helps agents make more informed decisions, a prospective-thinking ability fundamental to human cognition. However, mainstream model-free Reinforcement Learning (RL) methods lack the ability to proactively envision future scenarios, plan, and guide their strategies. They typically rely on trial and error to adjust policy functions, aiming to maximize cumulative rewards or long-term value, even when such high-reward decisions place the environment in extremely dangerous states. To address this, we propose the Prospective (ProSpec) RL method, which makes higher-value, lower-risk decisions by imagining n future trajectory streams. Specifically, ProSpec employs a dynamic model to predict future states (termed "imagined states") from the current state and a series of sampled actions. Furthermore, we integrate the concept of Model Predictive Control and introduce a cycle-consistency constraint that allows the agent to evaluate and select the optimal actions from these trajectories. Moreover, ProSpec uses cycle consistency to mitigate two fundamental issues in RL: it encourages state reversibility to avoid irreversible events (low risk), and it augments actions to generate numerous virtual trajectories, improving data efficiency. We validated the effectiveness of our method on the DMControl benchmarks, where our approach achieved significant performance improvements. Code will be open-sourced upon acceptance.
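The abstract's imagine-score-select loop with a cycle-consistency penalty can be sketched as follows. This is a schematic reading of the abstract, not the authors' implementation: the forward/backward dynamics models are hypothetical fixed linear maps (in ProSpec they would be learned networks), and the task reward is an assumed placeholder.

```python
import numpy as np

rng = np.random.default_rng(0)

def forward_model(s, a, W_f):
    # predicts the "imagined" next state from the current state and an action
    return np.tanh(np.concatenate([s, a]) @ W_f)

def backward_model(s_next, a, W_b):
    # tries to recover the previous state from the imagined next state
    return np.tanh(np.concatenate([s_next, a]) @ W_b)

def cycle_consistency_loss(s, a, W_f, W_b):
    """Penalize transitions that cannot be mapped back after an imagined step,
    a proxy for avoiding irreversible (high-risk) events."""
    s_hat = backward_model(forward_model(s, a, W_f), a, W_b)
    return float(((s - s_hat) ** 2).mean())

def imagine_and_select(s, candidate_actions, W_f, W_b, lam=1.0):
    """Score each sampled action by imagined reward minus a cycle-consistency
    penalty, then return the best one (MPC-style: execute it and replan)."""
    scores = []
    for a in candidate_actions:
        s_next = forward_model(s, a, W_f)
        reward = -float((s_next ** 2).sum())        # assumed placeholder reward
        scores.append(reward - lam * cycle_consistency_loss(s, a, W_f, W_b))
    return candidate_actions[int(np.argmax(scores))]

s_dim, a_dim = 4, 2
W_f = rng.normal(scale=0.3, size=(s_dim + a_dim, s_dim))
W_b = rng.normal(scale=0.3, size=(s_dim + a_dim, s_dim))
state = rng.normal(size=s_dim)
actions = rng.normal(size=(16, a_dim))              # sampled candidate action streams
best = imagine_and_select(state, actions, W_f, W_b)
```

The same sampled actions that feed this scoring loop double as the "virtual trajectories" the abstract mentions: each imagined rollout is extra training signal that never cost a real environment step.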