AITopics | playvirtual

2a38a4a9316c49e5a833517c45d31070-Paper.pdf

Neural Information Processing SystemsFeb-8-2026, 00:12:40 GMT

learning, representation, trajectory, (13 more...)

Neural Information Processing Systems

Country: Asia > China (0.04)

Industry: Leisure & Entertainment > Games > Computer Games (0.46)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Reinforcement Learning (0.98)
Information Technology > Artificial Intelligence > Representation & Reasoning (0.95)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.46)

Add feedback

PlayVirtual: Augmenting Cycle-Consistent Virtual Trajectories for Reinforcement Learning

Neural Information Processing SystemsDec-23-2025, 22:17:01 GMT

Learning good feature representations is important for deep reinforcement learning (RL). However, with limited experience, RL often suffers from data inefficiency for training. For un-experienced or less-experienced trajectories (i.e., state-action sequences), the lack of data limits the use of them for better feature learning. In this work, we propose a novel method, dubbed PlayVirtual, which augments cycle-consistent virtual trajectories to enhance the data efficiency for RL feature representation learning. Specifically, PlayVirtual predicts future states in a latent space based on the current state and action by a dynamics model and then predicts the previous states by a backward dynamics model, which forms a trajectory cycle. Based on this, we augment the actions to generate a large amount of virtual state-action trajectories. Being free of groudtruth state supervision, we enforce a trajectory to meet the cycle consistency constraint, which can significantly enhance the data efficiency.

augmenting cycle-consistent virtual trajectory, name change, playvirtual, (5 more...)

Neural Information Processing Systems

Technology: Information Technology > Artificial Intelligence > Machine Learning > Reinforcement Learning (0.63)

Add feedback

PlayVirtual: Augmenting Cycle-Consistent Virtual Trajectories for Reinforcement Learning Tao Y u

Neural Information Processing SystemsOct-9-2025, 14:35:50 GMT

Specifically, PlayVirtual predicts future states in a latent space based on the current state and action by a dynamics model and then predicts the previous states by a backward dynamics model, which forms a trajectory cycle.

machine learning, reinforcement learning, trajectory, (15 more...)

Neural Information Processing Systems

Country: Asia > China (0.04)

Industry: Leisure & Entertainment > Games > Computer Games (0.46)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Reinforcement Learning (1.00)
Information Technology > Artificial Intelligence > Representation & Reasoning (0.95)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.46)

Add feedback

PlayVirtual: Augmenting Cycle-Consistent Virtual Trajectories for Reinforcement Learning

Neural Information Processing SystemsOct-9-2024, 21:25:35 GMT

Learning good feature representations is important for deep reinforcement learning (RL). However, with limited experience, RL often suffers from data inefficiency for training. For un-experienced or less-experienced trajectories (i.e., state-action sequences), the lack of data limits the use of them for better feature learning. In this work, we propose a novel method, dubbed PlayVirtual, which augments cycle-consistent virtual trajectories to enhance the data efficiency for RL feature representation learning. Specifically, PlayVirtual predicts future states in a latent space based on the current state and action by a dynamics model and then predicts the previous states by a backward dynamics model, which forms a trajectory cycle.

augmenting cycle-consistent virtual trajectory, playvirtual, reinforcement learning, (2 more...)

Neural Information Processing Systems

Technology: Information Technology > Artificial Intelligence > Machine Learning > Reinforcement Learning (0.99)

Add feedback

Value-Consistent Representation Learning for Data-Efficient Reinforcement Learning

Yue, Yang, Kang, Bingyi, Xu, Zhongwen, Huang, Gao, Yan, Shuicheng

arXiv.org Artificial IntelligenceAug-16-2022

Deep reinforcement learning (RL) algorithms suffer severe performance degradation when the interaction data is scarce, which limits their real-world application. Recently, visual representation learning has been shown to be effective and promising for boosting sample efficiency in RL. These methods usually rely on contrastive learning and data augmentation to train a transition model for state prediction, which is different from how the model is used in RL--performing value-based planning. Accordingly, the learned representation by these visual methods may be good for recognition but not optimal for estimating state value and solving the decision problem. To address this issue, we propose a novel method, called value-consistent representation learning (VCR), to learn representations that are directly related to decision-making. More specifically, VCR trains a model to predict the future state (also referred to as the ''imagined state'') based on the current one and a sequence of actions. Instead of aligning this imagined state with a real state returned by the environment, VCR applies a $Q$-value head on both states and obtains two distributions of action values. Then a distance is computed and minimized to force the imagined state to produce a similar action value prediction as that by the real state. We develop two implementations of the above idea for the discrete and continuous action spaces respectively. We conduct experiments on Atari 100K and DeepMind Control Suite benchmarks to validate their effectiveness for improving sample efficiency. It has been demonstrated that our methods achieve new state-of-the-art performance for search-free RL algorithms.

learning, representation, vcr, (16 more...)

arXiv.org Artificial Intelligence

2206.12542

Genre: Research Report (0.70)

Industry: Leisure & Entertainment > Games (0.46)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Reinforcement Learning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.36)

Add feedback

Collaborating Authors

playvirtual

Information about AI from the News, Publications, and Conferences

Automatic Classification – Tagging and Summarization – Customizable Filtering and Analysis

2a38a4a9316c49e5a833517c45d31070-Paper.pdf

PlayVirtual: Augmenting Cycle-Consistent Virtual Trajectories for Reinforcement Learning

PlayVirtual: Augmenting Cycle-Consistent Virtual Trajectories for Reinforcement Learning Tao Y u

PlayVirtual: Augmenting Cycle-Consistent Virtual Trajectories for Reinforcement Learning

Value-Consistent Representation Learning for Data-Efficient Reinforcement Learning