AITopics | pvr

Where are we in the search for an Artificial Visual Cortex for Embodied Intelligence?

Neural Information Processing Systems | Dec-23-2025, 16:59:28 GMT

We present the largest and most comprehensive empirical study of pre-trained visual representations (PVRs) or visual 'foundation models' for Embodied AI.

artificial visual cortex, embodied intelligence, name change, (7 more...)

Neural Information Processing Systems

Industry: Health & Medicine > Therapeutic Area > Neurology (0.43)

Technology: Information Technology > Artificial Intelligence > Machine Learning (0.35)

Attentive Feature Aggregation or: How Policies Learn to Stop Worrying about Robustness and Attend to Task-Relevant Visual Cues

Tsagkas, Nikolaos, Sochopoulos, Andreas, Danier, Duolikun, Vijayakumar, Sethu, Kouris, Alexandros, Mac Aodha, Oisin, Lu, Chris Xiaoxuan

arXiv.org Artificial Intelligence | Nov-17-2025

The adoption of pre-trained visual representations (PVRs), leveraging features from large-scale vision models, has become a popular paradigm for training visuomotor policies. However, these powerful representations can encode a broad range of task-irrelevant scene information, making the resulting trained policies vulnerable to out-of-domain visual changes and distractors. In this work we address visuomotor policy feature pooling as a solution to the observed lack of robustness in perturbed scenes. We achieve this via Attentive Feature Aggregation (AFA), a lightweight, trainable pooling mechanism that learns to naturally attend to task-relevant visual cues, ignoring even semantically rich scene distractors. Through extensive experiments in both simulation and the real world, we demonstrate that policies trained with AFA significantly outperform standard pooling approaches in the presence of visual perturbations, without requiring expensive dataset augmentation or fine-tuning of the PVR. Our findings show that ignoring extraneous visual information is a crucial step towards deploying robust and generalisable visuomotor policies. Project Page: tsagkas.github.io/afa
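The abstract above does not spell out AFA's architecture, so the following is only a generic sketch of attention-based pooling over PVR patch features, not the authors' implementation. It assumes a single query vector (`query`, a hypothetical name) that would normally be learned with the policy; here it is fixed by hand to show how attention weights can favour one patch where mean pooling treats all patches equally.

```python
import math
from typing import List, Tuple

Vector = List[float]


def softmax(scores: List[float]) -> List[float]:
    """Numerically stable softmax over a list of scores."""
    m = max(scores)
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def attentive_pool(tokens: List[Vector], query: Vector) -> Tuple[Vector, List[float]]:
    """Pool N patch features (each of length D) with one query vector.

    Assumption: `query` stands in for a trainable parameter; the PVR that
    produced `tokens` is treated as frozen.
    """
    d = len(query)
    # Scaled dot-product score between the query and each patch feature.
    scores = [sum(q * t for q, t in zip(query, tok)) / math.sqrt(d) for tok in tokens]
    weights = softmax(scores)
    # Weighted sum of patch features, one output value per feature dimension.
    pooled = [sum(w * tok[j] for w, tok in zip(weights, tokens)) for j in range(d)]
    return pooled, weights


def mean_pool(tokens: List[Vector]) -> Vector:
    """Baseline: average all patch features with equal weight."""
    n = len(tokens)
    return [sum(tok[j] for tok in tokens) / n for j in range(len(tokens[0]))]


# Toy input: one "task-relevant" patch and two "distractor" patches.
tokens = [[1.0, 0.0], [0.0, 1.0], [0.0, 1.0]]
query = [4.0, 0.0]  # hand-picked stand-in for a learned query

pooled, weights = attentive_pool(tokens, query)
baseline = mean_pool(tokens)
```

With this hand-picked query most of the attention mass falls on the first patch, so the pooled feature stays close to it, while the mean-pooled baseline is dominated by the two distractor patches.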

artificial intelligence, information, machine learning, (15 more...)

arXiv.org Artificial Intelligence

2511.10762

Country: Europe (0.28)

Genre: Research Report > New Finding (0.68)

Industry: Health & Medicine > Therapeutic Area > Psychiatry/Psychology > Mental Health (0.40)

Technology:

Information Technology > Artificial Intelligence > Vision (1.00)
Information Technology > Artificial Intelligence > Robots (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks (0.68)

39b1126cdba0a37986fa14f568471ae8-Paper-Conference.pdf

Neural Information Processing Systems | Sep-26-2025, 07:47:14 GMT

artificial intelligence, machine learning, reinforcement learning, (13 more...)

Neural Information Processing Systems

Genre:

Research Report > New Finding (1.00)
Research Report > Experimental Study (1.00)

Industry:

Energy > Oil & Gas (0.67)
Education (0.46)

Technology:

Information Technology > Artificial Intelligence > Vision (1.00)
Information Technology > Artificial Intelligence > Robots (1.00)
Information Technology > Artificial Intelligence > Representation & Reasoning (1.00)
(3 more...)

Designing Latent Safety Filters using Pre-Trained Vision Models

Tabbara, Ihab, Yang, Yuxuan, Hamzeh, Ahmad, Astafyev, Maxwell, Sibai, Hussein

arXiv.org Artificial Intelligence | Sep-19-2025

Ensuring safety of vision-based control systems remains a major challenge hindering their deployment in critical settings. Safety filters have gained increased interest as effective tools for ensuring the safety of classical control systems, but their applications in vision-based control settings have so far been limited. Pre-trained vision models (PVRs) have been shown to be effective perception backbones for control in various robotics domains. In this paper, we are interested in examining their effectiveness when used for designing vision-based safety filters. We use them as backbones for classifiers defining failure sets, for Hamilton-Jacobi (HJ) reachability-based safety filters, and for latent world models. We discuss the trade-offs between training from scratch, fine-tuning, and freezing the PVRs when training the models they are backbones for. We also evaluate whether one of the PVRs is superior across all tasks, evaluate whether learned world models or Q-functions are better for switching decisions to safe policies, and discuss practical considerations for deploying these PVRs on resource-constrained devices.
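As a rough illustration of the "switching decisions to safe policies" mentioned in the abstract above, here is a minimal sketch of a value-threshold safety filter. It is not the paper's method: `filter_action`, `toy_safety_value`, and `toy_safe_policy` are hypothetical names, the latent state is a single float instead of a PVR embedding, and the safety value is a hand-written one-step margin, not a learned Hamilton-Jacobi value function or Q-function.

```python
from typing import Callable


def filter_action(
    latent: float,
    nominal_action: float,
    safety_value: Callable[[float, float], float],
    safe_policy: Callable[[float], float],
    threshold: float = 0.0,
) -> float:
    """Return the nominal action if it is judged safe, else the fallback action.

    Assumed convention: a safety value above `threshold` means the
    state-action pair is considered safe.
    """
    if safety_value(latent, nominal_action) > threshold:
        return nominal_action
    return safe_policy(latent)


def toy_safety_value(x: float, a: float) -> float:
    """Toy margin: distance of the next position x + a from the failure set |x| >= 1."""
    return 1.0 - abs(x + a)


def toy_safe_policy(x: float) -> float:
    """Toy fallback: move back toward the origin."""
    return -0.5 * x


# Far from the boundary, the nominal action passes through unchanged.
kept = filter_action(0.0, 0.3, toy_safety_value, toy_safe_policy)

# Near the boundary, the nominal action would leave the safe set, so the filter overrides it.
overridden = filter_action(0.9, 0.3, toy_safety_value, toy_safe_policy)
```

In a vision-based setting the latent would come from an image encoder and the safety value from a learned model; the switching logic itself can stay this small.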

artificial intelligence, backbone, machine learning, (15 more...)

arXiv.org Artificial Intelligence

2509.14758

Genre: Research Report > New Finding (1.00)

Technology:

Information Technology > Sensing and Signal Processing > Image Processing (1.00)
Information Technology > Artificial Intelligence > Vision (1.00)
Information Technology > Artificial Intelligence > Robots (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks (0.93)

39b1126cdba0a37986fa14f568471ae8-Paper-Conference.pdf

Neural Information Processing Systems | Aug-18-2025, 02:47:31 GMT

artificial intelligence, machine learning, reinforcement learning, (14 more...)

Neural Information Processing Systems

Country: Europe > Germany > Baden-Württemberg (0.14)

Genre:

Research Report > New Finding (1.00)
Research Report > Experimental Study (1.00)

Industry:

Energy > Oil & Gas (0.67)
Education (0.46)

Technology:

Information Technology > Artificial Intelligence > Vision (1.00)
Information Technology > Artificial Intelligence > Robots (1.00)
Information Technology > Artificial Intelligence > Representation & Reasoning (1.00)
(2 more...)

The Surprising Ineffectiveness of Pre-Trained Visual Representations for Model-Based Reinforcement Learning

Neural Information Processing Systems | May-26-2025, 21:38:41 GMT

Visual Reinforcement Learning (RL) methods often require extensive amounts of data. As opposed to model-free RL, model-based RL (MBRL) offers a potential solution with efficient data utilization through planning. Additionally, RL lacks generalization capabilities for real-world tasks. Prior work has shown that incorporating pre-trained visual representations (PVRs) enhances sample efficiency and generalization. While PVRs have been extensively studied in the context of model-free RL, their potential in MBRL remains largely unexplored.

artificial intelligence, machine learning, reinforcement learning, (7 more...)

Neural Information Processing Systems

Genre: Research Report (0.43)

Technology:

Information Technology > Data Science (0.65)
Information Technology > Artificial Intelligence > Machine Learning > Reinforcement Learning (0.65)

When Pre-trained Visual Representations Fall Short: Limitations in Visuo-Motor Robot Learning

Tsagkas, Nikolaos, Sochopoulos, Andreas, Danier, Duolikun, Lu, Chris Xiaoxuan, Mac Aodha, Oisin

arXiv.org Artificial Intelligence | Feb-5-2025

The integration of pre-trained visual representations (PVRs) into visuo-motor robot learning has emerged as a promising alternative to training visual encoders from scratch. However, PVRs face critical challenges in the context of policy learning, including temporal entanglement and an inability to generalise even in the presence of minor scene perturbations. These limitations hinder performance in tasks requiring temporal awareness and robustness to scene changes. This work identifies these shortcomings and proposes solutions to address them. First, we augment PVR features with temporal perception and a sense of task completion, effectively disentangling them in time. Second, we introduce a module that learns to selectively attend to task-relevant local features, enhancing robustness when evaluated on out-of-distribution scenes. Our experiments demonstrate significant performance improvements, particularly in PVRs trained with masking objectives, and validate the effectiveness of our enhancements in addressing PVR-specific limitations.

artificial intelligence, machine learning, pvr, (14 more...)

arXiv.org Artificial Intelligence

2502.0327

Country:

Europe > United Kingdom > England > Greater London > London (0.04)
Europe > Romania > Sud - Muntenia Development Region > Giurgiu County > Giurgiu (0.04)

Genre: Research Report > New Finding (0.68)

Technology:

Information Technology > Artificial Intelligence > Robots (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks (0.67)
