AITopics | perception module

Dynamic Visual Reasoning by Learning Differentiable Physics Models from Video and Language

Neural Information Processing Systems, Apr-24-2026, 13:08:30 GMT

In this work, we propose a unified framework, called Visual Reasoning with Differentiable Physics (VRDP), that can jointly learn visual concepts and infer physics models of objects and their interactions from videos and language. This is achieved by seamlessly integrating three components: a visual perception module, a concept learner, and a differentiable physics engine. The visual perception module parses each video frame into object-centric trajectories and represents them as latent scene representations. The concept learner grounds visual concepts (e.g., color, shape, and material) from these object-centric representations based on the language, thus providing prior knowledge for the physics engine. The differentiable physics model, implemented as an impulse-based differentiable rigid-body simulator, performs differentiable physical simulation based on the grounded concepts to infer physical properties, such as mass, restitution, and velocity, by fitting the simulated trajectories to the video observations. Consequently, these learned concepts and physical models can explain what we have seen and imagine what is about to happen in future and counterfactual scenarios.
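
The abstract above describes inferring physical properties by fitting simulated trajectories to observations. The idea can be illustrated with a toy sketch; this is not the VRDP implementation. It assumes a 1D point under gravity, a single unknown (the initial velocity), and hand-written forward-mode sensitivities in place of a real differentiable rigid-body engine. The names `simulate` and `fit_initial_velocity`, the step size, and the learning rate are hypothetical choices made for this example.

```python
def simulate(v0, steps, dt=0.1, g=9.8):
    """Roll out a 1D point under gravity.

    Returns the positions and d(position)/d(v0) at every step, so the
    rollout is differentiable with respect to the initial velocity.
    """
    x, v = 0.0, v0
    dx_dv0, dv_dv0 = 0.0, 1.0  # sensitivities of x and v with respect to v0
    xs, sens = [], []
    for _ in range(steps):
        v -= g * dt            # gravity is independent of v0, so dv_dv0 stays 1
        x += v * dt
        dx_dv0 += dv_dv0 * dt  # chain rule through the position update
        xs.append(x)
        sens.append(dx_dv0)
    return xs, sens


def fit_initial_velocity(observed, dt=0.1, lr=0.5, iters=200):
    """Gradient descent on the mean squared error between rollout and observations."""
    v0 = 0.0  # initial guess
    n = len(observed)
    for _ in range(iters):
        xs, sens = simulate(v0, n, dt)
        grad = sum(2.0 * (x - o) * s for x, o, s in zip(xs, observed, sens)) / n
        v0 -= lr * grad
    return v0


# Synthetic "observations" generated with a true initial velocity of 3.0.
observed, _ = simulate(3.0, steps=20)
estimate = fit_initial_velocity(observed)
print(round(estimate, 6))
```

In this toy setting the loss is quadratic in the unknown, so plain gradient descent recovers the velocity used to generate the observations. A full system would differentiate through collisions and estimate several properties per object at once.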

artificial intelligence, machine learning, natural language, (19 more...)

Neural Information Processing Systems

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning (1.00)
Information Technology > Artificial Intelligence > Natural Language (1.00)
Information Technology > Artificial Intelligence > Cognitive Science (1.00)
(3 more...)

Modeling Expectation Violation in Intuitive Physics with Coarse Probabilistic Object Representations

Kevin Smith, Lingjie Mei, Shunyu Yao, Jiajun Wu, Elizabeth Spelke, Josh Tenenbaum, Tomer Ullman

Neural Information Processing Systems, Feb-14-2026, 21:28:39 GMT

Neural Information Processing Systems http://nips.cc/

artificial intelligence, machine learning, object-oriented architecture, (17 more...)

Neural Information Processing Systems

Country:

North America > United States > Massachusetts > Middlesex County > Cambridge (0.04)
North America > United States > Illinois > Cook County > Chicago (0.04)
North America > Canada (0.04)

Technology:

Information Technology > Artificial Intelligence > Vision (1.00)
Information Technology > Artificial Intelligence > Machine Learning (1.00)
Information Technology > Artificial Intelligence > Representation & Reasoning > Object-Oriented Architecture (0.68)
Information Technology > Artificial Intelligence > Representation & Reasoning > Uncertainty (0.46)

6822951732be44edf818dc5a97d32ca6-Supplemental.pdf

Neural Information Processing Systems, Feb-8-2026, 17:54:58 GMT

keypoint, module, neural information processing system, (12 more...)

Neural Information Processing Systems

Country:

North America > Canada > Ontario > Toronto (0.15)
North America > United States > Massachusetts > Middlesex County > Cambridge (0.05)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Neural Networks (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (0.96)

6822951732be44edf818dc5a97d32ca6-Paper.pdf

Neural Information Processing Systems, Feb-8-2026, 17:54:51 GMT

graph, keypoint, module, (13 more...)

Neural Information Processing Systems

Country:

North America > Canada > Ontario > Toronto (0.14)
North America > United States > Massachusetts > Middlesex County > Cambridge (0.04)
Europe > United Kingdom > England > Cambridgeshire > Cambridge (0.04)
Europe > Italy > Calabria > Catanzaro Province > Catanzaro (0.04)

Genre:

Research Report > Strength High (0.46)
Research Report > Experimental Study (0.46)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning (1.00)
Information Technology > Sensing and Signal Processing > Image Processing (0.93)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks (0.70)
Information Technology > Artificial Intelligence > Cognitive Science (0.68)

164687cb815daae754d33364716e65e6-Paper-Conference.pdf

Neural Information Processing Systems, Feb-8-2026, 06:55:28 GMT

artificial intelligence, machine learning, representation, (17 more...)

Neural Information Processing Systems

Country:

North America > Puerto Rico > San Juan > San Juan (0.04)
Asia > Japan > Honshū > Chūbu > Ishikawa Prefecture > Kanazawa (0.04)

Technology:

Information Technology > Artificial Intelligence > Vision (1.00)
Information Technology > Artificial Intelligence > Robots (1.00)
Information Technology > Artificial Intelligence > Representation & Reasoning (1.00)
(2 more...)

07845cd9aefa6cde3f8926d25138a3a2-Paper.pdf

Neural Information Processing Systems, Feb-7-2026, 08:56:14 GMT

physical parameter, physics model, reasoning, (15 more...)

Neural Information Processing Systems

Country:

North America > United States > Massachusetts > Middlesex County > Cambridge (0.04)
Asia > China > Hong Kong (0.04)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning (1.00)
Information Technology > Artificial Intelligence > Natural Language (1.00)
Information Technology > Artificial Intelligence > Cognitive Science (1.00)
(3 more...)

Gait-Adaptive Perceptive Humanoid Locomotion with Real-Time Under-Base Terrain Reconstruction

Song, Haolin, Zhu, Hongbo, Yu, Tao, Liu, Yan, Yuan, Mingqi, Zhou, Wengang, Chen, Hua, Li, Houqiang

arXiv.org Artificial Intelligence, Dec-9-2025

For full-size humanoid robots, even with recent advances in reinforcement learning-based control, achieving reliable locomotion on complex terrains, such as long staircases, remains challenging. In such settings, limited perception, ambiguous terrain cues, and insufficient adaptation of gait timing can cause even a single misplaced or mistimed step to result in rapid loss of balance. We introduce a perceptive locomotion framework that merges terrain sensing, gait regulation, and whole-body control into a single reinforcement learning policy. A downward-facing depth camera mounted under the base observes the support region around the feet, and a compact U-Net reconstructs a dense egocentric height map from each frame in real time, operating at the same frequency as the control loop. The perceptual height map, together with proprioceptive observations, is processed by a unified policy that produces joint commands and a global stepping-phase signal, allowing gait timing and whole-body posture to be adapted jointly to the commanded motion and local terrain geometry. We further adopt a single-stage successive teacher-student training scheme for efficient policy learning and knowledge transfer. Experiments conducted on a 31-DoF, 1.65 m humanoid robot demonstrate robust locomotion in both simulation and real-world settings, including forward and backward stair ascent and descent, as well as crossing a 46 cm gap.

artificial intelligence, machine learning, reinforcement learning, (17 more...)

arXiv.org Artificial Intelligence

2512.07464

Country: Asia > China (0.69)

Genre: Research Report (0.40)

Industry: Education (0.67)

Technology:

Information Technology > Artificial Intelligence > Robots (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Reinforcement Learning (0.46)

Learning to See Physics via Visual De-animation

Jiajun Wu, Erika Lu, Pushmeet Kohli, Bill Freeman, Josh Tenenbaum

Neural Information Processing Systems, Nov-21-2025, 08:32:06 GMT

Neural Information Processing Systems http://nips.cc/

artificial intelligence, engine, machine learning, (17 more...)

Neural Information Processing Systems

Country:

North America > United States > California > Los Angeles County > Long Beach (0.04)
Europe > United Kingdom > England > Oxfordshire > Oxford (0.04)

Technology:

Information Technology > Graphics (1.00)
Information Technology > Artificial Intelligence > Vision (1.00)
Information Technology > Artificial Intelligence > Representation & Reasoning (0.93)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.46)

Your Ride, Your Rules: Psychology and Cognition Enabled Automated Driving Systems

Bao, Zhipeng, Li, Qianwen

arXiv.org Artificial Intelligence, Nov-17-2025

Despite rapid advances in autonomous driving technology, current autonomous vehicles (AVs) primarily respond to external traffic conditions and treat humans as passive occupants, lacking mechanisms for active adaptation and collaboration. This limitation constrains their ability to personalize driving behavior to human expectations and hinders effective navigation of ambiguous traffic scenarios that could benefit from leveraging the occupant's advanced cognitive input, resulting in increased delays and potential safety risks. This inadequacy in the long term undermines occupant trust and hinders the widespread adoption of AV technologies. This research is motivated to propose PACE-ADS (Psychology and Cognition Enabled Automated Driving Systems): a human-centered autonomy framework that enables AVs to sense, interpret, and respond to both external traffic conditions and internal occupant states. PACE-ADS is built on an agentic workflow where three foundation model agents collaborate: the Driver Agent interprets the external environment; the Psychologist Agent decodes passive psychological signals (e.g., facial expressions) and active cognitive inputs (e.g., verbal commands); and the Coordinator Agent synthesizes these inputs to generate high-level driving behavior decisions and parameters that enhance responsiveness in ambiguous scenarios and personalize the ride. PACE-ADS is designed to complement, rather than replace, conventional AV modules. It operates at the low-frequency semantic planning layer while delegating low-level, high-frequency control to the vehicle's native systems.

artificial intelligence, occupant, vehicle, (17 more...)

arXiv.org Artificial Intelligence

2506.11842

Country:

Asia > China (0.28)
North America > United States > California (0.28)

Genre: Research Report (1.00)

Industry:

Transportation > Ground > Road (1.00)
Information Technology (1.00)
Automobiles & Trucks (1.00)

Technology: Information Technology > Artificial Intelligence > Robots > Autonomous Vehicles (1.00)

ATOM-CBF: Adaptive Safe Perception-Based Control under Out-of-Distribution Measurements

Yun, Kai S., Azizan, Navid

arXiv.org Artificial Intelligence, Nov-14-2025

Ensuring the safety of real-world systems is challenging, especially when they rely on learned perception modules to infer the system state from high-dimensional sensor data. These perception modules are vulnerable to epistemic uncertainty, often failing when encountering out-of-distribution (OoD) measurements not seen during training. To address this gap, we introduce ATOM-CBF (Adaptive-To-OoD-Measurement Control Barrier Function), a novel safe control framework that explicitly computes and adapts to the epistemic uncertainty from OoD measurements, without the need for ground-truth labels or information on distribution shifts. Our approach features two key components: (1) an OoD-aware adaptive perception error margin and (2) a safety filter that integrates this adaptive error margin, enabling the filter to adjust its conservatism in real-time. We provide empirical validation in simulations, demonstrating that ATOM-CBF maintains safety for an F1Tenth vehicle with LiDAR scans and a quadruped robot with RGB images.
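
The two components named in the abstract above, an adaptive perception error margin and a safety filter that uses it, can be illustrated with a minimal sketch. This is not the ATOM-CBF algorithm; it is a textbook-style robust control barrier function filter for a 1D single integrator (state `x`, dynamics `x' = u`, safe set `x >= 0`). The linear mapping from an OoD score to a margin, the function names, and all constants are hypothetical assumptions made for this example.

```python
def error_margin(ood_score, base=0.05, gain=0.5):
    """Hypothetical margin: a bound on |x - x_measured| that grows with the OoD score."""
    return base + gain * max(0.0, ood_score)


def safety_filter(u_nominal, x_measured, margin, alpha=1.0):
    """Return the command closest to u_nominal that satisfies the robust CBF condition.

    With h(x) = x and x' = u, safety requires u >= -alpha * x. The true state
    is at least x_measured - margin, so enforcing
    u >= -alpha * (x_measured - margin) is sufficient. In 1D the usual
    quadratic program reduces to this clamp.
    """
    lower_bound = -alpha * (x_measured - margin)
    return max(u_nominal, lower_bound)


# Same measurement and nominal command, different OoD scores.
u_in = safety_filter(-1.0, 2.0, error_margin(0.0))   # nominal command passes through
u_ood = safety_filter(-1.0, 2.0, error_margin(3.0))  # filter becomes more conservative
print(u_in, u_ood)
```

As the margin grows, the filter allows less motion toward the boundary of the safe set, which is the sense in which a larger uncertainty estimate makes the controller more conservative.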

artificial intelligence, machine learning, atom-cbf, (16 more...)

arXiv.org Artificial Intelligence

2511.08741

Country: North America > United States > Massachusetts > Middlesex County > Cambridge (0.28)

Genre: Research Report (0.82)

Technology:

Information Technology > Data Science (1.00)
Information Technology > Artificial Intelligence > Robots (1.00)
Information Technology > Artificial Intelligence > Representation & Reasoning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks (0.94)
