AITopics | visual-inertial odometry

visual-inertial odometry

Dual-Agent Reinforcement Learning for Adaptive and Cost-Aware Visual-Inertial Odometry

Pan, Feiyang, Zheng, Shenghe, Yin, Chunyan, Dou, Guangbin

arXiv.org Artificial Intelligence, Nov-27-2025

Visual-Inertial Odometry (VIO) is a critical component for robust ego-motion estimation, enabling foundational capabilities such as autonomous navigation in robotics and real-time 6-DoF tracking for augmented reality. Existing methods face a well-known trade-off: filter-based approaches are efficient but prone to drift, while optimization-based methods, though accurate, rely on computationally prohibitive Visual-Inertial Bundle Adjustment (VIBA) that is difficult to run on resource-constrained platforms. Rather than removing VIBA altogether, we aim to reduce how often and how heavily it must be invoked. To this end, we cast two key design choices in modern VIO, when to run the visual frontend and how strongly to trust its output, as sequential decision problems, and solve them with lightweight reinforcement learning (RL) agents. Our framework introduces a lightweight, dual-pronged RL policy that serves as our core contribution: (1) a Select Agent intelligently gates the entire VO pipeline based only on high-frequency IMU data; and (2) a composite Fusion Agent that first estimates a robust velocity state via a supervised network, before an RL policy adaptively fuses the full (p, v, q) state. Experiments on the EuRoC MAV and TUM-VI datasets show that, in our unified evaluation, the proposed method achieves a more favorable accuracy-efficiency-memory trade-off than prior GPU-based VO/VIO systems: it attains the best average ATE while running up to 1.77 times faster and using less GPU memory. Compared to classical optimization-based VIO systems, our approach maintains competitive trajectory accuracy while substantially reducing computational load.
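
The gating idea in this abstract (decide from IMU data alone whether the visual pipeline should run for a given frame) can be illustrated with a minimal Python sketch. Everything here is an assumption made for illustration: the names `motion_energy`, `threshold_policy`, and `gate_frames` are hypothetical, and the fixed gyro threshold is only a hand-written stand-in for the learned Select Agent, whose actual inputs, architecture, and reward are not given in this listing.

```python
import math
from typing import Callable, List, Sequence, Tuple

# One IMU sample: (gx, gy, gz, ax, ay, az). The layout is an assumption.
ImuSample = Tuple[float, float, float, float, float, float]


def motion_energy(window: Sequence[ImuSample]) -> float:
    """Mean gyroscope magnitude (rad/s) over one window of IMU samples."""
    total = 0.0
    for gx, gy, gz, *_ in window:
        total += math.sqrt(gx * gx + gy * gy + gz * gz)
    return total / len(window)


def threshold_policy(window: Sequence[ImuSample], gyro_threshold: float = 0.5) -> bool:
    """Hypothetical stand-in for a learned policy: run VO only when rotation is fast."""
    return motion_energy(window) > gyro_threshold


def gate_frames(
    imu_windows: Sequence[Sequence[ImuSample]],
    policy: Callable[[Sequence[ImuSample]], bool] = threshold_policy,
) -> List[bool]:
    """Return one run/skip decision per camera frame, using IMU data only."""
    return [policy(window) for window in imu_windows]


if __name__ == "__main__":
    slow = [(0.01, 0.0, 0.0, 0.0, 0.0, 9.81)] * 4
    fast = [(1.0, 0.0, 0.0, 0.0, 0.0, 9.81)] * 4
    decisions = gate_frames([slow, fast, slow])
    print(decisions, "VO invoked on", sum(decisions), "of", len(decisions), "frames")
```

Passing the policy as a callable keeps the gating loop unchanged if the threshold rule is later replaced by a trained model.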

artificial intelligence, machine learning, reinforcement learning, (16 more...)

arXiv.org Artificial Intelligence

2511.21083

Genre: Research Report (0.50)

Technology:

Information Technology > Artificial Intelligence > Vision (1.00)
Information Technology > Artificial Intelligence > Robots (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Reinforcement Learning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks (0.93)

SP-VINS: A Hybrid Stereo Visual Inertial Navigation System based on Implicit Environmental Map

Du, Xueyu, Zhang, Lilian, Duan, Fuan, Luo, Xincan, Wang, Maosong, Wu, Wenqi, JunMao

arXiv.org Artificial Intelligence, Nov-25-2025

Filter-based visual-inertial navigation systems (VINS) have attracted mobile-robot researchers for their good balance between accuracy and efficiency, but their limited mapping quality hampers long-term high-accuracy state estimation. To this end, we first propose a novel filter-based stereo VINS that, unlike traditional simultaneous localization and mapping (SLAM) systems based on a 3D map, enforces loop closure constraints efficiently using an implicit environmental map composed of keyframes and 2D keypoints. Second, we propose a hybrid residual filter framework that combines landmark reprojection and ray constraints to construct a unified Jacobian matrix for measurement updates. Finally, considering degraded environments, we incorporate the camera-IMU extrinsic parameters into the visual description to achieve online calibration. Benchmark experiments demonstrate that the proposed SP-VINS achieves high computational efficiency while maintaining long-term high-accuracy localization performance, and is superior to existing state-of-the-art (SOTA) methods.

artificial intelligence, environmental map, estimation, (17 more...)

arXiv.org Artificial Intelligence

2511.18756

Genre: Research Report (0.82)

Industry:

Transportation (0.68)
Information Technology (0.46)

Technology: Information Technology > Artificial Intelligence > Robots (1.00)

TCB-VIO: Tightly-Coupled Focal-Plane Binary-Enhanced Visual Inertial Odometry

Lisondra, Matthew, Kim, Junseo, Shimoda, Glenn Takashi, Zareinia, Kourosh, Saeedi, Sajad

arXiv.org Artificial Intelligence, Oct-7-2025

Vision algorithms can be executed directly on the image sensor when implemented on the next-generation sensors known as focal-plane sensor-processor arrays (FPSPs), where every pixel has a processor. FPSPs greatly improve latency, reducing the problems associated with the bottleneck of data transfer from a vision sensor to a processor. FPSPs accelerate vision-based algorithms such as visual-inertial odometry (VIO). However, VIO frameworks suffer from spatial drift due to the vision-based pose estimation, whilst temporal drift arises from the inertial measurements. FPSPs circumvent the spatial drift by operating at a high frame rate to match the high-frequency output of the inertial measurements. In this paper, we present TCB-VIO, a tightly-coupled 6 degrees-of-freedom VIO based on a Multi-State Constraint Kalman Filter (MSCKF), operating at a high frame rate of 250 FPS with IMU measurements obtained at 400 Hz. TCB-VIO outperforms state-of-the-art methods: ROVIO, VINS-Mono, and ORB-SLAM3.

artificial intelligence, machine learning, trajectory, (17 more...)

arXiv.org Artificial Intelligence

2510.03919

Country:

North America > Canada (0.28)
Asia > Japan (0.28)

Genre: Research Report (0.84)

Industry: Semiconductors & Electronics (0.68)

Technology:

Information Technology > Artificial Intelligence > Robots (1.00)
Information Technology > Artificial Intelligence > Representation & Reasoning (1.00)
Information Technology > Artificial Intelligence > Machine Learning (1.00)
(2 more...)

Statistical Uncertainty Learning for Robust Visual-Inertial State Estimation

Choi, Seungwon, Park, Donggyu, Hwang, Seo-Yeon, Kim, Tae-Wan

arXiv.org Artificial Intelligence, Oct-3-2025

A fundamental challenge in robust visual-inertial odometry (VIO) is to dynamically assess the reliability of sensor measurements. This assessment is crucial for properly weighting the contribution of each measurement to the state estimate. Conventional methods often simplify this by assuming a static, uniform uncertainty for all measurements. This heuristic, however, may be limited in its ability to capture the dynamic error characteristics inherent in real-world data. To address this limitation, we present a statistical framework that learns measurement reliability assessment online, directly from sensor data and optimization results. Our approach leverages multi-view geometric consistency as a form of self-supervision. This enables the system to infer landmark uncertainty and adaptively weight visual measurements during optimization. We evaluated our method on the public EuRoC dataset, demonstrating improvements in tracking accuracy with average reductions of approximately 24% in translation error and 42% in rotation error compared to baseline methods with fixed uncertainty parameters. The resulting framework operates in real time while showing enhanced accuracy and robustness. To facilitate reproducibility and encourage further research, the source code will be made publicly available.
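
As a rough illustration of adaptive weighting (as opposed to one fixed uncertainty for every measurement), the sketch below estimates a per-landmark variance from residuals collected across views and converts it into normalized inverse-variance weights. This is a generic textbook scheme and not the statistical model of the paper; the function names, the residual dictionary layout, and the variance floor are all assumptions.

```python
from typing import Dict, List


def landmark_variances(
    residuals: Dict[int, List[float]], floor: float = 1e-6
) -> Dict[int, float]:
    """Mean squared residual per landmark over the views that observed it.

    The floor keeps a landmark with near-zero residuals from receiving an
    unbounded weight.
    """
    variances = {}
    for landmark_id, values in residuals.items():
        mean_sq = sum(r * r for r in values) / len(values)
        variances[landmark_id] = max(mean_sq, floor)
    return variances


def inverse_variance_weights(variances: Dict[int, float]) -> Dict[int, float]:
    """Weights proportional to 1/variance, normalized to sum to one."""
    raw = {landmark_id: 1.0 / v for landmark_id, v in variances.items()}
    total = sum(raw.values())
    return {landmark_id: w / total for landmark_id, w in raw.items()}


if __name__ == "__main__":
    # Landmark 1 is consistent across views; landmark 2 is noisy.
    residuals = {1: [0.5, -0.5], 2: [2.0, -2.0]}
    weights = inverse_variance_weights(landmark_variances(residuals))
    print(weights)
```

In this toy input the consistent landmark receives most of the weight, which is the qualitative behavior the abstract describes.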

artificial intelligence, machine learning, optimization problem, (18 more...)

arXiv.org Artificial Intelligence

2510.01648

Genre: Research Report (0.64)

Technology:

Information Technology > Artificial Intelligence > Vision (0.94)
Information Technology > Artificial Intelligence > Robots (0.71)
Information Technology > Artificial Intelligence > Representation & Reasoning > Optimization (0.48)
Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (0.30)

Benchmarking Egocentric Visual-Inertial SLAM at City Scale

Krishnan, Anusha, Liu, Shaohui, Sarlin, Paul-Edouard, Gentilhomme, Oscar, Caruso, David, Monge, Maurizio, Newcombe, Richard, Engel, Jakob, Pollefeys, Marc

arXiv.org Artificial Intelligence, Oct-1-2025

Precise 6-DoF simultaneous localization and mapping (SLAM) from onboard sensors is critical for wearable devices capturing egocentric data, which exhibits specific challenges, such as a wider diversity of motions and viewpoints, prevalent dynamic visual content, or long sessions affected by time-varying sensor calibration. While recent progress on SLAM has been swift, academic research is still driven by benchmarks that do not reflect these challenges or do not offer sufficiently accurate ground truth poses. In this paper, we introduce a new dataset and benchmark for visual-inertial SLAM with egocentric, multi-modal data. We record hours and kilometers of trajectories through a city center with glasses-like devices equipped with various sensors. We leverage surveying tools to obtain control points as indirect pose annotations that are metric, centimeter-accurate, and available at city scale. This makes it possible to evaluate extreme trajectories that involve walking at night or traveling in a vehicle. We show that state-of-the-art systems developed by academia are not robust to these challenges and we identify components that are responsible for this. In addition, we design tracks with different levels of difficulty to ease in-depth analysis and evaluation of less mature approaches. The dataset and benchmark are available at https://www.lamaria.ethz.ch.

artificial intelligence, machine learning, sequence, (19 more...)

arXiv.org Artificial Intelligence

2509.26639

Genre: Research Report (1.00)

Industry: Information Technology (0.66)

Technology:

Information Technology > Artificial Intelligence > Vision (1.00)
Information Technology > Sensing and Signal Processing (0.93)
Information Technology > Artificial Intelligence > Machine Learning (0.93)

Observer Design for Optical Flow-Based Visual-Inertial Odometry with Almost-Global Convergence

Bouazza, Tarek, Berkane, Soulaimane, Hua, Minh-Duc, Hamel, Tarek

arXiv.org Artificial Intelligence, Sep-1-2025

This paper presents a novel cascaded observer architecture that combines optical flow and IMU measurements to perform continuous monocular visual-inertial odometry (VIO). The proposed solution estimates body-frame velocity and gravity direction simultaneously by fusing velocity direction information from optical flow measurements with gyro and accelerometer data. This fusion is achieved using a globally exponentially stable Riccati observer, which operates under persistently exciting translational motion conditions. The estimated gravity direction in the body frame is then employed, along with an optional magnetometer measurement, to design a complementary observer on $\mathbf{SO}(3)$ for attitude estimation. The resulting interconnected observer architecture is shown to be almost globally asymptotically stable. To extract the velocity direction from sparse optical flow data, a gradient descent algorithm is developed to solve a constrained minimization problem on the unit sphere. The effectiveness of the proposed algorithms is validated through simulation results.
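
To make the last step of this abstract concrete, here is a minimal sketch of gradient descent constrained to the unit sphere. It rests on assumptions that are not taken from the paper: the flow is taken to be purely translational (already derotated), so each bearing p_i, its flow vector, and the velocity direction v are coplanar; the cost is the mean of (n_i · v)^2 with n_i the normalized cross product of bearing and flow; and the step size and iteration count are arbitrary. Because the cost is even in v, this sketch recovers the direction only up to sign.

```python
import math
from typing import Sequence, Tuple

Vec = Tuple[float, float, float]


def dot(a: Vec, b: Vec) -> float:
    return a[0] * b[0] + a[1] * b[1] + a[2] * b[2]


def cross(a: Vec, b: Vec) -> Vec:
    return (a[1] * b[2] - a[2] * b[1],
            a[2] * b[0] - a[0] * b[2],
            a[0] * b[1] - a[1] * b[0])


def normalize(a: Vec) -> Vec:
    n = math.sqrt(dot(a, a))
    return (a[0] / n, a[1] / n, a[2] / n)


def velocity_direction(
    bearings: Sequence[Vec],
    flows: Sequence[Vec],
    v0: Vec = (1.0, 1.0, 1.0),
    step: float = 0.25,
    iters: int = 300,
) -> Vec:
    """Minimize mean((n_i . v)^2) over unit vectors v by projected gradient descent.

    Flows must be non-zero and not parallel to their bearings, otherwise the
    plane normal n_i is undefined.
    """
    normals = [normalize(cross(p, f)) for p, f in zip(bearings, flows)]
    v = normalize(v0)
    for _ in range(iters):
        # Euclidean gradient of the mean squared coplanarity residual.
        g = [0.0, 0.0, 0.0]
        for n in normals:
            c = 2.0 * dot(n, v) / len(normals)
            for k in range(3):
                g[k] += c * n[k]
        # Remove the radial component so the step lies in the tangent plane at v.
        radial = g[0] * v[0] + g[1] * v[1] + g[2] * v[2]
        stepped = tuple(v[k] - step * (g[k] - radial * v[k]) for k in range(3))
        # Retract back onto the unit sphere.
        v = normalize(stepped)
    return v


if __name__ == "__main__":
    true_dir = (0.0, 0.0, 1.0)
    bearings = [normalize(b) for b in
                [(1.0, 0.0, 1.0), (0.0, 1.0, 1.0), (1.0, 1.0, 1.0), (-1.0, 0.5, 1.0)]]
    # Synthetic derotated flow at unit depth: minus the tangential part of true_dir.
    flows = [tuple(-(true_dir[k] - dot(p, true_dir) * p[k]) for k in range(3))
             for p in bearings]
    print(velocity_direction(bearings, flows))
```

The normalization after each step is the simplest retraction onto the sphere; other retractions or step-size rules would serve equally well in a sketch like this.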

algorithm, artificial intelligence, machine learning, (15 more...)

arXiv.org Artificial Intelligence

2508.21163

Genre: Research Report (0.50)

Technology:

Information Technology > Artificial Intelligence > Vision (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (0.35)

Event-based Stereo Visual-Inertial Odometry with Voxel Map

Zhang, Zhaoxing, Wang, Xiaoxiang, Zhang, Chengliang, Guo, Yangyang, Yuan, Zikang, Yang, Xin

arXiv.org Artificial Intelligence, Jul-1-2025

The event camera, renowned for its high dynamic range and exceptional temporal resolution, is recognized as an important sensor for visual odometry. However, the inherent noise in event streams complicates the selection of high-quality map points, which critically determine the precision of state estimation. To address this challenge, we propose Voxel-ESVIO, an event-based stereo visual-inertial odometry system that utilizes voxel map management to efficiently select high-quality 3D points. Specifically, our methodology utilizes voxel-based point selection and voxel-aware point management to collectively optimize the selection and updating of map points on a per-voxel basis. These synergistic strategies enable the efficient retrieval of noise-resilient map points with the highest observation likelihood in current frames, thereby ensuring the state estimation accuracy. Extensive evaluations on three public benchmarks demonstrate that our Voxel-ESVIO outperforms state-of-the-art methods in both accuracy and computational efficiency.
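
Per-voxel point selection can be sketched as hashing each 3D point to an integer voxel index and keeping only the best-scoring point in each voxel. The sketch below is a generic illustration: the `score` field stands in for whatever quality or observation-likelihood measure a real system would compute, and the voxel size and function names are assumptions rather than details of Voxel-ESVIO.

```python
import math
from typing import Dict, Iterable, Tuple

Point = Tuple[float, float, float]
VoxelKey = Tuple[int, int, int]


def voxel_key(point: Point, voxel_size: float) -> VoxelKey:
    """Integer voxel index containing the point (floor handles negative coordinates)."""
    return (math.floor(point[0] / voxel_size),
            math.floor(point[1] / voxel_size),
            math.floor(point[2] / voxel_size))


def select_per_voxel(
    scored_points: Iterable[Tuple[Point, float]], voxel_size: float = 0.5
) -> Dict[VoxelKey, Tuple[Point, float]]:
    """Keep, for every occupied voxel, the single point with the highest score."""
    best: Dict[VoxelKey, Tuple[Point, float]] = {}
    for point, score in scored_points:
        key = voxel_key(point, voxel_size)
        if key not in best or score > best[key][1]:
            best[key] = (point, score)
    return best


if __name__ == "__main__":
    candidates = [
        ((0.1, 0.1, 0.1), 0.2),   # same voxel as the next point, lower score
        ((0.2, 0.3, 0.4), 0.9),
        ((1.2, 0.1, 0.1), 0.5),   # a different voxel
    ]
    for key, (point, score) in select_per_voxel(candidates).items():
        print(key, point, score)
```

Keeping one point per voxel bounds the number of map points by the number of occupied voxels, which is one way such a scheme can limit computation.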

artificial intelligence, map point, odometry, (11 more...)

arXiv.org Artificial Intelligence

2506.23078

Country: North America > United States > Minnesota (0.04)

Genre: Research Report (1.00)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning (0.68)
Information Technology > Artificial Intelligence > Robots (0.50)
Information Technology > Artificial Intelligence > Vision (0.47)
Information Technology > Sensing and Signal Processing > Image Processing (0.46)

A Novel ViDAR Device With Visual Inertial Encoder Odometry and Reinforcement Learning-Based Active SLAM Method

Xin, Zhanhua, Wang, Zhihao, Zhang, Shenghao, Chi, Wanchao, Meng, Yan, Kong, Shihan, Xiong, Yan, Zhang, Chong, Liu, Yuzhen, Yu, Junzhi

arXiv.org Artificial Intelligence, Jun-17-2025

In the field of multi-sensor fusion for simultaneous localization and mapping (SLAM), monocular cameras and IMUs are widely used to build simple and effective visual-inertial systems. However, limited research has explored the integration of motor-encoder devices to enhance SLAM performance. By incorporating such devices, it is possible to significantly improve active capability and field of view (FOV) with minimal additional cost and structural complexity. This paper proposes a novel visual-inertial-encoder tightly coupled odometry (VIEO) based on a ViDAR (Video Detection and Ranging) device. A ViDAR calibration method is introduced to ensure accurate initialization for VIEO. In addition, a platform motion decoupled active SLAM method based on deep reinforcement learning (DRL) is proposed. Experimental data demonstrate that the proposed ViDAR and the VIEO algorithm significantly increase cross-frame co-visibility relationships compared to the corresponding visual-inertial odometry (VIO) algorithm, improving state estimation accuracy. The proposed methodology offers fresh insights into both the updated platform design and the decoupled approach of active SLAM systems in complex environments. In recent years, visual odometry (VO) and visual-inertial odometry (VIO) have made significant advancements. This work was supported in part by the Beijing Natural Science Foundation under Grant 2022MQ05, in part by the CIE-Tencent Robotics X Rhino-Bird Focused Research Program under Grant 2022-07, and in part by the National Natural Science Foundation of China under Grant 62203015, Grant 62303020, Grant 62303021, and Grant 62273351. Zhanhua Xin, Zhihao Wang, Shihan Kong, Yan Xiong, and Junzhi Yu are with the State Key Laboratory for Turbulence and Complex Systems, Department of Advanced Manufacturing and Robotics, College of Engineering, Peking University, Beijing 100871, China (email: xinzhan-hua@stu.pku.edu.cn;

artificial intelligence, machine learning, reinforcement learning, (19 more...)

arXiv.org Artificial Intelligence

doi: 10.1109/TII.2025.3567391

2506.131

Country: Asia > China > Beijing > Beijing (0.46)

Genre: Research Report (0.50)

Industry: Health & Medicine > Therapeutic Area (0.46)

Technology:

Information Technology > Artificial Intelligence > Robots (1.00)
Information Technology > Artificial Intelligence > Representation & Reasoning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Reinforcement Learning (1.00)

Structureless VIO

Song, Junlin, Olivares-Mendez, Miguel

arXiv.org Artificial Intelligence, Jun-17-2025

Visual odometry (VO) is typically considered a chicken-and-egg problem, as the localization and mapping modules are tightly coupled. The estimation of a visual map relies on accurate localization information. Meanwhile, localization requires precise map points to provide motion constraints. This classical design principle is naturally inherited by visual-inertial odometry (VIO). Efficient localization solutions that do not require a map have not been fully investigated. To this end, we propose a novel structureless VIO, where the visual map is removed from the odometry framework. Experimental results demonstrate that, compared to the structure-based VIO baseline, our structureless VIO not only substantially improves computational efficiency but also has advantages in accuracy.

artificial intelligence, odometry, vio, (12 more...)

arXiv.org Artificial Intelligence

2505.12337

Genre: Research Report (0.50)

Technology: Information Technology > Artificial Intelligence > Robots (0.78)

Edge-Enabled VIO with Long-Tracked Features for High-Accuracy Low-Altitude IoT Navigation

Huang, Xiaohong, Yang, Cui, Wen, Miaowen

arXiv.org Artificial Intelligence, May-13-2025

This paper presents a visual-inertial odometry (VIO) method using long-tracked features. Long-tracked features can constrain more visual frames, reducing localization drift. However, they may also lead to accumulated matching errors and drift in feature tracking. Current VIO methods adjust observation weights based on re-projection errors, yet this approach has flaws. Re-projection errors depend on estimated camera poses and map points, so increased errors might come from estimation inaccuracies, not actual feature tracking errors. This can mislead the optimization process and make long-tracked features ineffective for suppressing localization drift. Furthermore, long-tracked features constrain a larger number of frames, which poses a significant challenge to the real-time performance of the system. To tackle these issues, we propose an active decoupling mechanism for accumulated errors in long-tracked feature utilization. We introduce a visual reference frame reset strategy to eliminate accumulated tracking errors and a depth prediction strategy to leverage the long-term constraint. To ensure real-time performance, we implement three strategies for efficient system state estimation: a parallel elimination strategy based on a predefined elimination order, an inverse-depth elimination simplification strategy, and an elimination skipping strategy. Experiments on various datasets show that our method offers higher positioning accuracy with relatively short computation time, making it more suitable for edge-enabled low-altitude IoT navigation, where high-accuracy positioning and real-time operation on edge devices are required. The code will be published on GitHub.
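
The observation that a re-projection error can grow without any feature-tracking error is easy to reproduce numerically. The sketch below uses a simple pinhole model with a camera that shares the world orientation (identity rotation); the intrinsics, the function names, and the 0.1 m pose error are illustrative assumptions and not values from the paper.

```python
import math
from typing import Tuple

Vec3 = Tuple[float, float, float]
Pixel = Tuple[float, float]


def project(point_cam: Vec3, fx: float, fy: float, cx: float, cy: float) -> Pixel:
    """Pinhole projection of a point expressed in the camera frame (z > 0)."""
    x, y, z = point_cam
    return (fx * x / z + cx, fy * y / z + cy)


def reprojection_error(
    observed_px: Pixel,
    point_world: Vec3,
    cam_position: Vec3,
    fx: float = 500.0,
    fy: float = 500.0,
    cx: float = 320.0,
    cy: float = 240.0,
) -> float:
    """Pixel distance between an observation and the projected map point.

    Simplifying assumption: the camera axes are aligned with the world axes,
    so only the camera position enters the transform.
    """
    point_cam = (point_world[0] - cam_position[0],
                 point_world[1] - cam_position[1],
                 point_world[2] - cam_position[2])
    u, v = project(point_cam, fx, fy, cx, cy)
    return math.hypot(u - observed_px[0], v - observed_px[1])


if __name__ == "__main__":
    landmark = (1.0, 0.0, 5.0)
    observed = (420.0, 240.0)  # exact projection from the true pose: tracking is perfect
    print(reprojection_error(observed, landmark, (0.0, 0.0, 0.0)))  # true pose
    print(reprojection_error(observed, landmark, (0.1, 0.0, 0.0)))  # pose estimate off by 0.1 m
```

With the true pose the error is zero, while the 0.1 m position error alone produces an error of about 10 pixels, so down-weighting this observation would penalize a feature that was tracked correctly.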

artificial intelligence, long-tracked feature, real time system, (17 more...)

arXiv.org Artificial Intelligence

2505.06517

Country: Europe > Germany (0.28)

Genre: Research Report (0.64)

Technology:

Information Technology > Architecture > Real Time Systems (1.00)
Information Technology > Artificial Intelligence > Vision (0.93)
