AITopics | Wei, Yufei

Collaborating Authors

Wei, Yufei

Information about AI from the News, Publications, and Conferences

Automatic Classification – Tagging and Summarization – Customizable Filtering and Analysis

If you are looking for an answer to the question What is Artificial Intelligence? and you only have a minute, then here's the definition the Association for the Advancement of Artificial Intelligence offers on its home page: "the scientific understanding of the mechanisms underlying thought and intelligent behavior and their embodiment in machines."

However, if you are fortunate enough to have more than a minute, then please get ready to embark upon an exciting journey exploring AI (but beware, it could last a lifetime) …

BEV-DWPVO: BEV-based Differentiable Weighted Procrustes for Low Scale-drift Monocular Visual Odometry on Ground

Wei, Yufei, Lu, Sha, Lu, Wangtao, Xiong, Rong, Wang, Yue

arXiv.org Artificial IntelligenceFeb-27-2025

Monocular Visual Odometry (MVO) provides a cost-effective, real-time positioning solution for autonomous vehicles. However, MVO systems face the common issue of lacking inherent scale information from monocular cameras. Traditional methods have good interpretability but can only obtain relative scale and suffer from severe scale drift in long-distance tasks. Learning-based methods under perspective view leverage large amounts of training data to acquire prior knowledge and estimate absolute scale by predicting depth values. However, their generalization ability is limited due to the need to accurately estimate the depth of each point. In contrast, we propose a novel MVO system called BEV-DWPVO. Our approach leverages the common assumption of a ground plane, using Bird's-Eye View (BEV) feature maps to represent the environment in a grid-based structure with a unified scale. This enables us to reduce the complexity of pose estimation from 6 Degrees of Freedom (DoF) to 3-DoF. Keypoints are extracted and matched within the BEV space, followed by pose estimation through a differentiable weighted Procrustes solver. The entire system is fully differentiable, supporting end-to-end training with only pose supervision and no auxiliary tasks. We validate BEV-DWPVO on the challenging long-sequence datasets NCLT, Oxford, and KITTI, achieving superior results over existing MVO methods on most evaluation metrics.

artificial intelligence, estimation, machine learning, (16 more...)

arXiv.org Artificial Intelligence

2502.20078

Country:

Europe > Germany (0.14)
Asia > China (0.14)

Genre: Research Report (0.82)

Technology:

Information Technology > Artificial Intelligence > Vision (1.00)
Information Technology > Artificial Intelligence > Robots (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks (0.94)

Add feedback

Multi-cam Multi-map Visual Inertial Localization: System, Validation and Dataset

Han, Fuzhang, Wei, Yufei, Jiao, Yanmei, Zhang, Zhuqing, Pan, Yiyuan, Huang, Wenjun, Tang, Li, Yin, Huan, Ding, Xiaqing, Xiong, Rong, Wang, Yue

arXiv.org Artificial IntelligenceDec-5-2024

Map-based localization is crucial for the autonomous movement of robots as it provides real-time positional feedback. However, existing VINS and SLAM systems cannot be directly integrated into the robot's control loop. Although VINS offers high-frequency position estimates, it suffers from drift in long-term operation. And the drift-free trajectory output by SLAM is post-processed with loop correction, which is non-causal. In practical control, it is impossible to update the current pose with future information. Furthermore, existing SLAM evaluation systems measure accuracy after aligning the entire trajectory, which overlooks the transformation error between the odometry start frame and the ground truth frame. To address these issues, we propose a multi-cam multi-map visual inertial localization system, which provides real-time, causal and drift-free position feedback to the robot control loop. Additionally, we analyze the error composition of map-based localization systems and propose a set of evaluation metric suitable for measuring causal localization performance. To validate our system, we design a multi-camera IMU hardware setup and collect a long-term challenging campus dataset. Experimental results demonstrate the higher real-time localization accuracy of the proposed system. To foster community development, both the system and the dataset have been made open source https://github.com/zoeylove/Multi-cam-Multi-map-VILO/tree/main.

algorithm, artificial intelligence, dataset, (18 more...)

arXiv.org Artificial Intelligence

2412.04287

Country: Asia > China (0.28)

Genre: Research Report > New Finding (0.65)

Technology: Information Technology > Artificial Intelligence > Robots (1.00)

Add feedback

BEV-ODOM: Reducing Scale Drift in Monocular Visual Odometry with BEV Representation

Wei, Yufei, Lu, Sha, Han, Fuzhang, Xiong, Rong, Wang, Yue

arXiv.org Artificial IntelligenceNov-15-2024

Abstract-- Monocular visual odometry (MVO) is vital in autonomous navigation and robotics, providing a cost-effective and flexible motion tracking solution, but the inherent scale ambiguity in monocular setups often leads to cumulative errors over time. In this paper, we present BEV-ODOM, a novel MVO framework leveraging the Bird's Eye View (BEV) Representation to address scale drift. Unlike existing approaches, BEV-ODOM integrates a depth-based perspective-view (PV) to BEV encoder, a correlation feature extraction neck, and a CNN-MLP-based decoder, enabling it to estimate motion across three degrees of freedom without the need for depth supervision or complex optimization techniques. Our framework reduces scale drift in long-term sequences and achieves accurate motion estimation across various datasets, including NCLT, Oxford, and KITTI. In contrast, our method achieves low scale Monocular visual odometry (MVO) has been of interest drift using only pose supervision with BEV representation.

artificial intelligence, deep learning, machine learning, (18 more...)

arXiv.org Artificial Intelligence

2411.10195

Genre: Research Report (0.82)

Technology:

Information Technology > Artificial Intelligence > Vision (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.68)

Add feedback

2nd Place Solution for IJCAI-PRICAI 2020 3D AI Challenge: 3D Object Reconstruction from A Single Image

Cao, Yichen, Wei, Yufei, Liu, Shichao, Xu, Lin

arXiv.org Artificial IntelligenceMay-27-2021

In this paper, we present our solution for the {\it IJCAI--PRICAI--20 3D AI Challenge: 3D Object Reconstruction from A Single Image}. We develop a variant of AtlasNet that consumes single 2D images and generates 3D point clouds through 2D to 3D mapping. To push the performance to the limit and present guidance on crucial implementation choices, we conduct extensive experiments to analyze the influence of decoder design and different settings on the normalization, projection, and sampling methods. Our method achieves 2nd place in the final track with a score of $70.88$, a chamfer distance of $36.87$, and a mean f-score of $59.18$. The source code of our method will be available at https://github.com/em-data/Enhanced_AtlasNet_3DReconstruction.

artificial intelligence, neural network, point cloud, (15 more...)

arXiv.org Artificial Intelligence

2105.13575

Genre: Research Report (0.50)

Technology: Information Technology > Artificial Intelligence > Machine Learning > Neural Networks (0.95)

Add feedback