Zhao, Shibo
AirIO: Learning Inertial Odometry with Enhanced IMU Feature Observability
Qiu, Yuheng, Xu, Can, Chen, Yutian, Zhao, Shibo, Geng, Junyi, Scherer, Sebastian
Inertial odometry (IO) using only Inertial Measurement Units (IMUs) offers a lightweight and cost-effective solution for Unmanned Aerial Vehicle (UAV) applications, yet existing learning-based IO models often fail to generalize to UAVs due to highly dynamic and non-linear flight patterns that differ from pedestrian motion. In this work, we identify that the conventional practice of transforming raw IMU data to global coordinates undermines the observability of critical kinematic information in UAVs. By preserving the body-frame representation, our method achieves substantial performance improvements, with a 66.7% average increase in accuracy across three datasets. Furthermore, explicitly encoding attitude information into the motion network yields an additional 23.8% improvement over prior results. Combined with a data-driven IMU correction model (AirIMU) and an uncertainty-aware Extended Kalman Filter (EKF), our approach ensures robust state estimation under aggressive UAV maneuvers without relying on external sensors or control inputs. Notably, our method also generalizes well to data unseen during training, underscoring its potential for real-world UAV applications.
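To make the coordinate-frame distinction concrete, the following minimal Python sketch contrasts the conventional world-frame transform of IMU samples with a body-frame feature vector that carries attitude as an explicit extra input. It is illustrative only; the function names and the 9-dimensional attitude encoding are assumptions, not the paper's implementation.

# Minimal sketch (not the paper's code): world-frame transform vs.
# body-frame features with attitude passed explicitly. Names are illustrative.
import numpy as np
from scipy.spatial.transform import Rotation as R

def to_world_frame(acc_body, gyro_body, q_wb):
    """Conventional practice: rotate raw IMU readings into the world frame.
    q_wb: orientation of body in world as an (x, y, z, w) quaternion."""
    rot = R.from_quat(q_wb)
    return rot.apply(acc_body), rot.apply(gyro_body)

def body_frame_features(acc_body, gyro_body, q_wb):
    """Body-frame alternative: keep raw body-frame IMU samples and feed the
    attitude as an explicit input instead of baking it into the data."""
    attitude_feat = R.from_quat(q_wb).as_matrix().reshape(-1)  # 9-dim encoding
    return np.concatenate([acc_body, gyro_body, attitude_feat])

acc = np.array([0.1, -0.2, 9.9])      # accelerometer, body frame (m/s^2)
gyr = np.array([0.01, 0.3, -0.05])    # gyroscope, body frame (rad/s)
q = R.from_euler("xyz", [0.1, 0.4, 0.0]).as_quat()
acc_w, gyr_w = to_world_frame(acc, gyr, q)        # loses body-frame structure
print(body_frame_features(acc, gyr, q).shape)     # (15,) network input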
Scalable Benchmarking and Robust Learning for Noise-Free Ego-Motion and 3D Reconstruction from Noisy Video
Xu, Xiaohao, Zhang, Tianyi, Zhao, Shibo, Li, Xiang, Wang, Sibo, Chen, Yongqi, Li, Ye, Raj, Bhiksha, Johnson-Roberson, Matthew, Scherer, Sebastian, Huang, Xiaonan
We aim to redefine robust ego-motion estimation and photorealistic 3D reconstruction by addressing a critical limitation: the reliance on noise-free data in existing models. While such sanitized conditions simplify evaluation, they fail to capture the unpredictable, noisy complexities of real-world environments. Dynamic motion, sensor imperfections, and synchronization perturbations lead to sharp performance declines when these models are deployed in practice, revealing an urgent need for frameworks that embrace and excel under real-world noise. To bridge this gap, we tackle three core challenges: scalable data generation, comprehensive benchmarking, and model robustness enhancement. First, we introduce a scalable noisy data synthesis pipeline that generates diverse datasets simulating complex motion, sensor imperfections, and synchronization errors. Second, we leverage this pipeline to create Robust-Ego3D, a benchmark rigorously designed to expose noise-induced performance degradation, highlighting the limitations of current learning-based methods in ego-motion accuracy and 3D reconstruction quality. Third, we propose Correspondence-guided Gaussian Splatting (CorrGS), a novel test-time adaptation method that progressively refines an internal clean 3D representation by aligning noisy observations with RGB-D frames rendered from the clean 3D map, enhancing geometric alignment and appearance restoration through visual correspondence. Extensive experiments on synthetic and real-world data demonstrate that CorrGS consistently outperforms prior state-of-the-art methods, particularly in scenarios involving rapid motion and dynamic illumination.
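As a rough illustration of what a noisy data synthesis pipeline can inject, the sketch below perturbs a clean RGB-D stream with depth noise, measurement dropout, and timestamp jitter. All names and noise parameters are assumptions for illustration; this is not the paper's actual pipeline.

# Illustrative perturbations for a clean RGB-D stream; parameters and
# names are assumptions, not the paper's pipeline.
import numpy as np

rng = np.random.default_rng(0)

def perturb_depth(depth, sigma=0.01, dropout=0.02):
    """Sensor imperfection: multiplicative depth noise plus random dropout."""
    noisy = depth * (1.0 + sigma * rng.standard_normal(depth.shape))
    mask = rng.random(depth.shape) < dropout
    noisy[mask] = 0.0  # simulate missing returns
    return noisy

def perturb_timestamps(stamps, jitter=0.005):
    """Synchronization perturbation: jitter each frame timestamp (seconds)."""
    return stamps + jitter * rng.standard_normal(stamps.shape)

depth = rng.uniform(0.5, 5.0, size=(480, 640)).astype(np.float32)
stamps = np.arange(0.0, 1.0, 1.0 / 30.0)
noisy = perturb_depth(depth)
print(f"dropout fraction: {(noisy == 0).mean():.3f}")
print("jittered stamps:", perturb_timestamps(stamps)[:3])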
MVCTrack: Boosting 3D Point Cloud Tracking via Multimodal-Guided Virtual Cues
Hu, Zhaofeng, Zhou, Sifan, Zhao, Shibo, Yuan, Zhihang
3D single object tracking is essential in autonomous driving and robotics. Existing methods often struggle with sparse and incomplete point clouds. To address these limitations, we propose a Multimodal-guided Virtual Cues Projection (MVCP) scheme that generates virtual cues to enrich sparse point clouds. Additionally, we introduce MVCTrack, an enhanced tracker built on the generated virtual cues. Specifically, the MVCP scheme seamlessly integrates RGB sensors into LiDAR-based systems, leveraging a set of 2D detections to create dense 3D virtual cues that significantly alleviate the sparsity of point clouds. These virtual cues integrate naturally with existing LiDAR-based 3D trackers, yielding substantial performance gains. Extensive experiments demonstrate that our method achieves competitive performance on the NuScenes dataset.
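A minimal sketch of the virtual-cue idea, under assumptions: pixels sampled inside a 2D detection box are back-projected through the camera intrinsics at an estimated depth, yielding dense 3D points that can be concatenated with the sparse LiDAR cloud. The helper name, sampling stride, and per-box depth are hypothetical, not the paper's formulation.

# Hedged sketch of lifting a 2D detection to dense 3D "virtual" points.
import numpy as np

def lift_box_to_virtual_points(box, depth_est, K, step=4):
    """box: (u1, v1, u2, v2) pixel bounds; depth_est: assumed per-box depth
    in meters; K: 3x3 camera intrinsics. Returns (N, 3) camera-frame points."""
    u1, v1, u2, v2 = box
    us, vs = np.meshgrid(np.arange(u1, u2, step), np.arange(v1, v2, step))
    pix = np.stack([us.ravel(), vs.ravel(), np.ones(us.size)])  # homogeneous
    rays = np.linalg.inv(K) @ pix            # unit-depth rays
    return (rays * depth_est).T              # scale rays to the assumed depth

K = np.array([[1266.4, 0.0, 816.3],
              [0.0, 1266.4, 491.5],
              [0.0, 0.0, 1.0]])              # nuScenes-like intrinsics
virtual = lift_box_to_virtual_points((600, 400, 700, 480), depth_est=12.0, K=K)
print(virtual.shape)                         # dense points to merge with LiDAR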
SuperLoc: The Key to Robust LiDAR-Inertial Localization Lies in Predicting Alignment Risks
Zhao, Shibo, Zhu, Honghao, Gao, Yuanjun, Kim, Beomsoo, Qiu, Yuheng, Johnson, Aaron M., Scherer, Sebastian
Map-based LiDAR localization, while widely used in autonomous systems, faces significant challenges in degraded environments due to the lack of distinct geometric features. This paper introduces SuperLoc, a robust LiDAR localization package that addresses key limitations of existing methods. SuperLoc features a novel predictive alignment risk assessment technique, enabling early detection and mitigation of potential failures before optimization. This approach significantly improves performance in challenging scenarios such as corridors, tunnels, and caves. Unlike existing degeneracy mitigation algorithms that rely on post-optimization analysis and heuristic thresholds, SuperLoc evaluates the localizability of raw sensor measurements. Experimental results demonstrate significant performance improvements over state-of-the-art methods across various degraded environments. Our approach achieves a 54% increase in accuracy and exhibits the highest robustness. To facilitate further research, we release our implementation along with datasets from eight challenging scenarios.
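One standard way to quantify localizability, shown below as a hedged sketch, is to stack the surface normals of candidate point-to-plane constraints and inspect the spectrum of the resulting 3x3 information matrix; a small eigenvalue flags a translation direction the scan cannot constrain (e.g., along a corridor). This is a common formulation from the degeneracy-analysis literature, not necessarily SuperLoc's exact risk metric.

# Hedged sketch: eigen-analysis of the translation constraint matrix.
import numpy as np

def translation_localizability(normals):
    """normals: (N, 3) unit surface normals of matched planar features.
    Returns eigenvalues (ascending) and eigenvectors of N^T N."""
    H = normals.T @ normals                  # 3x3 translation information
    w, V = np.linalg.eigh(H)
    return w, V

# Corridor-like scene: nearly all normals point sideways (walls), so the
# along-corridor direction (x) is weakly constrained.
rng = np.random.default_rng(1)
normals = rng.standard_normal((500, 3)) * np.array([0.05, 1.0, 0.3])
normals /= np.linalg.norm(normals, axis=1, keepdims=True)
w, V = translation_localizability(normals)
print("eigenvalues:", w)   # smallest eigenvalue exposes the degenerate axis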
SubT-MRS Dataset: Pushing SLAM Towards All-weather Environments
Zhao, Shibo, Gao, Yuanjun, Wu, Tianhao, Singh, Damanpreet, Jiang, Rushan, Sun, Haoxiang, Sarawata, Mansi, Qiu, Yuheng, Whittaker, Warren, Higgins, Ian, Du, Yi, Su, Shaoshu, Xu, Can, Keller, John, Karhade, Jay, Nogueira, Lucas, Saha, Sourojit, Zhang, Ji, Wang, Wenshan, Wang, Chen, Scherer, Sebastian
Simultaneous localization and mapping (SLAM) is a fundamental task for numerous applications such as autonomous navigation and exploration. Although many SLAM datasets have been released, current SLAM solutions still struggle to deliver sustained, resilient performance. One major issue is the absence of high-quality datasets covering diverse all-weather conditions, together with a reliable metric for assessing robustness. This limitation significantly restricts the scalability and generalizability of SLAM technologies, impacting their development, validation, and deployment. To address this problem, we present SubT-MRS, an extremely challenging real-world dataset designed to push SLAM towards all-weather environments in pursuit of the most robust SLAM performance. It contains multi-degraded environments spanning over 30 diverse scenes such as structureless corridors, varying lighting conditions, and perceptual obscurants like smoke and dust; multimodal sensors such as LiDAR, fisheye camera, IMU, and thermal camera; and multiple locomotion modes, including aerial, legged, and wheeled robots. We develop accuracy and robustness evaluation tracks for SLAM and introduce novel robustness metrics. Comprehensive studies are performed, revealing new observations, challenges, and opportunities for future research.
Toward General-Purpose Robots via Foundation Models: A Survey and Meta-Analysis
Hu, Yafei, Xie, Quanting, Jain, Vidhi, Francis, Jonathan, Patrikar, Jay, Keetha, Nikhil, Kim, Seungchan, Xie, Yaqi, Zhang, Tianyi, Zhao, Shibo, Chong, Yu Quan, Wang, Chen, Sycara, Katia, Johnson-Roberson, Matthew, Batra, Dhruv, Wang, Xiaolong, Scherer, Sebastian, Kira, Zsolt, Xia, Fei, Bisk, Yonatan
Building general-purpose robots that can operate seamlessly in any environment, with any object, and using various skills to complete diverse tasks has been a long-standing goal in Artificial Intelligence. Unfortunately, most existing robotic systems have been constrained: designed for specific tasks, trained on specific datasets, and deployed within specific environments. These systems usually require extensively labeled data, rely on task-specific models, exhibit numerous generalization issues when deployed in real-world scenarios, and struggle to remain robust to distribution shifts. Motivated by the impressive open-set performance and content-generation capabilities of web-scale, large-capacity pre-trained models (i.e., foundation models) in research fields such as Natural Language Processing (NLP) and Computer Vision (CV), we devote this survey to exploring (i) how these existing foundation models from NLP and CV can be applied to robotics, and (ii) what a robotics-specific foundation model would look like. We begin by providing an overview of what constitutes a conventional robotic system and the fundamental barriers to making it universally applicable. Next, we establish a taxonomy to discuss current work that leverages existing foundation models for robotics or develops ones catered to robotics. Finally, we discuss key challenges and promising future directions in using foundation models to enable general-purpose robotic systems. We encourage readers to view our living GitHub repository of resources, including papers reviewed in this survey as well as related projects and repositories for developing foundation models for robotics.
PyPose: A Library for Robot Learning with Physics-based Optimization
Wang, Chen, Gao, Dasong, Xu, Kuan, Geng, Junyi, Hu, Yaoyu, Qiu, Yuheng, Li, Bowen, Yang, Fan, Moon, Brady, Pandey, Abhinav, Aryan, Xu, Jiahe, Wu, Tianhao, He, Haonan, Huang, Daning, Ren, Zhongqiang, Zhao, Shibo, Fu, Taimeng, Reddy, Pranay, Lin, Xiao, Wang, Wenshan, Shi, Jingnan, Talak, Rajat, Cao, Kun, Du, Yi, Wang, Han, Yu, Huai, Wang, Shanzhao, Chen, Siyu, Kashyap, Ananth, Bandaru, Rohan, Dantu, Karthik, Wu, Jiajun, Xie, Lihua, Carlone, Luca, Hutter, Marco, Scherer, Sebastian
Deep learning has had remarkable success in robotic perception, but its data-centric nature struggles to generalize to ever-changing environments. By contrast, physics-based optimization generalizes better, but it does not perform as well in complicated tasks due to the lack of high-level semantic information and its reliance on manual parametric tuning. To take advantage of these two complementary worlds, we present PyPose: a robotics-oriented, PyTorch-based library that combines deep perceptual models with physics-based optimization. PyPose's architecture is tidy and well organized; it has an imperative-style interface, is efficient and user-friendly, and is easy to integrate into real-world robotic applications. In addition, it supports parallel computation of gradients of any order on Lie groups and Lie algebras, as well as $2^{\text{nd}}$-order optimizers such as trust-region methods. Experiments show that PyPose achieves more than $10\times$ speedup in computation compared to state-of-the-art libraries. To boost future research, we provide concrete examples for several fields of robot learning, including SLAM, planning, control, and inertial navigation.
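For flavor, here is a short example adapted from PyPose's published quick-start, differentiating through a Lie-group operation; the exact API may vary across library versions.

import torch
import pypose as pp

r = pp.randn_so3(2, requires_grad=True)  # batch of random so(3) LieTensors
R = r.Exp()                              # exponential map: so(3) -> SO(3)
p = R @ torch.randn(3)                   # rotate a point (broadcasts over batch)
p.sum().backward()                       # autograd through the manifold ops
print(r.grad.shape)                      # gradients live in the Lie algebra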
Kinetic Energy Plus Penalty Functions for Sparse Estimation
Zhang, Zhihua, Zhao, Shibo, Shen, Zebang, Zhou, Shuchang
In this paper we propose and study a family of sparsity-inducing penalty functions. Since these penalty functions are related to the kinetic energy in special relativity, we call them \emph{kinetic energy plus} (KEP) functions. We construct the KEP function using the concave conjugate of a $\chi^2$-distance function and present several novel insights into the KEP function with $q=1$. In particular, we derive a thresholding operator based on the KEP function and prove its mathematical and asymptotic properties in sparsity modeling. Moreover, we show that a coordinate descent algorithm is especially appropriate for the KEP function. Additionally, we discuss the relationship of KEP to the $\ell_{1/2}$ and MCP penalty functions. Theoretical and empirical analyses validate that the KEP function is effective and efficient in high-dimensional data modeling.
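Since the abstract does not give the KEP thresholding operator in closed form, the sketch below shows the coordinate descent skeleton such an operator would plug into, with soft thresholding (the $\ell_1$ case) standing in; swapping in a KEP-based operator would reuse the same loop. All names and parameters here are illustrative.

# Coordinate descent for penalized least squares; soft thresholding is a
# stand-in for the paper's KEP-based thresholding operator.
import numpy as np

def soft_threshold(z, lam):
    """l1 proximal operator; a KEP-based operator would replace this."""
    return np.sign(z) * np.maximum(np.abs(z) - lam, 0.0)

def coordinate_descent(X, y, lam, iters=100):
    n, d = X.shape
    beta = np.zeros(d)
    col_sq = (X ** 2).sum(axis=0)
    r = y - X @ beta                        # residual
    for _ in range(iters):
        for j in range(d):
            r += X[:, j] * beta[j]          # remove coordinate j's contribution
            z = X[:, j] @ r / col_sq[j]     # univariate least-squares fit
            beta[j] = soft_threshold(z, lam * n / col_sq[j])
            r -= X[:, j] * beta[j]
    return beta

rng = np.random.default_rng(0)
X = rng.standard_normal((200, 50))
beta_true = np.zeros(50); beta_true[:3] = [2.0, -1.5, 1.0]
y = X @ beta_true + 0.1 * rng.standard_normal(200)
print(np.nonzero(coordinate_descent(X, y, lam=0.1))[0])  # recovers the support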