Wang, Guangming
3D Gaussian Splatting in Robotics: A Survey
Zhu, Siting, Wang, Guangming, Kong, Xin, Kong, Dezhi, Wang, Hesheng
Dense 3D representations of the environment have been a long-term goal in the robotics field. While the earlier Neural Radiance Field (NeRF) representation has been prevalent for its implicit, coordinate-based model, the recent emergence of 3D Gaussian Splatting (3DGS) has demonstrated remarkable potential as an explicit radiance field representation. By leveraging 3D Gaussian primitives for explicit scene representation and enabling differentiable rendering, 3DGS shows significant advantages over other radiance fields in real-time rendering and photo-realistic performance, both of which are beneficial for robotic applications. In this survey, we provide a comprehensive understanding of 3DGS in the field of robotics. We divide our discussion of the related works into two main categories: the applications of 3DGS and the advancements in 3DGS techniques. In the application section, we explore how 3DGS has been utilized in various robotics tasks from the scene understanding and interaction perspectives. The section on advances in 3DGS focuses on improvements to the adaptability and efficiency of 3DGS itself, aiming to enhance its performance in robotics. We then summarize the most commonly used datasets and evaluation metrics in robotics. Finally, we identify the challenges and limitations of current 3DGS methods and discuss future directions for 3DGS in robotics.
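As background (the standard 3DGS formulation, not specific to this survey): a pixel's color is alpha-blended from the depth-sorted Gaussians that project onto it,

```latex
% Standard 3DGS compositing: color of a pixel x from N depth-sorted Gaussians
C = \sum_{i=1}^{N} c_i \, \alpha_i \prod_{j=1}^{i-1} \bigl(1 - \alpha_j\bigr),
\qquad
\alpha_i = o_i \exp\!\Bigl(-\tfrac{1}{2}\,(\mathbf{x}-\boldsymbol{\mu}_i)^{\top}\,{\Sigma'_i}^{-1}\,(\mathbf{x}-\boldsymbol{\mu}_i)\Bigr)
```

where $c_i$ and $o_i$ are the learned color and opacity of Gaussian $i$, and $\boldsymbol{\mu}_i$, $\Sigma'_i$ are its projected 2D mean and covariance. Every term is differentiable, which is what enables the gradient-based scene optimization the surveyed methods rely on.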
RL-GSBridge: 3D Gaussian Splatting Based Real2Sim2Real Method for Robotic Manipulation Learning
Wu, Yuxuan, Pan, Lei, Wu, Wenhua, Wang, Guangming, Miao, Yanzi, Wang, Hesheng
Sim-to-Real refers to the process of transferring policies learned in simulation to the real world, which is crucial for practical robotics applications. However, recent sim-to-real methods rely either on large amounts of augmented data or on large learning models, both of which are inefficient for specific tasks. In recent years, radiance field-based reconstruction methods, especially the emergence of 3D Gaussian Splatting, have made it possible to reproduce realistic real-world scenarios. To this end, we propose RL-GSBridge, a novel real-to-sim-to-real reinforcement learning framework that introduces a mesh-based 3D Gaussian Splatting method to realize zero-shot sim-to-real transfer for vision-based deep reinforcement learning. We improve the mesh-based 3D GS modeling method with soft binding constraints, enhancing the rendering quality of mesh models. We then employ a GS editing approach to synchronize rendering with the physics simulator, reflecting the interactions of the physical robot more accurately. Through a series of sim-to-real robotic arm experiments, including grasping and pick-and-place tasks, we demonstrate that RL-GSBridge maintains a satisfactory success rate in real-world task completion during sim-to-real transfer. Furthermore, a series of rendering metrics and visualization results indicates that our proposed mesh-based 3D Gaussian representation reduces artifacts on unstructured objects, demonstrating more realistic rendering performance.
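A minimal sketch of the GS-editing idea described above: when the physics engine moves an object, the Gaussians bound to it are rigidly transformed so that rendering stays in sync with the simulation. The function name, array layouts, and quaternion convention below are illustrative assumptions, not the paper's implementation.

```python
import numpy as np
from scipy.spatial.transform import Rotation as R

def sync_object_gaussians(means, quats_wxyz, T_obj):
    """Rigidly move the Gaussians bound to one object to its simulated pose.

    means:      (N, 3) Gaussian centers in the object's previous pose
    quats_wxyz: (N, 4) per-Gaussian orientations, (w, x, y, z) convention
    T_obj:      (4, 4) relative rigid transform reported by the physics engine
    """
    rot = R.from_matrix(T_obj[:3, :3])
    # transform centers: x' = R x + t
    new_means = means @ T_obj[:3, :3].T + T_obj[:3, 3]
    # scipy uses (x, y, z, w) order; convert, compose rotations, convert back
    q_xyzw = np.roll(quats_wxyz, -1, axis=1)
    new_q = (rot * R.from_quat(q_xyzw)).as_quat()
    new_quats = np.roll(new_q, 1, axis=1)
    return new_means, new_quats
```

Covariances follow the same rotation, so a renderer that stores per-Gaussian scale and orientation only needs the updated quaternions.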
SemGauss-SLAM: Dense Semantic Gaussian Splatting SLAM
Zhu, Siting, Qin, Renjie, Wang, Guangming, Liu, Jiuming, Wang, Hesheng
We propose SemGauss-SLAM, a dense semantic SLAM system utilizing 3D Gaussian representation that enables accurate 3D semantic mapping, robust camera tracking, and high-quality rendering simultaneously. In this system, we incorporate semantic feature embedding into the 3D Gaussian representation, which effectively encodes semantic information within the spatial layout of the environment for precise semantic scene representation. Furthermore, we propose a feature-level loss for updating the 3D Gaussian representation, enabling higher-level guidance for 3D Gaussian optimization. In addition, to reduce cumulative drift in tracking and improve semantic reconstruction accuracy, we introduce semantic-informed bundle adjustment, which leverages multi-frame semantic associations for joint optimization of the 3D Gaussian representation and camera poses, leading to low-drift tracking and accurate mapping. SemGauss-SLAM demonstrates superior performance over existing radiance field-based SLAM methods in terms of mapping and tracking accuracy on the Replica and ScanNet datasets, while also showing excellent capabilities in high-precision semantic segmentation and dense semantic mapping.
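One plausible form of the feature-level loss mentioned above: compare a feature map rendered from the semantic embeddings carried by the Gaussians against the 2D features of a pretrained encoder. The cosine-distance choice and all names here are assumptions for illustration; the paper's exact loss may differ.

```python
import torch
import torch.nn.functional as F

def semantic_feature_loss(rendered_feat, target_feat, valid_mask):
    """Feature-level loss between a feature map rendered from the
    semantic-embedded Gaussians and the encoder features of the frame.

    rendered_feat: (C, H, W) features alpha-blended from per-Gaussian embeddings
    target_feat:   (C, H, W) 2D features from a pretrained semantic encoder
    valid_mask:    (H, W) bool, pixels with reliable supervision
    """
    # cosine distance is robust to per-pixel differences in feature magnitude
    cos = F.cosine_similarity(rendered_feat, target_feat, dim=0)  # (H, W)
    return (1.0 - cos)[valid_mask].mean()
```

Because the rendered features come through the same differentiable splatting as color, this loss backpropagates directly into the Gaussian parameters and their semantic embeddings.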
NeRF in Robotics: A Survey
Wang, Guangming, Pan, Lei, Peng, Songyou, Liu, Shaohui, Xu, Chenfeng, Miao, Yanzi, Zhan, Wei, Tomizuka, Masayoshi, Pollefeys, Marc, Wang, Hesheng
Meticulous 3D environment representations have been a longstanding goal in the computer vision and robotics fields. The recent emergence of neural implicit representations has introduced radical innovation to this field, as implicit representations enable numerous new capabilities. Among these, the Neural Radiance Field (NeRF) has sparked a trend because of its substantial representational advantages, such as simplified mathematical models, compact environment storage, and continuous scene representations. Beyond computer vision, NeRF has also shown tremendous potential in the field of robotics. Thus, we present this survey to provide a comprehensive understanding of NeRF in the field of robotics. By exploring the advantages and limitations of NeRF, as well as its current applications and future potential, we hope to shed light on this promising area of research. Our survey is divided into two main sections: \textit{The Application of NeRF in Robotics} and \textit{The Advance of NeRF in Robotics}, organized by how NeRF enters the field of robotics. In the first section, we introduce and analyze works that have been or could be used in the field of robotics from the perception and interaction perspectives. In the second section, we show works related to improving NeRF's own properties, which are essential for deploying NeRF in the field of robotics. In the discussion section of the review, we summarize the existing challenges and provide valuable future research directions for reference.
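For context (the standard NeRF model, not specific to this survey): a pixel's color is obtained by volume rendering along the camera ray $\mathbf{r}(t) = \mathbf{o} + t\mathbf{d}$,

```latex
% NeRF volume rendering along a ray r(t) = o + t d between near/far bounds
C(\mathbf{r}) = \int_{t_n}^{t_f} T(t)\,\sigma\bigl(\mathbf{r}(t)\bigr)\,
\mathbf{c}\bigl(\mathbf{r}(t), \mathbf{d}\bigr)\,dt,
\qquad
T(t) = \exp\!\Bigl(-\int_{t_n}^{t} \sigma\bigl(\mathbf{r}(s)\bigr)\,ds\Bigr)
```

where $\sigma$ and $\mathbf{c}$ are the density and view-dependent color predicted by the coordinate-based MLP; in practice the integral is approximated by sampling points along the ray and applying quadrature.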
LHMap-loc: Cross-Modal Monocular Localization Using LiDAR Point Cloud Heat Map
Wu, Xinrui, Xu, Jianbo, Hu, Puyuan, Wang, Guangming, Wang, Hesheng
Localization using a monocular camera in a pre-built LiDAR point cloud map has drawn increasing attention in the fields of autonomous driving and mobile robotics. However, many challenges remain (e.g., the difficulty of map storage and poor localization robustness in large scenes) in implementing cross-modal localization accurately and efficiently. To solve these problems, we propose a novel pipeline termed LHMap-loc, which achieves accurate and efficient monocular localization in LiDAR maps. First, the original LiDAR point cloud map is encoded offline into heat point clouds, which compresses the size of the original LiDAR map. Then, an end-to-end online pose regression network based on optical flow estimation and spatial attention achieves real-time monocular visual localization in the pre-built map. In addition, a series of experiments demonstrates the effectiveness of the proposed method. Our code is available at: https://github.com/IRMVLab/LHMap-loc.
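A simplified skeleton of how such an online pose-regression head could look. The real LHMap-loc network builds on optical flow estimation and is considerably more involved, so every layer and name below is an illustrative assumption.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

class PoseRegressionHead(nn.Module):
    """Minimal pose-regression head: correlate image features with projected
    heat-map features, weight them with spatial attention, and regress a
    6-DoF pose (translation plus a unit quaternion)."""

    def __init__(self, feat_dim=128):
        super().__init__()
        self.attn = nn.Sequential(nn.Conv2d(feat_dim, 1, 1), nn.Sigmoid())
        self.fc = nn.Sequential(nn.Linear(feat_dim, 256), nn.ReLU(),
                                nn.Linear(256, 7))  # (tx, ty, tz, qw, qx, qy, qz)

    def forward(self, img_feat, map_feat):
        corr = img_feat * map_feat          # (B, C, H, W) correlation features
        corr = corr * self.attn(corr)       # spatial attention weighting
        pooled = corr.mean(dim=(2, 3))      # global average pool -> (B, C)
        out = self.fc(pooled)
        t, q = out[:, :3], F.normalize(out[:, 3:], dim=1)
        return t, q
```

The offline heat point clouds would supply `map_feat` after projection into the current view, so the online network never touches the full LiDAR map.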
SNI-SLAM: Semantic Neural Implicit SLAM
Zhu, Siting, Wang, Guangming, Blum, Hermann, Liu, Jiuming, Song, Liang, Pollefeys, Marc, Wang, Hesheng
We propose SNI-SLAM, a semantic SLAM system utilizing neural implicit representation that simultaneously performs accurate semantic mapping, high-quality surface reconstruction, and robust camera tracking. In this system, we introduce a hierarchical semantic representation to allow multi-level semantic comprehension for top-down structured semantic mapping of the scene. In addition, to fully utilize the correlation between multiple attributes of the environment, we integrate appearance, geometry, and semantic features through cross-attention for feature collaboration. This strategy enables a more multifaceted understanding of the environment, allowing SNI-SLAM to remain robust even when a single attribute is defective. We then design an internal fusion-based decoder to obtain semantic, RGB, and Truncated Signed Distance Field (TSDF) values from multi-level features for accurate decoding. Furthermore, we propose a feature loss to update the scene representation at the feature level. Compared with low-level losses such as RGB loss and depth loss, our feature loss guides the network optimization at a higher level. SNI-SLAM demonstrates superior performance over all recent NeRF-based SLAM methods in terms of mapping and tracking accuracy on the Replica and ScanNet datasets, while also showing excellent capabilities in accurate semantic segmentation and real-time semantic mapping.
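One way the cross-attention feature collaboration could be realized is sketched below; the module name, dimensions, and the choice of semantics as the query are assumptions for illustration, not the paper's architecture.

```python
import torch
import torch.nn as nn

class FeatureCollaboration(nn.Module):
    """Fuse appearance, geometry, and semantic features with cross-attention
    so each attribute can borrow evidence from the others."""

    def __init__(self, dim=64, heads=4):
        super().__init__()
        self.attn = nn.MultiheadAttention(dim, heads, batch_first=True)

    def forward(self, app, geo, sem):
        # each input: (B, N, dim); semantics query appearance + geometry
        context = torch.cat([app, geo], dim=1)
        fused, _ = self.attn(query=sem, key=context, value=context)
        return sem + fused  # residual keeps the original semantic signal
```

The residual connection is one simple way to realize the robustness claim: if appearance or geometry is degraded, the attention output degrades gracefully while the original semantic features pass through unchanged.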
End-to-end 2D-3D Registration between Image and LiDAR Point Cloud for Vehicle Localization
Wang, Guangming, Zheng, Yu, Guo, Yanfeng, Liu, Zhe, Zhu, Yixiang, Burgard, Wolfram, Wang, Hesheng
Robot localization using a previously built map is essential for a variety of tasks, including highly accurate navigation and mobile manipulation. A popular approach to robot localization is based on image-to-point cloud registration, which combines illumination-invariant LiDAR-based mapping with economical image-based localization. However, recent works on image-to-point cloud registration either divide the registration into separate modules or project the point cloud to a depth image and register the RGB and depth images. In this paper, we present I2PNet, a novel end-to-end 2D-3D registration network. I2PNet directly registers the raw 3D point cloud with the 2D RGB image using differentiable modules with a unified target. We propose a 2D-3D cost volume module for differentiable 2D-3D association to bridge feature extraction and pose regression. This module implicitly constructs a soft point-to-pixel correspondence on the intrinsic-independent normalized plane of the pinhole camera model. Moreover, we introduce an outlier mask prediction module to filter outliers in the 2D-3D association before pose regression. Furthermore, we propose a coarse-to-fine 2D-3D registration architecture to increase localization accuracy. We conduct extensive localization experiments on the KITTI Odometry and nuScenes datasets. The results demonstrate that I2PNet outperforms the state of the art by a large margin. In addition, I2PNet is more efficient than previous works and can perform localization in real time. Moreover, we extend I2PNet to camera-LiDAR online calibration and demonstrate that it outperforms recent approaches on this task.
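The intrinsic-independent normalized plane mentioned above is the plane $z = 1$ in the camera frame: projecting points onto it removes the dependence on camera intrinsics before any correspondence is built. A minimal sketch of that projection step (the cost-volume construction around it in I2PNet is considerably more involved):

```python
import numpy as np

def to_normalized_plane(points_cam):
    """Project 3D points in the camera frame onto the normalized image
    plane z = 1, which is independent of the camera intrinsics.

    points_cam: (N, 3) array; returns (M, 2) normalized coordinates for
    the points in front of the camera.
    """
    front = points_cam[:, 2] > 1e-6   # keep only points with positive depth
    p = points_cam[front]
    return p[:, :2] / p[:, 2:3]       # (x/z, y/z)
```

Working on this plane means the same learned association generalizes across cameras; the intrinsics are only needed afterwards to relate normalized coordinates to actual pixels.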