Collaborating Authors

 Lei, Boshu


Multimodal LLM Guided Exploration and Active Mapping using Fisher Information

arXiv.org Artificial Intelligence

We present an active mapping system that can plan for long-horizon exploration goals and short-term actions with a 3D Gaussian Splatting (3DGS) representation. Existing methods either do not take advantage of recent developments in multimodal Large Language Models (LLMs) or do not account for localization uncertainty, which is critical for embodied agents. We propose employing multimodal LLMs for long-horizon planning in conjunction with detailed motion planning using our information-based algorithm. Leveraging high-quality view synthesis from our 3DGS representation, our method employs a multimodal LLM as a zero-shot planner that selects long-horizon exploration goals from a semantic perspective. We also introduce an uncertainty-aware path proposal and selection algorithm that balances the dual objectives of maximizing information gain about the environment and minimizing the cost of localization errors. Experiments on the Gibson and Habitat-Matterport 3D datasets demonstrate that the proposed method achieves state-of-the-art results.
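The uncertainty-aware path selection described above reduces to a simple scoring rule. The following minimal Python sketch illustrates it under stated assumptions: the hypothetical info_gain and localization_risk callables stand in for the paper's Fisher-information gain and pose-uncertainty cost, and lambda_risk is an illustrative trade-off weight, not a parameter from the paper.

    from dataclasses import dataclass
    from typing import Callable, List, Tuple

    Pose = Tuple[float, float, float]  # (x, y, yaw) -- illustrative state

    @dataclass
    class Path:
        poses: List[Pose]

    def select_path(candidates: List[Path],
                    info_gain: Callable[[Pose], float],
                    localization_risk: Callable[[Path], float],
                    lambda_risk: float = 0.5) -> Path:
        # Score = expected map information gained along the path, minus a
        # weighted penalty for the localization error the path is likely
        # to incur (the dual objective described in the abstract).
        def score(path: Path) -> float:
            gain = sum(info_gain(p) for p in path.poses)
            return gain - lambda_risk * localization_risk(path)
        return max(candidates, key=score)

In the paper's setting, the candidate paths would presumably come from the proposal stage, with the selected path handed to the short-term motion planner.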


Next Best Sense: Guiding Vision and Touch with FisherRF for 3D Gaussian Splatting

arXiv.org Artificial Intelligence

We propose a framework for active next-best-view and touch selection for robotic manipulators using 3D Gaussian Splatting (3DGS). 3DGS is emerging as a useful explicit 3D scene representation for robotics, as it can represent scenes in a manner that is both photorealistic and geometrically accurate. However, in real-world, online robotic settings where efficiency requirements limit the number of views, random view selection for 3DGS becomes impractical because views are often overlapping and redundant. We address this issue with an end-to-end online training and active view selection pipeline that enhances the performance of 3DGS in few-view robotics settings. We first improve few-shot 3DGS with a novel semantic depth alignment method using Segment Anything Model 2 (SAM2), supplemented with Pearson depth and surface-normal losses, to improve color and depth reconstruction of real-world scenes. We then extend FisherRF, a next-best-view selection method for 3DGS, to select views and touch poses based on depth uncertainty, and we perform online view selection on a real robot system during live 3DGS training. We demonstrate both qualitative and quantitative improvements on challenging robot scenes. For more information, please see our project page at https://arm.stanford.edu/next-best-sense.
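Among these components, the Pearson depth loss admits a compact illustration. The sketch below shows a standard scale- and shift-invariant formulation (one minus the Pearson correlation between rendered depth and a monocular depth prior); the paper's exact formulation may differ, so treat this as an assumption-laden example rather than its implementation.

    import torch

    def pearson_depth_loss(pred: torch.Tensor, mono: torch.Tensor,
                           eps: float = 1e-8) -> torch.Tensor:
        # pred: rendered depth map; mono: monocular depth prior (same shape).
        # Center both, then compute 1 - Pearson correlation. The result is
        # invariant to scale and shift of either input.
        pred = pred.flatten() - pred.mean()
        mono = mono.flatten() - mono.mean()
        corr = (pred * mono).sum() / (pred.norm() * mono.norm() + eps)
        return 1.0 - corr

Because Pearson correlation is invariant to affine rescaling, minimizing this loss aligns the shape of the rendered depth with the prior without forcing agreement on metric scale, which monocular estimators cannot provide anyway.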


Beyond Uncertainty: Risk-Aware Active View Acquisition for Safe Robot Navigation and 3D Scene Understanding with FisherRF

arXiv.org Artificial Intelligence

This work proposes a novel approach that bolsters the robot's risk assessment and safety measures while deepening its understanding of 3D scenes, achieved by leveraging Radiance Field (RF) models and 3D Gaussian Splatting. To further enhance these capabilities, we incorporate additional views sampled from the environment into the RF model. A key contribution is the introduction of Risk-aware Environment Masking (RaEM), which prioritizes crucial information by selecting the next-best-view that maximizes the expected information gain. This targeted approach minimizes uncertainty along the robot's path and enhances the safety of its navigation. Our method offers a dual benefit: improved robot safety and increased efficiency in risk-aware 3D scene reconstruction and understanding. Extensive experiments in real-world scenarios demonstrate the effectiveness of the proposed approach, highlighting its potential as a robust, safety-focused framework for active robot exploration and 3D scene understanding.
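As a rough illustration of the masking idea, the sketch below weights a candidate view's per-pixel uncertainty by a mask over regions relevant to the robot's path and picks the view with the highest masked gain. Both render_uncertainty and risk_mask are hypothetical stand-ins; the paper's actual RaEM formulation is not reproduced here.

    import numpy as np

    def masked_info_gain(uncertainty: np.ndarray, mask: np.ndarray) -> float:
        # Expected information gain counted only over risk-relevant pixels.
        return float((uncertainty * mask).sum())

    def next_best_view(candidate_poses, render_uncertainty, risk_mask):
        # render_uncertainty(pose) -> HxW per-pixel uncertainty map
        # risk_mask(pose) -> HxW weights marking regions that matter
        #                    for the robot's planned path
        return max(candidate_poses,
                   key=lambda pose: masked_info_gain(render_uncertainty(pose),
                                                     risk_mask(pose)))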


Multi-Risk-RRT: An Efficient Motion Planning Algorithm for Robotic Autonomous Luggage Trolley Collection at Airports

arXiv.org Artificial Intelligence

Robots have become increasingly prevalent in dynamic and crowded environments such as airports and shopping malls. In these scenarios, the critical challenges for robot navigation are reliability and timely arrival at predetermined destinations. While existing risk-based motion planning algorithms effectively reduce collision risks with static and dynamic obstacles, significant performance improvements are still needed: dynamic environments demand more rapid responses and more robust planning. To address this gap, we introduce a novel risk-based multi-directional sampling algorithm, Multi-directional Risk-based Rapidly-exploring Random Tree (Multi-Risk-RRT). Unlike traditional algorithms that rely solely on a rooted tree or a pair of trees for state-space exploration, our approach incorporates multiple sub-trees. Each sub-tree independently explores its surrounding environment, while the primary rooted tree collects heuristic information from these sub-trees, facilitating rapid progress toward the goal state. Our evaluations, in both simulation and real-world environments, demonstrate that Multi-Risk-RRT outperforms existing unidirectional and bidirectional risk-based algorithms in planning efficiency and robustness.
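The multi-directional structure can be made concrete with a short sketch. In the illustrative Python below, states are (x, y) tuples, several sub-trees grow independently from random seeds, and the primary rooted tree absorbs a sub-tree once it grows close enough. The helpers sample_free and edge_ok (a combined collision and risk check) and the deliberately simplified merge are assumptions for illustration, not the paper's algorithm.

    import random

    def dist(a, b):
        return ((a[0] - b[0]) ** 2 + (a[1] - b[1]) ** 2) ** 0.5

    def steer(q_from, q_to, step):
        # Move from q_from toward q_to by at most `step`.
        d = dist(q_from, q_to)
        if d <= step:
            return q_to
        t = step / d
        return (q_from[0] + t * (q_to[0] - q_from[0]),
                q_from[1] + t * (q_to[1] - q_from[1]))

    class Tree:
        def __init__(self, root):
            self.parent = {root: None}  # node -> parent node

        def nearest(self, q):
            return min(self.parent, key=lambda n: dist(n, q))

        def extend(self, q_rand, step, edge_ok):
            q_near = self.nearest(q_rand)
            q_new = steer(q_near, q_rand, step)
            if q_new != q_near and edge_ok(q_near, q_new):
                self.parent[q_new] = q_near
                return q_new
            return None

    def multi_risk_rrt(start, goal, sample_free, edge_ok, step=0.5,
                       n_subtrees=4, iters=5000, goal_bias=0.1,
                       merge_radius=0.5):
        main = Tree(start)
        subs = [Tree(sample_free()) for _ in range(n_subtrees)]
        for _ in range(iters):
            target = goal if random.random() < goal_bias else sample_free()
            q_new = main.extend(target, step, edge_ok)
            for sub in subs:
                sub.extend(sample_free(), step, edge_ok)  # independent growth
            if q_new is None:
                continue
            # Simplified merge: once the main tree touches a sub-tree, absorb
            # its nodes as heuristic waypoints (the paper's merge is more
            # careful about how heuristic information is transferred).
            for sub in list(subs):
                contact = sub.nearest(q_new)
                if dist(q_new, contact) < merge_radius and edge_ok(q_new, contact):
                    for node, par in sub.parent.items():
                        main.parent.setdefault(node,
                                               par if par is not None else contact)
                    main.parent[contact] = q_new
                    subs.remove(sub)
            if dist(q_new, goal) <= step and edge_ok(q_new, goal):
                main.parent[goal] = q_new
                break
        return main  # follow parent pointers back from goal to extract a path

The intended benefit, as the abstract describes, is that the sub-trees pre-explore distant regions in parallel, so the rooted tree can make rapid progress by inheriting their nodes instead of rediscovering that structure itself.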