AITopics | Xu, Lingyun

Collaborating Authors

Xu, Lingyun

Information about AI from the News, Publications, and Conferences

Automatic Classification – Tagging and Summarization – Customizable Filtering and Analysis

If you are looking for an answer to the question What is Artificial Intelligence? and you only have a minute, then here's the definition the Association for the Advancement of Artificial Intelligence offers on its home page: "the scientific understanding of the mechanisms underlying thought and intelligent behavior and their embodiment in machines."

However, if you are fortunate enough to have more than a minute, then please get ready to embark upon an exciting journey exploring AI (but beware, it could last a lifetime) …

SoFar: Language-Grounded Orientation Bridges Spatial Reasoning and Object Manipulation

Qi, Zekun, Zhang, Wenyao, Ding, Yufei, Dong, Runpei, Yu, Xinqiang, Li, Jingwen, Xu, Lingyun, Li, Baoyu, He, Xialin, Fan, Guofan, Zhang, Jiazhao, He, Jiawei, Gu, Jiayuan, Jin, Xin, Ma, Kaisheng, Zhang, Zhizheng, Wang, He, Yi, Li

arXiv.org Artificial IntelligenceFeb-18-2025

Spatial intelligence is a critical component of embodied AI, promoting robots to understand and interact with their environments. While recent advances have enhanced the ability of VLMs to perceive object locations and positional relationships, they still lack the capability to precisely understand object orientations-a key requirement for tasks involving fine-grained manipulations. Addressing this limitation not only requires geometric reasoning but also an expressive and intuitive way to represent orientation. In this context, we propose that natural language offers a more flexible representation space than canonical frames, making it particularly suitable for instruction-following robotic systems. In this paper, we introduce the concept of semantic orientation, which defines object orientations using natural language in a reference-frame-free manner (e.g., the ''plug-in'' direction of a USB or the ''handle'' direction of a knife). To support this, we construct OrienText300K, a large-scale dataset of 3D models annotated with semantic orientations that link geometric understanding to functional semantics. By integrating semantic orientation into a VLM system, we enable robots to generate manipulation actions with both positional and orientational constraints. Extensive experiments in simulation and real world demonstrate that our approach significantly enhances robotic manipulation capabilities, e.g., 48.7% accuracy on Open6DOR and 74.9% accuracy on SIMPLER.

large language model, machine learning, orientation, (23 more...)

arXiv.org Artificial Intelligence

2502.13143

Country:

North America > United States (1.00)
Europe (1.00)
Asia (0.67)

Genre: Research Report (1.00)

Industry:

Leisure & Entertainment (0.67)
Consumer Products & Services > Food, Beverage, Tobacco & Cannabis (0.45)

Technology:

Information Technology > Artificial Intelligence > Robots (1.00)
Information Technology > Artificial Intelligence > Natural Language > Large Language Model (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (1.00)
(5 more...)

Add feedback

General Place Recognition Survey: Towards Real-World Autonomy

Yin, Peng, Jiao, Jianhao, Zhao, Shiqi, Xu, Lingyun, Huang, Guoquan, Choset, Howie, Scherer, Sebastian, Han, Jianda

arXiv.org Artificial IntelligenceMay-8-2024

In the realm of robotics, the quest for achieving real-world autonomy, capable of executing large-scale and long-term operations, has positioned place recognition (PR) as a cornerstone technology. Despite the PR community's remarkable strides over the past two decades, garnering attention from fields like computer vision and robotics, the development of PR methods that sufficiently support real-world robotic systems remains a challenge. This paper aims to bridge this gap by highlighting the crucial role of PR within the framework of Simultaneous Localization and Mapping (SLAM) 2.0. This new phase in robotic navigation calls for scalable, adaptable, and efficient PR solutions by integrating advanced artificial intelligence (AI) technologies. For this goal, we provide a comprehensive review of the current state-of-the-art (SOTA) advancements in PR, alongside the remaining challenges, and underscore its broad applications in robotics. This paper begins with an exploration of PR's formulation and key research challenges. We extensively review literature, focusing on related methods on place representation and solutions to various PR challenges. Applications showcasing PR's potential in robotics, key PR datasets, and open-source libraries are discussed. We also emphasizes our open-source package, aimed at new development and benchmark for general PR. We conclude with a discussion on PR's future directions, accompanied by a summary of the literature covered and access to our open-source library, available to the robotics community at: https://github.com/MetaSLAM/GPRS.

large language model, machine learning, natural language, (17 more...)

arXiv.org Artificial Intelligence

2405.04812

Country:

North America > United States > Pennsylvania > Allegheny County > Pittsburgh (0.14)
North America > United States > Delaware > New Castle County > Newark (0.14)

Genre: Overview (1.00)

Industry:

Health & Medicine (0.67)
Transportation > Ground (0.46)
Information Technology > Robotics & Automation (0.46)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (1.00)
Information Technology > Artificial Intelligence > Representation & Reasoning > Agents (0.94)
Information Technology > Artificial Intelligence > Robots > Autonomous Vehicles (0.92)
(2 more...)

Add feedback

Graph-Guided Deformation for Point Cloud Completion

Shi, Jieqi, Xu, Lingyun, Heng, Liang, Shen, Shaojie

arXiv.org Artificial IntelligenceNov-11-2021

For a long time, the point cloud completion task has been regarded as a pure generation task. After obtaining the global shape code through the encoder, a complete point cloud is generated using the shape priorly learnt by the networks. However, such models are undesirably biased towards prior average objects and inherently limited to fit geometry details. In this paper, we propose a Graph-Guided Deformation Network, which respectively regards the input data and intermediate generation as controlling and supporting points, and models the optimization guided by a graph convolutional network(GCN) for the point cloud completion task. Our key insight is to simulate the least square Laplacian deformation process via mesh deformation methods, which brings adaptivity for modeling variation in geometry details. By this means, we also reduce the gap between the completion task and the mesh deformation algorithms. As far as we know, we are the first to refine the point cloud completion task by mimicing traditional graphics algorithms with GCN-guided deformation. We have conducted extensive experiments on both the simulated indoor dataset ShapeNet, outdoor dataset KITTI, and our self-collected autonomous driving dataset Pandar40. The results show that our method outperforms the existing state-of-the-art algorithms in the 3D point cloud completion task.

artificial intelligence, machine learning, neural network, (17 more...)

arXiv.org Artificial Intelligence

2112.0184

Country: Asia > China (0.28)

Genre: Research Report > New Finding (0.66)

Industry:

Transportation (0.34)
Automobiles & Trucks (0.34)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Neural Networks (1.00)
Information Technology > Artificial Intelligence > Vision (0.71)
Information Technology > Artificial Intelligence > Robots (0.67)

Add feedback

A Multi-Domain Feature Learning Method for Visual Place Recognition

Yin, Peng, Xu, Lingyun, Li, Xueqian, Yin, Chen, Li, Yingli, Srivatsan, Rangaprasad Arun, Li, Lu, Ji, Jianmin, He, Yuqing

arXiv.org Artificial IntelligenceFeb-26-2019

Visual Place Recognition (VPR) is an important component in both computer vision and robotics applications, thanks to its ability to determine whether a place has been visited and where specifically. A major challenge in VPR is to handle changes of environmental conditions including weather, season and illumination. Most VPR methods try to improve the place recognition performance by ignoring the environmental factors, leading to decreased accuracy decreases when environmental conditions change significantly, such as day versus night. To this end, we propose an end-to-end conditional visual place recognition method. Specifically, we introduce the multi-domain feature learning method (MDFL) to capture multiple attribute-descriptions for a given place, and then use a feature detaching module to separate the environmental condition-related features from those that are not. The only label required within this feature learning pipeline is the environmental condition. Evaluation of the proposed method is conducted on the multi-season \textit{NORDLAND} dataset, and the multi-weather \textit{GTAV} dataset. Experimental results show that our method improves the feature robustness against variant environmental conditions.

artificial intelligence, dataset, neural network, (20 more...)

arXiv.org Artificial Intelligence

1902.10058

Country:

Asia > China (0.69)
North America > United States > Pennsylvania > Allegheny County > Pittsburgh (0.14)

Genre: Research Report (0.84)

Technology: Information Technology > Artificial Intelligence > Machine Learning > Neural Networks (0.94)

Add feedback