AITopics | relocalization

Collaborating Authors

relocalization

Information about AI from the News, Publications, and Conferences

Automatic Classification – Tagging and Summarization – Customizable Filtering and Analysis

If you are looking for an answer to the question What is Artificial Intelligence? and you only have a minute, then here's the definition the Association for the Advancement of Artificial Intelligence offers on its home page: "the scientific understanding of the mechanisms underlying thought and intelligent behavior and their embodiment in machines."

However, if you are fortunate enough to have more than a minute, then please get ready to embark upon an exciting journey exploring AI (but beware, it could last a lifetime) …

When and Where Localization Fails: An Analysis of the Iterative Closest Point in Evolving Environment

Dannaoui, Abdel-Raouf, Laconte, Johann, Debain, Christophe, Pomerleau, Francois, Checchin, Paul

arXiv.org Artificial IntelligenceJul-24-2025

Robust relocalization in dynamic outdoor environments remains a key challenge for autonomous systems relying on 3D lidar. While long-term localization has been widely studied, short-term environmental changes, occurring over days or weeks, remain underexplored despite their practical significance. To address this gap, we present a highresolution, short-term multi-temporal dataset collected weekly from February to April 2025 across natural and semi-urban settings. Each session includes high-density point cloud maps, 360 deg panoramic images, and trajectory data. Projected lidar scans, derived from the point cloud maps and modeled with sensor-accurate occlusions, are used to evaluate alignment accuracy against the ground truth using two Iterative Closest Point (ICP) variants: Point-to-Point and Point-to-Plane. Results show that Point-to-Plane offers significantly more stable and accurate registration, particularly in areas with sparse features or dense vegetation. This study provides a structured dataset for evaluating short-term localization robustness, a reproducible framework for analyzing scan-to-map alignment under noise, and a comparative evaluation of ICP performance in evolving outdoor environments. Our analysis underscores how local geometry and environmental variability affect localization success, offering insights for designing more resilient robotic systems.

artificial intelligence, dataset, localization, (15 more...)

arXiv.org Artificial Intelligence

2507.17531

Country:

Europe > France (0.15)
North America > Canada (0.14)

Genre:

Research Report > Experimental Study (0.48)
Research Report > New Finding (0.34)

Technology: Information Technology > Artificial Intelligence > Robots (1.00)

Add feedback

Language-Grounded Hierarchical Planning and Execution with Multi-Robot 3D Scene Graphs

Strader, Jared, Ray, Aaron, Arkin, Jacob, Peterson, Mason B., Chang, Yun, Hughes, Nathan, Bradley, Christopher, Jia, Yi Xuan, Nieto-Granda, Carlos, Talak, Rajat, Fan, Chuchu, Carlone, Luca, How, Jonathan P., Roy, Nicholas

arXiv.org Artificial IntelligenceJul-14-2025

In this paper, we introduce a multi-robot system that integrates mapping, localization, and task and motion planning (TAMP) enabled by 3D scene graphs to execute complex instructions expressed in natural language. Our system builds a shared 3D scene graph incorporating an open-set object-based map, which is leveraged for multi-robot 3D scene graph fusion. This representation supports real-time, view-invariant relocalization (via the object-based map) and planning (via the 3D scene graph), allowing a team of robots to reason about their surroundings and execute complex tasks. Additionally, we introduce a planning approach that translates operator intent into Planning Domain Definition Language (PDDL) goals using a Large Language Model (LLM) by leveraging context from the shared 3D scene graph and robot capabilities. We provide an experimental assessment of the performance of our system on real-world tasks in large-scale, outdoor environments. A supplementary video is available at https://youtu.be/8xbGGOLfLAY.

artificial intelligence, large language model, natural language, (14 more...)

arXiv.org Artificial Intelligence

2506.07454

Country: North America > United States (0.68)

Genre: Research Report (0.64)

Technology:

Information Technology > Artificial Intelligence > Robots (1.00)
Information Technology > Artificial Intelligence > Representation & Reasoning > Planning & Scheduling (1.00)
Information Technology > Artificial Intelligence > Natural Language > Large Language Model (1.00)

Add feedback

MVL-Loc: Leveraging Vision-Language Model for Generalizable Multi-Scene Camera Relocalization

Xiao, Zhendong, Wei, Wu, Ji, Shujie, Yang, Shan, Chen, Changhao

arXiv.org Artificial IntelligenceJul-8-2025

Camera relocalization, a cornerstone capability of modern computer vision, accurately determines a camera's position and orientation (6-DoF) from images and is essential for applications in augmented reality (AR), mixed reality (MR), autonomous driving, delivery drones, and robotic navigation. Unlike traditional deep learning-based methods that regress camera pose from images in a single scene, which often lack generalization and robustness in diverse environments, we propose MVL-Loc, a novel end-to-end multi-scene 6-DoF camera relocalization framework. MVL-Loc leverages pretrained world knowledge from vision-language models (VLMs) and incorporates multimodal data to generalize across both indoor and outdoor settings. Furthermore, natural language is employed as a directive tool to guide the multi-scene learning process, facilitating semantic understanding of complex scenes and capturing spatial relationships among objects. Extensive experiments on the 7Scenes and Cambridge Landmarks datasets demonstrate MVL-Loc's robustness and state-of-the-art performance in real-world multi-scene camera relocalization, with improved accuracy in both positional and orientational estimates.

machine learning, natural language, relocalization, (15 more...)

arXiv.org Artificial Intelligence

2507.04509

Country:

Asia > China > Guangdong Province > Guangzhou (0.05)
Asia > China > Hong Kong (0.04)

Genre: Research Report (0.82)

Industry: Information Technology (0.34)

Technology:

Information Technology > Artificial Intelligence > Vision (1.00)
Information Technology > Artificial Intelligence > Robots (1.00)
Information Technology > Artificial Intelligence > Natural Language (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (1.00)

Add feedback

R3GS: Gaussian Splatting for Robust Reconstruction and Relocalization in Unconstrained Image Collections

yan, Xu, Wang, Zhaohui, Wei, Rong, Yu, Jingbo, Li, Dong, Liu, Xiangde

arXiv.org Artificial IntelligenceMay-22-2025

We propose R3GS, a robust reconstruction and relocalization framework tailored for unconstrained datasets. Our method uses a hybrid representation during training. Each anchor combines a global feature from a convolutional neural network (CNN) with a local feature encoded by the multiresolution hash grids [2]. Subsequently, several shallow multi-layer perceptrons (MLPs) predict the attributes of each Gaussians, including color, opacity, and covariance. To mitigate the adverse effects of transient objects on the reconstruction process, we ffne-tune a lightweight human detection network. Once ffne-tuned, this network generates a visibility map that efffciently generalizes to other transient objects (such as posters, banners, and cars) with minimal need for further adaptation. Additionally, to address the challenges posed by sky regions in outdoor scenes, we propose an effective sky-handling technique that incorporates a depth prior as a constraint. This allows the inffnitely distant sky to be represented on the surface of a large-radius sky sphere, signiffcantly reducing ffoaters caused by errors in sky reconstruction. Furthermore, we introduce a novel relocalization method that remains robust to changes in lighting conditions while estimating the camera pose of a given image within the reconstructed 3DGS scene. As a result, R3GS significantly enhances rendering ffdelity, improves both training and rendering efffciency, and reduces storage requirements. Our method achieves state-of-the-art performance compared to baseline methods on in-the-wild datasets. The code will be made open-source following the acceptance of the paper.

artificial intelligence, gaussian, machine learning, (10 more...)

arXiv.org Artificial Intelligence

2505.15294

Country: Asia > China (0.14)

Genre: Research Report (1.00)

Industry: Health & Medicine (0.34)

Technology:

Information Technology > Artificial Intelligence > Vision (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Perceptrons (0.55)

Add feedback

Leveraging Semantic Graphs for Efficient and Robust LiDAR SLAM

Wang, Neng, Lu, Huimin, Zheng, Zhiqiang, Wang, Hesheng, Liu, Yun-Hui, Chen, Xieyuanli

arXiv.org Artificial IntelligenceMar-14-2025

Accurate and robust simultaneous localization and mapping (SLAM) is crucial for autonomous mobile systems, typically achieved by leveraging the geometric features of the environment. Incorporating semantics provides a richer scene representation that not only enhances localization accuracy in SLAM but also enables advanced cognitive functionalities for downstream navigation and planning tasks. Existing point-wise semantic LiDAR SLAM methods often suffer from poor efficiency and generalization, making them less robust in diverse real-world scenarios. In this paper, we propose a semantic graph-enhanced SLAM framework, named SG-SLAM, which effectively leverages the geometric, semantic, and topological characteristics inherent in environmental structures. The semantic graph serves as a fundamental component that facilitates critical functionalities of SLAM, including robust relocalization during odometry failures, accurate loop closing, and semantic graph map construction. Our method employs a dual-threaded architecture, with one thread dedicated to online odometry and relocalization, while the other handles loop closure, pose graph optimization, and map update. This design enables our method to operate in real time and generate globally consistent semantic graph maps and point cloud maps. We extensively evaluate our method across the KITTI, MulRAN, and Apollo datasets, and the results demonstrate its superiority compared to state-of-the-art methods. Our method has been released at https://github.com/nubot-nudt/SG-SLAM.

artificial intelligence, machine learning, natural language, (18 more...)

arXiv.org Artificial Intelligence

2503.11145

Country:

North America > United States > California > San Francisco County > San Francisco (0.04)
Asia > South Korea > Daejeon > Daejeon (0.04)
Asia > China > Shanghai > Shanghai (0.04)
(2 more...)

Genre: Research Report > New Finding (0.48)

Technology:

Information Technology > Artificial Intelligence > Vision (1.00)
Information Technology > Artificial Intelligence > Robots (1.00)
Information Technology > Artificial Intelligence > Representation & Reasoning (1.00)
(2 more...)

Add feedback

SCORE: Saturated Consensus Relocalization in Semantic Line Maps

Jiang, Haodong, Zheng, Xiang, Zhang, Yanglin, Zeng, Qingcheng, Li, Yiqian, Hong, Ziyang, Wu, Junfeng

arXiv.org Artificial IntelligenceMar-5-2025

This is the arxiv version for our paper submitted to IEEE/RSJ IROS 2025. We propose a scene-agnostic and light-weight visual relocalization framework that leverages semantically labeled 3D lines as a compact map representation. In our framework, the robot localizes itself by capturing a single image, extracting 2D lines, associating them with semantically similar 3D lines in the map, and solving a robust perspective-n-line problem. To address the extremely high outlier ratios~(exceeding 99.5\%) caused by one-to-many ambiguities in semantic matching, we introduce the Saturated Consensus Maximization~(Sat-CM) formulation, which enables accurate pose estimation when the classic Consensus Maximization framework fails. We further propose a fast global solver to the formulated Sat-CM problems, leveraging rigorous interval analysis results to ensure both accuracy and computational efficiency. Additionally, we develop a pipeline for constructing semantic 3D line maps using posed depth images. To validate the effectiveness of our framework, which integrates our innovations in robust estimation and practical engineering insights, we conduct extensive experiments on the ScanNet++ dataset.

line map, outlier ratio, sat-cm framework, (16 more...)

arXiv.org Artificial Intelligence

2503.03254

Country:

Asia > China > Hong Kong (0.04)
Asia > South Korea > Daejeon > Daejeon (0.04)
Asia > China > Guangdong Province > Shenzhen (0.04)
Asia > China > Guangdong Province > Guangzhou (0.04)

Genre: Research Report (0.82)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning (1.00)
Information Technology > Artificial Intelligence > Natural Language (1.00)
Information Technology > Sensing and Signal Processing > Image Processing (0.93)
(2 more...)

Add feedback

An Efficient Scene Coordinate Encoding and Relocalization Method

Xu, Kuan, Jiang, Zeyu, Cao, Haozhi, Yuan, Shenghai, Wang, Chen, Xie, Lihua

arXiv.org Artificial IntelligenceDec-9-2024

Scene Coordinate Regression (SCR) is a visual localization technique that utilizes deep neural networks (DNN) to directly regress 2D-3D correspondences for camera pose estimation. However, current SCR methods often face challenges in handling repetitive textures and meaningless areas due to their reliance on implicit triangulation. In this paper, we propose an efficient scene coordinate encoding and relocalization method. Compared with the existing SCR methods, we design a unified architecture for both scene encoding and salient keypoint detection, enabling our system to focus on encoding informative regions, thereby significantly enhancing efficiency. Additionally, we introduce a mechanism that leverages sequential information during both map encoding and relocalization, which strengthens implicit triangulation, particularly in repetitive texture environments. Comprehensive experiments conducted across indoor and outdoor datasets demonstrate that the proposed system outperforms other state-of-the-art (SOTA) SCR methods. Our single-frame relocalization mode improves the recall rate of our baseline by 6.4% and increases the running speed from 56Hz to 90Hz. Furthermore, our sequence-based mode increases the recall rate by 11% while maintaining the original efficiency.

machine learning, natural language, relocalization, (19 more...)

arXiv.org Artificial Intelligence

2412.06488

Country:

Asia > Singapore (0.04)
North America > United States > New York > Erie County > Buffalo (0.04)

Genre: Research Report (1.00)

Technology:

Information Technology > Sensing and Signal Processing > Image Processing (1.00)
Information Technology > Artificial Intelligence > Vision (1.00)
Information Technology > Artificial Intelligence > Natural Language (0.97)
(2 more...)

Add feedback

Cross-Modal Visual Relocalization in Prior LiDAR Maps Utilizing Intensity Textures

Shen, Qiyuan, Zhao, Hengwang, Yan, Weihao, Wang, Chunxiang, Qin, Tong, Yang, Ming

arXiv.org Artificial IntelligenceDec-2-2024

Cross-modal localization has drawn increasing attention in recent years, while the visual relocalization in prior LiDAR maps is less studied. Related methods usually suffer from inconsistency between the 2D texture and 3D geometry, neglecting the intensity features in the LiDAR point cloud. In this paper, we propose a cross-modal visual relocalization system in prior LiDAR maps utilizing intensity textures, which consists of three main modules: map projection, coarse retrieval, and fine relocalization. In the map projection module, we construct the database of intensity channel map images leveraging the dense characteristic of panoramic projection. The coarse retrieval module retrieves the top-K most similar map images to the query image from the database, and retains the top-K' results by covisibility clustering. The fine relocalization module applies a two-stage 2D-3D association and a covisibility inlier selection method to obtain robust correspondences for 6DoF pose estimation. The experimental results on our self-collected datasets demonstrate the effectiveness in both place recognition and pose estimation tasks.

localization, map image, relocalization, (13 more...)

arXiv.org Artificial Intelligence

2412.01299

Country: Asia > China > Shanghai > Shanghai (0.05)

Genre: Research Report (1.00)

Technology: Information Technology > Artificial Intelligence > Vision (1.00)

Add feedback

Map-Free Visual Relocalization Enhanced by Instance Knowledge and Depth Knowledge

Xiao, Mingyu, Chen, Runze, Luo, Haiyong, Zhao, Fang, Wang, Juan, Ma, Xuepeng

arXiv.org Artificial IntelligenceSep-18-2024

Map-free relocalization technology is crucial for applications in autonomous navigation and augmented reality, but relying on pre-built maps is often impractical. It faces significant challenges due to limitations in matching methods and the inherent lack of scale in monocular images. These issues lead to substantial rotational and metric errors and even localization failures in real-world scenarios. Large matching errors significantly impact the overall relocalization process, affecting both rotational and translational accuracy. Due to the inherent limitations of the camera itself, recovering the metric scale from a single image is crucial, as this significantly impacts the translation error. To address these challenges, we propose a map-free relocalization method enhanced by instance knowledge and depth knowledge. By leveraging instance-based matching information to improve global matching results, our method significantly reduces the possibility of mismatching across different objects. The robustness of instance knowledge across the scene helps the feature point matching model focus on relevant regions and enhance matching accuracy. Additionally, we use estimated metric depth from a single image to reduce metric errors and improve scale recovery accuracy. By integrating methods dedicated to mitigating large translational and rotational errors, our approach demonstrates superior performance in map-free relocalization techniques.

computer vision, knowledge, relocalization, (11 more...)

arXiv.org Artificial Intelligence

2408.13085

Country:

Europe > United Kingdom > England > Cambridgeshire > Cambridge (0.04)
Asia > China > Shandong Province (0.04)
Asia > China > Beijing > Beijing (0.04)

Genre: Research Report (0.50)

Technology:

Information Technology > Artificial Intelligence > Vision (1.00)
Information Technology > Sensing and Signal Processing > Image Processing (0.96)

Add feedback

FaVoR: Features via Voxel Rendering for Camera Relocalization

Polizzi, Vincenzo, Cannici, Marco, Scaramuzza, Davide, Kelly, Jonathan

arXiv.org Artificial IntelligenceSep-11-2024

Camera relocalization methods range from dense image alignment to direct camera pose regression from a query image. Among these, sparse feature matching stands out as an efficient, versatile, and generally lightweight approach with numerous applications. However, feature-based methods often struggle with significant viewpoint and appearance changes, leading to matching failures and inaccurate pose estimates. To overcome this limitation, we propose a novel approach that leverages a globally sparse yet locally dense 3D representation of 2D features. By tracking and triangulating landmarks over a sequence of frames, we construct a sparse voxel map optimized to render image patch descriptors observed during tracking. Given an initial pose estimate, we first synthesize descriptors from the voxels using volumetric rendering and then perform feature matching to estimate the camera pose. This methodology enables the generation of descriptors for unseen views, enhancing robustness to view changes. We extensively evaluate our method on the 7-Scenes and Cambridge Landmarks datasets. Our results show that our method significantly outperforms existing state-of-the-art feature representation techniques in indoor environments, achieving up to a 39% improvement in median translation error. Additionally, our approach yields comparable results to other methods for outdoor scenarios while maintaining lower memory and computational costs.

descriptor, landmark, representation, (16 more...)

arXiv.org Artificial Intelligence

2409.07571

Country:

North America > Canada > Ontario > Toronto (0.14)
North America > United States (0.04)
Europe > United Kingdom > England > Cambridgeshire > Cambridge (0.04)
(2 more...)

Genre: Research Report > New Finding (0.68)

Technology:

Information Technology > Sensing and Signal Processing > Image Processing (1.00)
Information Technology > Artificial Intelligence > Vision (0.96)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks (0.93)

Add feedback