AITopics | Cadena, Cesar

Cadena, Cesar

Keypoint Semantic Integration for Improved Feature Matching in Outdoor Agricultural Environments

de Silva, Rajitha, Cox, Jonathan, Popovic, Marija, Cadena, Cesar, Stachniss, Cyrill, Polvara, Riccardo

arXiv.org Artificial Intelligence, Mar-11-2025

Robust robot navigation in outdoor environments requires accurate perception systems capable of handling visual challenges such as repetitive structures and changing appearances. Visual feature matching is crucial to vision-based pipelines but remains particularly challenging in natural outdoor settings due to perceptual aliasing. We address this issue in vineyards, where repetitive vine trunks and other natural elements generate ambiguous descriptors that hinder reliable feature matching. We hypothesise that semantic information tied to keypoint positions can alleviate perceptual aliasing by enhancing keypoint descriptor distinctiveness. To this end, we introduce a keypoint semantic integration technique that improves the descriptors in semantically meaningful regions within the image, enabling more accurate differentiation even among visually similar local features. We validate this approach in two vineyard perception tasks: (i) relative pose estimation and (ii) visual localisation. Across all tested keypoint types and descriptors, our method improves matching accuracy by 12.6%, demonstrating its effectiveness over multiple months in challenging vineyard conditions.

descriptor, machine learning, natural language, (22 more...)

arXiv.org Artificial Intelligence

2503.08843

Country: Europe > Switzerland > Zürich > Zürich (0.14)

Genre: Research Report (0.82)

Technology:

Information Technology > Artificial Intelligence > Robots (1.00)
Information Technology > Artificial Intelligence > Representation & Reasoning (1.00)
Information Technology > Artificial Intelligence > Natural Language > Text Processing (0.49)
(3 more...)

ForestLPR: LiDAR Place Recognition in Forests Attentioning Multiple BEV Density Images

Shen, Yanqing, Tuna, Turcan, Hutter, Marco, Cadena, Cesar, Zheng, Nanning

arXiv.org Artificial Intelligence, Mar-6-2025

Place recognition is essential to maintain global consistency in large-scale localization systems. While research in urban environments has progressed significantly using LiDARs or cameras, applications in natural forest-like environments remain largely under-explored. Furthermore, forests present particular challenges due to high self-similarity and substantial variations in vegetation growth over time. In this work, we propose a robust LiDAR-based place recognition method for natural forests, ForestLPR. We hypothesize that a set of cross-sectional images of the forest's geometry at different heights contains the information needed to recognize revisiting a place. The cross-sectional images are represented by bird's-eye-view (BEV) density images of horizontal slices of the point cloud at different heights. Our approach utilizes a visual transformer as the shared backbone to produce sets of local descriptors and introduces a multi-BEV interaction module to attend to information at different heights adaptively. It is followed by an aggregation layer that produces a rotation-invariant place descriptor. We evaluated the efficacy of our method extensively on real-world data from public benchmarks as well as robotic datasets and compared it against the state-of-the-art (SOTA) methods. The results indicate that ForestLPR has consistently good performance on all evaluations and achieves an average increase of 7.38% and 9.11% on Recall@1 over the closest competitor on intra-sequence loop closure detection and inter-sequence re-localization, respectively, validating our hypothesis.

artificial intelligence, machine learning, point cloud, (15 more...)

arXiv.org Artificial Intelligence

2503.04475

Country: Asia > China (0.28)

Genre: Research Report > New Finding (0.46)

Technology:

Information Technology > Artificial Intelligence > Vision (1.00)
Information Technology > Artificial Intelligence > Robots (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.46)
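
The ForestLPR abstract above describes BEV density images built from horizontal slices of a point cloud. A minimal sketch of that preprocessing idea follows, using NumPy. The function name, the grid extent `xy_range`, the cell size `cell`, and the per-slice peak normalization are all assumptions made for illustration; the abstract does not specify them, and this is not the authors' implementation.

```python
import numpy as np


def bev_density_images(points, z_edges, xy_range=20.0, cell=0.5):
    """Build one bird's-eye-view density image per height slice.

    points   : (N, 3) array of x, y, z coordinates (hypothetical sensor frame)
    z_edges  : increasing slice boundaries, e.g. [0, 1, 2] gives two slices
    xy_range : half-width of the square area kept around the origin (assumed)
    cell     : side length of one grid cell (assumed)
    """
    points = np.asarray(points, dtype=float)
    n_bins = int(round(2 * xy_range / cell))
    edges = np.linspace(-xy_range, xy_range, n_bins + 1)

    images = []
    for z_lo, z_hi in zip(z_edges[:-1], z_edges[1:]):
        # Keep only the points whose height falls inside this slice.
        in_slice = (points[:, 2] >= z_lo) & (points[:, 2] < z_hi)
        sl = points[in_slice]
        # Count points per (x, y) cell; points outside the grid are ignored.
        counts, _, _ = np.histogram2d(sl[:, 0], sl[:, 1], bins=(edges, edges))
        # Normalize by the densest cell so each image lies in [0, 1].
        peak = counts.max()
        images.append(counts / peak if peak > 0 else counts)
    return np.stack(images)
```

Stacking the slices gives an array of shape `(num_slices, H, W)` that could be fed to an image backbone; how the slices are then combined is the part the paper's multi-BEV interaction module addresses.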

Enhancing Robotic Precision in Construction: A Modular Factor Graph-Based Framework to Deflection and Backlash Compensation Using High-Accuracy Accelerometers

Kindle, Julien, Loetscher, Michael, Alessandretti, Andrea, Cadena, Cesar, Hutter, Marco

arXiv.org Artificial Intelligence, Jan-24-2025

Accurate positioning is crucial in the construction industry, where labor shortages highlight the need for automation. Robotic systems with long kinematic chains are required to reach complex workspaces, including floors, walls, and ceilings. These requirements significantly impact positioning accuracy due to effects such as deflection and backlash in various parts along the kinematic chain. In this work, we introduce a novel approach that integrates deflection and backlash compensation models with high-accuracy accelerometers, significantly enhancing position accuracy. Our method employs a modular framework based on a factor graph formulation to estimate the state of the kinematic chain, leveraging acceleration measurements to inform the model. Extensive testing on publicly released datasets, reflecting real-world construction disturbances, demonstrates the advantages of our approach. The proposed method reduces the 95% error threshold in the xy-plane by 50% compared to the state-of-the-art Virtual Joint Method, and by 31% when incorporating base tilt compensation.

accelerometer, artificial intelligence, deflection, (14 more...)

arXiv.org Artificial Intelligence

doi: 10.1109/LRA.2024.3506276

2501.14280

Country: Europe > Switzerland > Zürich > Zürich (0.14)

Genre:

Research Report > Promising Solution (0.48)
Research Report > New Finding (0.46)

Industry: Construction & Engineering (0.55)

Technology: Information Technology > Artificial Intelligence > Robots (1.00)

FrontierNet: Learning Visual Cues to Explore

Sun, Boyang, Chen, Hanzhi, Leutenegger, Stefan, Cadena, Cesar, Pollefeys, Marc, Blum, Hermann

arXiv.org Artificial Intelligence, Jan-8-2025

Exploration of unknown environments is crucial for autonomous robots; it allows them to actively reason and decide on what new data to acquire for tasks such as mapping, object discovery, and environmental assessment. Existing methods, such as frontier-based methods, rely heavily on 3D map operations, which are limited by map quality and often overlook valuable context from visual cues. This work aims at leveraging 2D visual cues for efficient autonomous exploration, addressing the limitations of extracting goal poses from a 3D map. We propose an image-only frontier-based exploration system, with FrontierNet as a core component developed in this work. FrontierNet is a learning-based model that (i) detects frontiers, and (ii) predicts their information gain, from posed RGB images enhanced by monocular depth priors. Our approach provides an alternative to existing 3D-dependent exploration systems, achieving a 16% improvement in early-stage exploration efficiency, as validated through extensive simulations and real-world experiments.

frontier, machine learning, natural language, (16 more...)

arXiv.org Artificial Intelligence

2501.04597

Country: Europe (0.14)

Genre: Research Report (0.50)

Technology:

Information Technology > Artificial Intelligence > Vision (1.00)
Information Technology > Artificial Intelligence > Robots (1.00)
Information Technology > Artificial Intelligence > Representation & Reasoning (1.00)
(2 more...)
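
For context on the map-based baseline that the FrontierNet abstract contrasts itself with, here is a minimal sketch of classical frontier detection on a 2D occupancy grid: a frontier cell is a free cell with at least one unknown 4-neighbor. This is the conventional map-based technique, not FrontierNet's learned image-based detector, and the cell encoding (`FREE`, `OCCUPIED`, `UNKNOWN`) is an assumption chosen for the example.

```python
import numpy as np

# Hypothetical cell encoding for this sketch.
FREE, OCCUPIED, UNKNOWN = 0, 1, -1


def find_frontier_cells(grid):
    """Return a boolean mask of free cells that touch an unknown cell."""
    grid = np.asarray(grid)
    unknown = grid == UNKNOWN
    # Pad with False so cells on the border do not wrap around.
    padded = np.pad(unknown, 1, constant_values=False)
    near_unknown = (
        padded[:-2, 1:-1]    # neighbor above
        | padded[2:, 1:-1]   # neighbor below
        | padded[1:-1, :-2]  # neighbor to the left
        | padded[1:-1, 2:]   # neighbor to the right
    )
    return (grid == FREE) & near_unknown
```

A planner would typically cluster the `True` cells and choose a goal pose near one cluster; that step depends on map quality, which is the limitation the abstract points to.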

Continuous-Time State Estimation Methods in Robotics: A Survey

Talbot, William, Nubert, Julian, Tuna, Turcan, Cadena, Cesar, Dümbgen, Frederike, Tordesillas, Jesus, Barfoot, Timothy D., Hutter, Marco

arXiv.org Artificial Intelligence, Nov-6-2024

Accurate, efficient, and robust state estimation is more important than ever in robotics as the variety of platforms and complexity of tasks continue to grow. Historically, discrete-time filters and smoothers have been the dominant approach, in which the estimated variables are states at discrete sample times. The paradigm of continuous-time state estimation proposes an alternative strategy by estimating variables that express the state as a continuous function of time, which can be evaluated at any query time. Not only can this benefit downstream tasks such as planning and control, but it also significantly increases estimator performance and flexibility, as well as reduces sensor preprocessing and interfacing complexity. Despite this, continuous-time methods remain underutilized, potentially because they are less well-known within robotics. To remedy this, this work presents a unifying formulation of these methods and the most exhaustive literature review to date, systematically categorizing prior work by methodology, application, state variables, historical context, and theoretical contribution to the field. By surveying splines and Gaussian processes together and contextualizing works from other research domains, this work identifies and analyzes open problems in continuous-time state estimation and suggests new research directions.

artificial intelligence, optimization problem, survey article, (18 more...)

arXiv.org Artificial Intelligence

2411.03951

Country:

North America > Canada > Ontario > Toronto (0.14)
Europe > United Kingdom > England (0.14)

Genre:

Overview (0.65)
Research Report (0.50)

Industry:

Transportation (1.00)
Leisure & Entertainment (0.92)
Information Technology > Robotics & Automation (0.92)
(2 more...)

Technology:

Information Technology > Artificial Intelligence > Robots > Autonomous Vehicles > Drones (1.00)
Information Technology > Artificial Intelligence > Representation & Reasoning > Optimization (1.00)
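
The survey abstract above describes estimating variables that express the state as a continuous function of time, which can then be evaluated at any query time. One of the two families it names is splines. The sketch below evaluates a uniform cubic B-spline over vector-valued control points at an arbitrary time. The function name, the uniform knot spacing `dt`, and the restriction to vector-space states (no rotations) are simplifying assumptions for illustration, not the survey's formulation.

```python
import numpy as np

# Basis matrix of the uniform cubic B-spline, applied to [1, u, u^2, u^3].
_BASIS = np.array(
    [[1.0, 4.0, 1.0, 0.0],
     [-3.0, 0.0, 3.0, 0.0],
     [3.0, -6.0, 3.0, 0.0],
     [-1.0, 3.0, -3.0, 1.0]]
) / 6.0


def query_spline(control_points, t0, dt, t):
    """Evaluate the spline state at query time t.

    control_points : (n,) or (n, d) array with n >= 4; these are the
                     variables an estimator would solve for
    t0, dt         : start time and uniform knot spacing (assumed)
    t              : query time in [t0, t0 + (n - 3) * dt]
    """
    c = np.asarray(control_points, dtype=float)
    s = (t - t0) / dt
    # Segment index, clamped so that four control points are always available.
    i = min(max(int(np.floor(s)), 0), len(c) - 4)
    u = s - i  # local parameter within the segment
    powers = np.array([1.0, u, u * u, u ** 3])
    return powers @ _BASIS @ c[i:i + 4]
```

Because each query depends on only four control points, measurements arriving at arbitrary timestamps can be related to the same small set of variables, which is one reason such representations suit asynchronous sensors.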

IPPON: Common Sense Guided Informative Path Planning for Object Goal Navigation

Qu, Kaixian, Tan, Jie, Zhang, Tingnan, Xia, Fei, Cadena, Cesar, Hutter, Marco

arXiv.org Artificial Intelligence, Oct-25-2024

Navigating efficiently to an object in an unexplored environment is a critical skill for general-purpose intelligent robots. Recent approaches to this object goal navigation problem have embraced a modular strategy, integrating classical exploration algorithms, notably frontier exploration, with a learned semantic mapping/exploration module. This paper introduces a novel informative path planning and 3D object probability mapping approach. The mapping module computes the probability of the object of interest through semantic segmentation and a Bayes filter. Additionally, it stores probabilities for common objects, which semantically guides the exploration based on common sense priors from a large language model. The planner terminates when the current viewpoint captures enough voxels identified with high confidence as the object of interest. Although our planner follows a zero-shot approach, it achieves state-of-the-art performance as measured by the Success weighted by Path Length (SPL) and Soft SPL in the Habitat ObjectNav Challenge 2023, outperforming other works by more than 20%. Furthermore, we validate its effectiveness on real robots. Project webpage: https://ippon-paper.github.io/

artificial intelligence, large language model, natural language, (18 more...)

arXiv.org Artificial Intelligence

2410.19697

Country: Europe > Switzerland > Zürich > Zürich (0.14)

Genre: Research Report (0.82)

Technology:

Information Technology > Artificial Intelligence > Robots (1.00)
Information Technology > Artificial Intelligence > Natural Language > Large Language Model (1.00)
Information Technology > Artificial Intelligence > Representation & Reasoning > Planning & Scheduling (0.72)
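
The IPPON abstract above mentions computing the probability of the object of interest through semantic segmentation and a Bayes filter. A minimal sketch of a binary Bayes update for a single voxel follows. The sensor-model parameters `p_hit` (chance the segmenter labels the voxel as the object when it is) and `p_false` (chance it does so when it is not) are hypothetical, and the paper's actual formulation may differ.

```python
def bayes_update(prior, p_hit, p_false, detected):
    """Posterior probability that a voxel belongs to the object of interest.

    prior    : probability before this observation
    p_hit    : P(segmenter says "object" | voxel is the object)      (assumed)
    p_false  : P(segmenter says "object" | voxel is not the object)  (assumed)
    detected : True if the segmenter labeled the voxel as the object
    """
    if detected:
        like_obj, like_not = p_hit, p_false
    else:
        like_obj, like_not = 1.0 - p_hit, 1.0 - p_false
    # Bayes' rule with a two-hypothesis normalizer.
    numerator = like_obj * prior
    return numerator / (numerator + like_not * (1.0 - prior))


# Repeated consistent detections raise the voxel's probability.
p = 0.5
for _ in range(2):
    p = bayes_update(p, p_hit=0.8, p_false=0.2, detected=True)
```

A termination rule like the one in the abstract could then count voxels whose probability exceeds a chosen confidence threshold; the threshold and count are not given in the abstract.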

Tag Map: A Text-Based Map for Spatial Reasoning and Navigation with Large Language Models

Zhang, Mike, Qu, Kaixian, Patil, Vaishakh, Cadena, Cesar, Hutter, Marco

arXiv.org Artificial Intelligence, Sep-23-2024

Large Language Models (LLMs) have emerged as a tool for robots to generate task plans using common sense reasoning. For the LLM to generate actionable plans, scene context must be provided, often through a map. Recent works have shifted from explicit maps with fixed semantic classes to implicit open vocabulary maps based on queryable embeddings capable of representing any semantic class. However, embeddings cannot directly report the scene context as they are implicit, requiring further processing for LLM integration. To address this, we propose an explicit text-based map that can represent thousands of semantic classes while easily integrating with LLMs due to their text-based nature by building upon large-scale image recognition models. We study how entities in our map can be localized and show through evaluations that our text-based map localizations perform comparably to those from open vocabulary maps while using two to four orders of magnitude less memory. Real-robot experiments demonstrate the grounding of an LLM with the text-based map to solve user tasks.

artificial intelligence, large language model, natural language, (17 more...)

arXiv.org Artificial Intelligence

2409.15451

Country: Europe > Switzerland > Zürich > Zürich (0.14)

Genre: Research Report (0.40)

Industry:

Leisure & Entertainment (1.00)
Health & Medicine (0.68)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning (1.00)
Information Technology > Artificial Intelligence > Natural Language > Large Language Model (1.00)

ROS-LLM: A ROS framework for embodied AI with task feedback and structured reasoning

Mower, Christopher E., Wan, Yuhui, Yu, Hongzhan, Grosnit, Antoine, Gonzalez-Billandon, Jonas, Zimmer, Matthieu, Wang, Jinlong, Zhang, Xinyu, Zhao, Yao, Zhai, Anbang, Liu, Puze, Palenicek, Daniel, Tateo, Davide, Cadena, Cesar, Hutter, Marco, Peters, Jan, Tian, Guangjian, Zhuang, Yuzheng, Shao, Kun, Quan, Xingyue, Hao, Jianye, Wang, Jun, Bou-Ammar, Haitham

arXiv.org Artificial Intelligence, Jul-12-2024

We present a framework for intuitive robot programming by non-experts, leveraging natural language prompts and contextual information from the Robot Operating System (ROS). Our system integrates large language models (LLMs), enabling non-experts to articulate task requirements to the system through a chat interface. Key features of the framework include: integration of ROS with an AI agent connected to a plethora of open-source and commercial LLMs, automatic extraction of a behavior from the LLM output and execution of ROS actions/services, support for three behavior modes (sequence, behavior tree, state machine), imitation learning for adding new robot actions to the library of possible actions, and LLM reflection via human and environment feedback. Extensive experiments validate the framework, showcasing robustness, scalability, and versatility in diverse scenarios, including long-horizon tasks, tabletop rearrangements, and remote supervisory control. To facilitate the adoption of our framework and support the reproduction of our results, we have made our code open-source. You can access it at: https://github.com/huawei-noah/HEBO/tree/master/ROSLLM.

large language model, machine learning, natural language, (21 more...)

arXiv.org Artificial Intelligence

2406.19741

Country:

Asia > China (0.68)
Europe > Switzerland > Zürich > Zürich (0.14)
Asia > Japan > Honshū (0.14)

Genre: Research Report > New Finding (0.88)

Technology:

Information Technology > Artificial Intelligence > Robots (1.00)
Information Technology > Artificial Intelligence > Natural Language > Large Language Model (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.67)

Wild Visual Navigation: Fast Traversability Learning via Pre-Trained Models and Online Self-Supervision

Mattamala, Matías, Frey, Jonas, Libera, Piotr, Chebrolu, Nived, Martius, Georg, Cadena, Cesar, Hutter, Marco, Fallon, Maurice

arXiv.org Artificial Intelligence, Apr-10-2024

Natural environments such as forests and grasslands are challenging for robotic navigation because of the false perception of rigid obstacles from high grass, twigs, or bushes. In this work, we present Wild Visual Navigation (WVN), an online self-supervised learning system for visual traversability estimation. The system is able to continuously adapt from a short human demonstration in the field, only using onboard sensing and computing. One of the key ideas to achieve this is the use of high-dimensional features from pre-trained self-supervised models, which implicitly encode semantic information that massively simplifies the learning task. Further, the development of an online scheme for supervision generation enables concurrent training and inference of the learned model in the wild. We demonstrate our approach through diverse real-world deployments in forests, parks, and grasslands. Our system is able to bootstrap the traversable terrain segmentation in less than 5 min of in-field training time, enabling the robot to navigate in complex, previously unseen outdoor terrains. Code: https://bit.ly/498b0CV - Project page: https://bit.ly/3M6nMHH

artificial intelligence, machine learning, traversability, (15 more...)

arXiv.org Artificial Intelligence

2404.07110

Country:

North America > United States (0.28)
Europe > Switzerland > Zürich > Zürich (0.14)
Europe > Germany > Baden-Württemberg (0.14)
Europe > United Kingdom > England > Oxfordshire (0.14)

Genre: Research Report (0.50)

Technology:

Information Technology > Artificial Intelligence > Robots (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks (1.00)

VIRUS-NeRF -- Vision, InfraRed and UltraSonic based Neural Radiance Fields

Schmid, Nicolaj, von Einem, Cornelius, Cadena, Cesar, Siegwart, Roland, Hruby, Lorenz, Tschopp, Florian

arXiv.org Artificial Intelligence, Mar-14-2024

Autonomous mobile robots are an increasingly integral part of modern factory and warehouse operations. Obstacle detection, avoidance and path planning are critical safety-relevant tasks, which are often solved using expensive LiDAR sensors and depth cameras. We propose to use cost-effective low-resolution ranging sensors, such as ultrasonic and infrared time-of-flight sensors by developing VIRUS-NeRF - Vision, InfraRed, and UltraSonic based Neural Radiance Fields. Building upon Instant Neural Graphics Primitives with a Multiresolution Hash Encoding (Instant-NGP), VIRUS-NeRF incorporates depth measurements from ultrasonic and infrared sensors and utilizes them to update the occupancy grid used for ray marching. Experimental evaluation in 2D demonstrates that VIRUS-NeRF achieves comparable mapping performance to LiDAR point clouds regarding coverage. Notably, in small environments, its accuracy aligns with that of LiDAR measurements, while in larger ones, it is bounded by the utilized ultrasonic sensors. An in-depth ablation study reveals that adding ultrasonic and infrared sensors is highly effective when dealing with sparse data and low view variation. Further, the proposed occupancy grid of VIRUS-NeRF improves the mapping capabilities and increases the training speed by 46% compared to Instant-NGP. Overall, VIRUS-NeRF presents a promising approach for cost-effective local mapping in mobile robotics, with potential applications in safety and navigation tasks. The code can be found at https://github.com/ethz-asl/virus_nerf.

artificial intelligence, machine learning, virus-nerf, (17 more...)

arXiv.org Artificial Intelligence

2403.09477

Country:

North America > United States (0.31)
Europe > Switzerland > Zürich > Zürich (0.14)

Genre: Research Report (1.00)

Industry: Transportation (0.46)

Technology:

Information Technology > Artificial Intelligence > Vision (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks (0.93)