AITopics | Pretto, Alberto

Pretto, Alberto

Horticultural Temporal Fruit Monitoring via 3D Instance Segmentation and Re-Identification using Point Clouds

Fusaro, Daniel, Magistri, Federico, Behley, Jens, Pretto, Alberto, Stachniss, Cyrill

arXiv.org Artificial Intelligence, Nov-12-2024

Robotic fruit monitoring is a key step toward automated agricultural production systems. Robots can significantly enhance plant and temporal fruit monitoring by providing precise, high-throughput assessments that overcome the limitations of traditional manual methods. Fruit monitoring is challenging because fruits vary significantly in size, shape, orientation, and degree of occlusion, and because fruits may be harvested or newly grown between recording sessions. Most methods are 2D image-based and lack the 3D structure, depth, and spatial information that are key to fruit monitoring. 3D colored point clouds can offer this information, but they introduce challenges such as sparsity and irregularity. In this paper, we present a novel approach for temporal fruit monitoring that operates on point clouds collected in a greenhouse over time. Our method segments fruits using a learning-based instance segmentation approach directly on the point cloud. Each segmented fruit is processed by a 3D sparse convolutional neural network to extract descriptors, which are used in an attention-based matching network to associate fruits with their instances from previous data collections. Experimental results on a real dataset of strawberries demonstrate that our approach outperforms other methods for fruit re-identification over time, allowing for precise temporal fruit monitoring in real and complex scenarios.

artificial intelligence, machine learning, point cloud, (19 more...)

2411.07799

Country: Europe (0.46)

Genre: Research Report > Promising Solution (0.34)

Industry: Food & Agriculture > Agriculture (1.00)

Technology:

Information Technology > Artificial Intelligence > Robots (1.00)
Information Technology > Artificial Intelligence > Vision (0.94)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.88)

Multi-Modal 3D Scene Graph Updater for Shared and Dynamic Environments

Olivastri, Emilio, Francis, Jonathan, Pretto, Alberto, Sünderhauf, Niko, Rana, Krishan

arXiv.org Artificial Intelligence, Nov-5-2024

The advent of generalist Large Language Models (LLMs) and Vision-Language Models (VLMs) has streamlined the construction of semantically enriched maps that can enable robots to ground high-level reasoning and planning into their representations. One of the most widely used semantic map formats is the 3D Scene Graph, which captures both metric (low-level) and semantic (high-level) information. However, these maps often assume a static world, while real environments, like homes and offices, are dynamic. Even small changes in these spaces can significantly impact task performance. To be integrated into dynamic environments, robots must detect changes and update the scene graph in real time. This update process is inherently multimodal, requiring input from various sources, such as human agents, the robot's own perception system, time, and its actions. This work proposes a framework that leverages these multimodal inputs to maintain the consistency of scene graphs during real-time operation, presenting promising initial results and outlining a roadmap for future research.

large language model, machine learning, natural language, (15 more...)

2411.02938

Country:

Oceania > Australia (0.14)
Europe > Netherlands (0.14)
Europe > Germany (0.14)

Genre: Research Report (0.40)

Technology:

Information Technology > Artificial Intelligence > Vision (1.00)
Information Technology > Artificial Intelligence > Robots (1.00)
Information Technology > Artificial Intelligence > Natural Language > Large Language Model (0.87)
Information Technology > Artificial Intelligence > Machine Learning > Learning Graphical Models > Undirected Networks > Markov Models (0.47)

Exploiting Local Features and Range Images for Small Data Real-Time Point Cloud Semantic Segmentation

Fusaro, Daniel, Mosco, Simone, Menegatti, Emanuele, Pretto, Alberto

arXiv.org Artificial Intelligence, Oct-14-2024

Semantic segmentation of point clouds is an essential task for understanding the environment in autonomous driving and robotics. Recent range-based works achieve real-time efficiency, while point- and voxel-based methods produce better results but are affected by high computational complexity. Moreover, highly complex deep learning models are often not suited to efficiently learn from small datasets. Their generalization capabilities can easily be driven by the abundance of data rather than the architecture design. In this paper, we harness the information from the three-dimensional representation to proficiently capture local features, while introducing the range image representation to incorporate additional information and facilitate fast computation. A GPU-based KDTree allows for rapid building, querying, and enhancing projection with straightforward operations. Extensive experiments on the SemanticKITTI and nuScenes datasets demonstrate the benefits of our modification in a "small data" setup, in which only one sequence of the dataset is used to train the models, but also in the conventional setup, where all sequences except one are used for training. We show that a reduced version of our model not only demonstrates strong competitiveness against full-scale state-of-the-art models but also operates in real time, making it a viable choice for real-world applications. The code of our method is available at https://github.com/Bender97/WaffleAndRange.
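The range image representation mentioned in the abstract can be illustrated with a spherical projection, a common way to map LiDAR points onto a 2D grid. The sketch below is not taken from the authors' repository: the function name, the tiny 4 x 8 image, and the 30-degree vertical field of view are assumptions chosen only to keep the example readable.

```python
import math


def project_to_range_image(points, height=4, width=8,
                           fov_up_deg=15.0, fov_down_deg=-15.0):
    """Project 3D points (x, y, z) onto a height x width range image.

    Each pixel stores the range of the closest point that falls into it,
    or None when no point does. All sizes and angles are illustrative.
    """
    fov_up = math.radians(fov_up_deg)
    fov_down = math.radians(fov_down_deg)
    fov = fov_up - fov_down
    image = [[None] * width for _ in range(height)]
    for x, y, z in points:
        r = math.sqrt(x * x + y * y + z * z)
        if r == 0.0:
            continue  # the sensor origin has no defined direction
        yaw = math.atan2(y, x)    # azimuth in [-pi, pi]
        pitch = math.asin(z / r)  # elevation in [-pi/2, pi/2]
        if not (fov_down <= pitch <= fov_up):
            continue  # outside the vertical field of view
        # Normalise both angles to [0, 1], then scale to pixel coordinates.
        u = 0.5 * (1.0 - yaw / math.pi) * width
        v = (fov_up - pitch) / fov * height
        col = min(width - 1, max(0, int(u)))
        row = min(height - 1, max(0, int(v)))
        # Keep the nearest return when several points share a pixel.
        if image[row][col] is None or r < image[row][col]:
            image[row][col] = r
    return image


cloud = [
    (10.0, -1.0, -0.5),   # far point
    (5.0, -0.5, -0.25),   # nearer point along the same ray
    (1.0, 3.0, -0.2),     # point off to the side
    (1.0, 0.0, 5.0),      # too steep: outside the vertical field of view
]
range_image = project_to_range_image(cloud)
```

The first two points lie on the same ray, so they land in the same pixel and only the nearer range is kept. This many-to-one collision is the information loss that range-based methods accept in exchange for dense, fast 2D processing.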

artificial intelligence, machine learning, point cloud, (17 more...)

2410.10510

Genre: Research Report > Promising Solution (0.34)

Industry:

Transportation > Ground > Road (0.49)
Information Technology > Robotics & Automation (0.35)

Technology:

Information Technology > Artificial Intelligence > Robots (1.00)
Information Technology > Artificial Intelligence > Representation & Reasoning (1.00)
Information Technology > Artificial Intelligence > Vision (0.97)
(2 more...)

IPC: Incremental Probabilistic Consensus-based Consistent Set Maximization for SLAM Backends

Olivastri, Emilio, Pretto, Alberto

arXiv.org Artificial Intelligence, May-14-2024

In SLAM (Simultaneous Localization and Mapping) problems, Pose Graph Optimization (PGO) is a technique to refine an initial estimate of a set of poses (positions and orientations) from a set of pairwise relative measurements. The optimization procedure can be negatively affected even by a single outlier measurement, with possibly catastrophic and meaningless results. Although recent works on robust optimization aim to mitigate the presence of outlier measurements, robust solutions capable of handling large numbers of outliers are yet to come. This paper presents IPC, an acronym for Incremental Probabilistic Consensus, a method that approximates the solution to the combinatorial problem of finding the maximally consistent set of measurements in an incremental fashion. It evaluates the consistency of each loop closure measurement through a consensus-based procedure, possibly applied to a subset of the global problem, where all previously integrated inlier measurements have veto power. We evaluated IPC on standard benchmarks against several state-of-the-art methods. Although it is simple and relatively easy to implement, IPC competes with or outperforms the other tested methods in handling outliers while providing online performance. We release with this paper an open-source implementation of the proposed method.

artificial intelligence, optimization problem, outlier, (15 more...)

2405.08503

Country: Europe (0.14)

Genre: Research Report > Promising Solution (0.48)

Technology: Information Technology > Artificial Intelligence > Representation & Reasoning > Optimization (0.34)

A Sonar-based AUV Positioning System for Underwater Environments with Low Infrastructure Density

Olivastri, Emilio, Fusaro, Daniel, Li, Wanmeng, Mosco, Simone, Pretto, Alberto

arXiv.org Artificial Intelligence, May-3-2024

The increasing demand for underwater vehicles highlights the necessity for robust localization solutions in inspection missions. In this work, we present a novel real-time sonar-based underwater global positioning algorithm for AUVs (Autonomous Underwater Vehicles) designed for environments with a sparse distribution of human-made assets. Our approach exploits two synergistic data interpretation frontends applied to the same stream of sonar data acquired by a multibeam Forward-Looking Sonar (FLS). These observations are fused within a Particle Filter (PF) either to assign higher weights to particles that belong to high-likelihood regions or to resolve symmetric ambiguities. Preliminary experiments carried out in a simulated environment resembling a real underwater plant provided promising results. This work represents a starting point towards future developments of the method and subsequent exhaustive evaluations, including in real-world scenarios.
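The particle-weighting idea in the abstract can be sketched with a generic one-dimensional particle filter. This is not the authors' sonar pipeline: the single landmark, the range-only measurement model, the noise values, and all names below are assumptions made for illustration.

```python
import math
import random


def gaussian_likelihood(error, sigma):
    """Unnormalised Gaussian likelihood of a measurement error."""
    return math.exp(-0.5 * (error / sigma) ** 2)


def particle_filter_step(particles, control, measured_range, landmark, rng,
                         motion_sigma=0.1, sensor_sigma=0.5):
    """Run one predict / weight / resample cycle on 1D particles."""
    # Predict: move every particle by the control input plus motion noise.
    predicted = [p + control + rng.gauss(0.0, motion_sigma) for p in particles]
    # Weight: particles whose expected range matches the measurement score higher.
    weights = [
        gaussian_likelihood(measured_range - abs(landmark - p), sensor_sigma)
        for p in predicted
    ]
    if sum(weights) == 0.0:
        weights = [1.0] * len(predicted)  # degenerate case: fall back to uniform
    # Resample: draw particles in proportion to their weights.
    return rng.choices(predicted, weights=weights, k=len(predicted))


rng = random.Random(0)
landmark = 10.0        # hypothetical known asset position
true_position = 2.0    # hypothetical vehicle start position
particles = [rng.uniform(0.0, 20.0) for _ in range(500)]

for _ in range(5):
    true_position += 1.0
    measured_range = abs(landmark - true_position)  # noise-free for brevity
    particles = particle_filter_step(particles, 1.0, measured_range, landmark, rng)

estimate = sum(particles) / len(particles)
```

A range-only measurement to a single landmark is symmetric, so the first update leaves two clusters of particles, one on each side of the landmark. In this toy setup the vehicle's known motion makes the mirrored cluster inconsistent with later measurements, which is one simple way a symmetric ambiguity can be resolved.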

artificial intelligence, machine learning, sonar image, (15 more...)

2405.01971

Country: Europe (0.14)

Genre: Research Report (0.40)

Technology:

Information Technology > Sensing and Signal Processing (1.00)
Information Technology > Artificial Intelligence > Vision (1.00)
Information Technology > Artificial Intelligence > Representation & Reasoning (0.93)
(2 more...)

Improving Generalization of Synthetically Trained Sonar Image Descriptors for Underwater Place Recognition

Donadi, Ivano, Olivastri, Emilio, Fusaro, Daniel, Li, Wanmeng, Evangelista, Daniele, Pretto, Alberto

arXiv.org Artificial Intelligence, Sep-24-2023

Autonomous navigation in underwater environments presents challenges due to factors such as light absorption and water turbidity, limiting the effectiveness of optical sensors. Sonar systems are commonly used for perception in underwater operations as they are unaffected by these limitations. Traditional computer vision algorithms are less effective when applied to sonar-generated acoustic images, while convolutional neural networks (CNNs) typically require large amounts of labeled training data that are often unavailable or difficult to acquire. To this end, we propose a novel compact deep sonar descriptor pipeline that can generalize to real scenarios while being trained exclusively on synthetic data. Our architecture is based on a ResNet18 back-end and a properly parameterized random Gaussian projection layer, whereas input sonar data is enhanced with standard ad-hoc normalization/prefiltering techniques. A customized synthetic data generation procedure is also presented. The proposed method has been evaluated extensively using both synthetic and publicly available real data, demonstrating its effectiveness compared to state-of-the-art methods.
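A random Gaussian projection layer, as named in the abstract, can be sketched as a fixed matrix of Gaussian samples that compresses a feature vector into a shorter descriptor. The abstract does not give the layer's parameters, so the 64-dimensional output, the 1/sqrt(out_dim) scaling, and the random stand-in features below are assumptions; the 512-dimensional input matches the width of ResNet18's final pooled feature.

```python
import math
import random


def make_gaussian_projection(in_dim, out_dim, rng):
    """Return an out_dim x in_dim matrix with i.i.d. N(0, 1/out_dim) entries."""
    scale = 1.0 / math.sqrt(out_dim)
    return [[rng.gauss(0.0, 1.0) * scale for _ in range(in_dim)]
            for _ in range(out_dim)]


def project(matrix, vector):
    """Multiply the fixed projection matrix by a feature vector."""
    return [sum(w * x for w, x in zip(row, vector)) for row in matrix]


def norm(vector):
    """Euclidean norm of a vector."""
    return math.sqrt(sum(x * x for x in vector))


rng = random.Random(0)
projection = make_gaussian_projection(in_dim=512, out_dim=64, rng=rng)

# Random stand-ins for two backbone feature vectors (not real sonar features).
feature_a = [rng.random() for _ in range(512)]
feature_b = [rng.random() for _ in range(512)]
descriptor_a = project(projection, feature_a)
descriptor_b = project(projection, feature_b)

original_distance = norm([a - b for a, b in zip(feature_a, feature_b)])
projected_distance = norm([a - b for a, b in zip(descriptor_a, descriptor_b)])
```

With this scaling, distances between descriptors stay close to distances between the original features in expectation, which is why such a projection can shrink a descriptor without retraining.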

artificial intelligence, descriptor, machine learning, (15 more...)

doi: 10.1007/978-3-031-44137-0_28

2308.01058

Country:

South America (0.14)
Europe (0.14)

Genre: Research Report > Promising Solution (0.34)

Technology:

Information Technology > Sensing and Signal Processing > Image Processing (1.00)
Information Technology > Artificial Intelligence > Vision (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (1.00)

Sensors for Mobile Robots

Andreasson, Henrik, Grisetti, Giorgio, Stoyanov, Todor, Pretto, Alberto

arXiv.org Artificial Intelligence, Sep-7-2023

A sensor is a device that converts a physical parameter or an environmental characteristic (e.g., temperature, distance, speed, etc.) into a signal that can be digitally measured and processed to perform specific tasks. Mobile robots need sensors to measure properties of their environment, thus allowing for safe navigation, complex perception and corresponding actions, and effective interactions with other agents that populate it. Sensors used by mobile robots range from simple tactile sensors, such as bumpers, to complex vision-based sensors such as structured light RGB-D cameras. All of them provide a digital output (e.g., a string, a set of values, a matrix, etc.) that can be processed by the robot's computer. Such output is typically obtained by discretizing one or more analog electrical signals by using an Analog to Digital Converter (ADC) included in the sensor. In this chapter we present the most common sensors used in mobile robotics, providing an introduction to their taxonomy, basic features, and specifications. The description of the functionalities and the types of applications follows a bottom-up approach: the basic principles and components on which the sensors are based are presented before describing real-world sensors, which are generally based on multiple technologies and basic devices.
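The discretization step performed by an ADC can be illustrated with an idealised model. The reference voltage, resolution, and function names below are assumptions for the sketch and do not describe any specific sensor.

```python
def adc_convert(voltage, v_ref=3.3, bits=10):
    """Quantise an analog voltage in [0, v_ref] to an integer code.

    Idealised model: inputs outside the range are clamped, and the
    top of the range maps to the largest code, 2**bits - 1.
    """
    levels = 2 ** bits
    clamped = min(max(voltage, 0.0), v_ref)
    code = int(clamped / v_ref * levels)
    return min(code, levels - 1)


def code_to_voltage(code, v_ref=3.3, bits=10):
    """Return the lower edge of the voltage interval a code represents."""
    return code * v_ref / (2 ** bits)


lsb = 3.3 / 2 ** 10                      # size of one quantisation step
code = adc_convert(1.0)                  # digital value the robot would read
reconstructed = code_to_voltage(code)    # voltage implied by that code
quantisation_error = 1.0 - reconstructed
```

The quantisation error is always smaller than one least significant bit (`lsb`), so adding bits or lowering the reference voltage improves resolution.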

artificial intelligence, imaging sensor, sensor, (13 more...)

doi: 10.1007/978-3-642-41610-1_159-1

2206.03223

Country:

Europe > Sweden (0.14)
Europe > Italy (0.14)
North America > United States (0.14)

Genre: Research Report (0.64)

Industry: Semiconductors & Electronics (0.47)

Technology: Information Technology > Artificial Intelligence > Robots > Locomotion (0.84)

Software Architectures for Mobile Robots

Andreasson, Henrik, Grisetti, Giorgio, Stoyanov, Todor, Pretto, Alberto

arXiv.org Artificial Intelligence, Sep-7-2023

Software architecture, in general, refers both to the high-level structure of a system and to the process of ensuring that the structure or design of a system meets specific needs. For mobile robotics, specific requirements are, for example, real-time capabilities, asynchronous data processing, and distributed functionality. While there is a clear distinction between the design of a software architecture suitable for robotics and a particular reference implementation of that design, in practice, due to the complexity of the task, frameworks for robotics often come with a single reference implementation. Therefore, when comparing and choosing an appropriate software architecture, it is prudent to take into consideration not only the design but also the suitability of the implementation. This chapter appears in: Ang, M.H., Khatib, O., Siciliano, B. (eds) Encyclopedia of Robotics. For a researcher, the design and implementation of such a system is usually a "necessary evil", as it is required in order to deploy subsequently developed research code. With respect to data logging alone, a plethora of different formats for storing sensory data have been proposed and used by the community, each necessitating its own set of data parsing tools and interfaces to convert to alternative formats. Optimal design of architectures suited to the needs of a mobile robot system is a research topic in its own right, but the vast majority of researchers in the field are typically users of the middleware system rather than active developers. The core idea is to separate the application into reusable components.
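The separation into reusable components can be sketched with a minimal publish/subscribe bus. This is a generic pattern, not the API of any particular robotics middleware: the class names, topic names, and stop distance are hypothetical, and message delivery is synchronous for brevity, whereas real frameworks typically deliver messages asynchronously.

```python
class MessageBus:
    """Routes messages from publishers to the callbacks subscribed to a topic."""

    def __init__(self):
        self._subscribers = {}

    def subscribe(self, topic, callback):
        self._subscribers.setdefault(topic, []).append(callback)

    def publish(self, topic, message):
        for callback in self._subscribers.get(topic, []):
            callback(message)


class ObstacleDetector:
    """Component that turns range readings into stop commands."""

    def __init__(self, bus, stop_distance=0.5):
        self._bus = bus
        self._stop_distance = stop_distance
        bus.subscribe("range", self.on_range)

    def on_range(self, distance):
        if distance < self._stop_distance:
            self._bus.publish("cmd", "stop")


class MotorLogger:
    """Component that records every command it receives."""

    def __init__(self, bus):
        self.commands = []
        bus.subscribe("cmd", self.commands.append)


bus = MessageBus()
detector = ObstacleDetector(bus)
motors = MotorLogger(bus)

for reading in (2.0, 1.0, 0.3):  # simulated range sensor output, in metres
    bus.publish("range", reading)
```

Neither component holds a reference to the other; each depends only on the bus and on topic names, so either one can be replaced or reused in another application without changing the rest.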

architecture, artificial intelligence, communication, (11 more...)

doi: 10.1007/978-3-642-41610-1_160-1

2206.03233

Country:

Europe (0.46)
North America > United States (0.28)

Genre: Research Report (0.40)

Industry: Information Technology (0.48)

Technology: Information Technology > Artificial Intelligence > Robots > Locomotion (0.65)

A Graph-based Optimization Framework for Hand-Eye Calibration for Multi-Camera Setups

Evangelista, Daniele, Olivastri, Emilio, Allegro, Davide, Menegatti, Emanuele, Pretto, Alberto

arXiv.org Artificial Intelligence, Jul-28-2023

Hand-eye calibration is the problem of estimating the spatial transformation between a reference frame, usually the base of a robot arm or its gripper, and the reference frame of one or multiple cameras. Generally, this calibration is solved as a non-linear optimization problem; what is rarely done, however, is to exploit the underlying graph structure of the problem itself. In fact, the problem of hand-eye calibration can be seen as an instance of the Simultaneous Localization and Mapping (SLAM) problem. Inspired by this fact, in this work we present a pose-graph approach to the hand-eye calibration problem that extends a recent state-of-the-art solution in two different ways: i) by formulating the solution for eye-on-base setups with one camera; ii) by covering multi-camera robotic setups. The proposed approach has been validated in simulation against standard hand-eye calibration methods. Moreover, a real application is shown. In both scenarios, the proposed approach outperforms all alternative methods. We release with this paper an open-source implementation of our graph-based optimization framework for multi-camera setups.

artificial intelligence, calibration, optimization problem, (16 more...)

doi: 10.1109/ICRA48891.2023.10160758

2303.04747

Genre: Research Report (1.00)

Technology:

Information Technology > Artificial Intelligence > Robots (1.00)
Information Technology > Artificial Intelligence > Representation & Reasoning > Optimization (1.00)

KVN: Keypoints Voting Network with Differentiable RANSAC for Stereo Pose Estimation

Donadi, Ivano, Pretto, Alberto

arXiv.org Artificial Intelligence, Jul-21-2023

Object pose estimation is a fundamental computer vision task exploited in several robotics and augmented reality applications. Many established approaches rely on predicting 2D-3D keypoint correspondences using RANSAC (Random Sample Consensus) and estimating the object pose using the PnP (Perspective-n-Point) algorithm. Since RANSAC is non-differentiable, correspondences cannot be directly learned in an end-to-end fashion. In this paper, we address the stereo image-based object pose estimation problem by (i) introducing a differentiable RANSAC layer into a well-known monocular pose estimation network; (ii) exploiting an uncertainty-driven multi-view PnP solver which can fuse information from multiple views. We evaluate our approach on a challenging public stereo object pose estimation dataset, yielding state-of-the-art results against other recent approaches. Furthermore, in our ablation study, we show that the differentiable RANSAC layer plays a significant role in the accuracy of the proposed method. We release with this paper the open-source implementation of our method.
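The non-differentiability mentioned in the abstract is easier to see in a classic RANSAC loop. The sketch below fits a 2D line rather than keypoints or a pose, and it is the standard hypothesise-and-verify scheme, not the differentiable layer proposed in the paper. The data, threshold, and function names are assumptions.

```python
import random


def fit_line(p, q):
    """Return (slope, intercept) of the line through two points, or None if vertical."""
    (x1, y1), (x2, y2) = p, q
    if x1 == x2:
        return None
    slope = (y2 - y1) / (x2 - x1)
    return slope, y1 - slope * x1


def ransac_line(points, iterations=200, threshold=0.1, seed=0):
    """Classic RANSAC: keep the two-point hypothesis with the most inliers."""
    rng = random.Random(seed)
    best_model, best_count = None, -1
    for _ in range(iterations):
        model = fit_line(*rng.sample(points, 2))
        if model is None:
            continue
        slope, intercept = model
        # Hard inlier count: a step function of the vertical residual.
        count = sum(abs(y - (slope * x + intercept)) < threshold
                    for x, y in points)
        if count > best_count:  # hard selection of the winning hypothesis
            best_model, best_count = model, count
    return best_model, best_count


inliers = [(float(x), 2.0 * x + 1.0) for x in range(10)]      # points on y = 2x + 1
outliers = [(1.0, 9.0), (3.0, -4.0), (6.0, 20.0), (8.0, 2.0)]  # gross errors
model, count = ransac_line(inliers + outliers)
```

Both the thresholded inlier count and the selection of a single best hypothesis are piecewise-constant operations, so no useful gradient flows through them. Differentiable variants generally replace these hard steps with soft counterparts.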

artificial intelligence, pose estimation, video understanding, (16 more...)

2307.11543

Genre: Research Report (0.82)

Technology:

Information Technology > Artificial Intelligence > Vision > Video Understanding (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.46)
