AITopics | Object-Oriented Architecture

Collaborating Authors

Object-Oriented Architecture

News Overviews Instructional Materials AI-Alerts Classics

Preferential Multi-Target Search in Indoor Environments using Semantic SLAM

Chikhalikar, Akash, Ravankar, Ankit A., Luces, Jose Victorio Salazar, Hirata, Yasuhisa

arXiv.org Artificial IntelligenceSep-26-2023

In recent years, the demand for service robots capable of executing tasks beyond autonomous navigation has grown. In the future, service robots will be expected to perform complex tasks like 'Set table for dinner'. High-level tasks like these, require, among other capabilities, the ability to retrieve multiple targets. This paper delves into the challenge of locating multiple targets in an environment, termed 'Find my Objects.' We present a novel heuristic designed to facilitate robots in conducting a preferential search for multiple targets in indoor spaces. Our approach involves a Semantic SLAM framework that combines semantic object recognition with geometric data to generate a multi-layered map. We fuse the semantic maps with probabilistic priors for efficient inferencing. Recognizing the challenges introduced by obstacles that might obscure a navigation goal and render standard point-to-point navigation strategies less viable, our methodology offers resilience to such factors. Importantly, our method is adaptable to various object detectors, RGB-D SLAM techniques, and local navigation planners. We demonstrate the 'Find my Objects' task in real-world indoor environments, yielding quantitative results that attest to the effectiveness of our methodology. This strategy can be applied in scenarios where service robots need to locate, grasp, and transport objects, taking into account user preferences. For a brief summary, please refer to our video: https://tinyurl.com/PrefTargetSearch

information, navigation, robot, (14 more...)

arXiv.org Artificial Intelligence

2309.14063

Country:

Europe > Switzerland > Zürich > Zürich (0.14)
Asia > Japan > Honshū > Tōhoku > Miyagi Prefecture > Sendai (0.04)
Asia > Japan > Honshū > Kansai > Hyogo Prefecture > Kobe (0.04)

Genre:

Research Report (1.00)
Overview (0.74)

Technology:

Information Technology > Artificial Intelligence > Robots (1.00)
Information Technology > Artificial Intelligence > Representation & Reasoning > Object-Oriented Architecture (0.66)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks (0.46)

Add feedback

Object-oriented mapping in dynamic environments

Pekkanen, Matti, Verdoja, Francesco, Kyrki, Ville

arXiv.org Artificial IntelligenceSep-15-2023

Grid maps, especially occupancy grid maps, are ubiquitous in many mobile robot applications. To simplify the process of learning the map, grid maps subdivide the world into a grid of cells, whose occupancies are independently estimated using only measurements in the perceptual field of the particular cell. However, the world consists of objects that span multiple cells, which means that measurements falling onto a cell provide evidence on the occupancy of other cells belonging to the same object. This correlation is not captured by current models. In this work, we present a way to generalize the update of grid maps relaxing the assumption of independence by modeling the relationship between the measurements and the occupancy of each cell as a set of latent variables, and jointly estimating those variables and the posterior of the map. Additionally, we propose a method to estimate the latent variables by clustering based on semantic labels and an extension to the Normal Distributions Transfer Occupancy Map (NDT-OM) to facilitate the proposed map update method. We perform comprehensive experiments of map creation and localization with real world data sets, and show that the proposed method creates better maps in highly dynamic environments compared to state-of-the-art methods. Finally, we demonstrate the ability of the proposed method to remove occluded objects from the map in a lifelong map update scenario.

experiment, ndt-om, occupancy, (15 more...)

arXiv.org Artificial Intelligence

2309.08324

Country:

Europe > Switzerland > Vaud > Lausanne (0.04)
Europe > Portugal (0.04)
Europe > Finland (0.04)
(5 more...)

Genre: Research Report (0.70)

Technology:

Information Technology > Artificial Intelligence > Robots (1.00)
Information Technology > Artificial Intelligence > Representation & Reasoning > Object-Oriented Architecture (0.41)

Add feedback

Building a Winning Team: Selecting Source Model Ensembles using a Submodular Transferability Estimation Approach

B, Vimal K, Bachu, Saketh, Garg, Tanmay, Narasimhan, Niveditha Lakshmi, Konuru, Raghavan, Balasubramanian, Vineeth N

arXiv.org Artificial IntelligenceSep-5-2023

Estimating the transferability of publicly available pretrained models to a target task has assumed an important place for transfer learning tasks in recent years. Existing efforts propose metrics that allow a user to choose one model from a pool of pre-trained models without having to fine-tune each model individually and identify one explicitly. With the growth in the number of available pre-trained models and the popularity of model ensembles, it also becomes essential to study the transferability of multiple-source models for a given target task. The few existing efforts study transferability in such multi-source ensemble settings using just the outputs of the classification layer and neglect possible domain or task mismatch. Moreover, they overlook the most important factor while selecting the source models, viz., the cohesiveness factor between them, which can impact the performance and confidence in the prediction of the ensemble. To address these gaps, we propose a novel Optimal tranSport-based suBmOdular tRaNsferability metric (OSBORN) to estimate the transferability of an ensemble of models to a downstream task. OSBORN collectively accounts for image domain difference, task difference, and cohesiveness of models in the ensemble to provide reliable estimates of transferability. We gauge the performance of OSBORN on both image classification and semantic segmentation tasks. Our setup includes 28 source datasets, 11 target datasets, 5 model architectures, and 2 pre-training methods. We benchmark our method against current state-of-the-art metrics MS-LEEP and E-LEEP, and outperform them consistently using the proposed approach.

computer vision, dataset, ensemble, (14 more...)

arXiv.org Artificial Intelligence

2309.02429

Country:

North America > United States > Minnesota > Hennepin County > Minneapolis (0.14)
North America > Canada > Ontario > Toronto (0.14)
Europe > Switzerland > Zürich > Zürich (0.14)
(7 more...)

Genre: Research Report (1.00)

Technology:

Information Technology > Sensing and Signal Processing > Image Processing (1.00)
Information Technology > Artificial Intelligence > Vision (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.69)
Information Technology > Artificial Intelligence > Representation & Reasoning > Object-Oriented Architecture (0.67)

Add feedback

Grasp Transfer based on Self-Aligning Implicit Representations of Local Surfaces

Tekden, Ahmet, Deisenroth, Marc Peter, Bekiroglu, Yasemin

arXiv.org Artificial IntelligenceAug-15-2023

Objects we interact with and manipulate often share similar parts, such as handles, that allow us to transfer our actions flexibly due to their shared functionality. This work addresses the problem of transferring a grasp experience or a demonstration to a novel object that shares shape similarities with objects the robot has previously encountered. Existing approaches for solving this problem are typically restricted to a specific object category or a parametric shape. Our approach, however, can transfer grasps associated with implicit models of local surfaces shared across object categories. Specifically, we employ a single expert grasp demonstration to learn an implicit local surface representation model from a small dataset of object meshes. At inference time, this model is used to transfer grasps to novel objects by identifying the most geometrically similar surfaces to the one on which the expert grasp is demonstrated. Our model is trained entirely in simulation and is evaluated on simulated and real-world objects that are not seen during training. Evaluations indicate that grasp transfer to unseen object categories using this approach can be successfully performed both in simulation and real-world experiments. The simulation results also show that the proposed approach leads to better spatial precision and grasp accuracy compared to a baseline approach.

artificial intelligence, machine learning, object-oriented architecture, (18 more...)

arXiv.org Artificial Intelligence

doi: 10.1109/LRA.2023.3306272

2308.07807

Country:

North America > United States > Illinois > Cook County > Chicago (0.04)
Europe > United Kingdom > England > Greater London > London (0.04)
Europe > Sweden (0.04)

Genre: Research Report (1.00)

Technology:

Information Technology > Artificial Intelligence > Robots (1.00)
Information Technology > Artificial Intelligence > Vision (0.95)
Information Technology > Artificial Intelligence > Representation & Reasoning > Object-Oriented Architecture (0.75)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks (0.46)

Add feedback

Active Metric-Semantic Mapping by Multiple Aerial Robots

Liu, Xu, Prabhu, Ankit, Cladera, Fernando, Miller, Ian D., Zhou, Lifeng, Taylor, Camillo J., Kumar, Vijay

arXiv.org Artificial IntelligenceAug-13-2023

Traditional approaches for active mapping focus on building geometric maps. For most real-world applications, however, actionable information is related to semantically meaningful objects in the environment. We propose an approach to the active metric-semantic mapping problem that enables multiple heterogeneous robots to collaboratively build a map of the environment. The robots actively explore to minimize the uncertainties in both semantic (object classification) and geometric (object modeling) information. We represent the environment using informative but sparse object models, each consisting of a basic shape and a semantic class label, and characterize uncertainties empirically using a large amount of real-world data. Given a prior map, we use this model to select actions for each robot to minimize uncertainties. The performance of our algorithm is demonstrated through multi-robot experiments in diverse real-world environments. The proposed framework is applicable to a wide range of real-world problems, such as precision agriculture, infrastructure inspection, and asset mapping in factories. A demo video can be found at https://youtu.be/S86SgXi54oU.

mapping, natural language, object-oriented architecture, (15 more...)

arXiv.org Artificial Intelligence

doi: 10.1109/ICRA48891.2023.10161564

2209.08465

Country: North America > United States > Pennsylvania (0.04)

Genre: Research Report (0.64)

Industry: Food & Agriculture > Agriculture (0.54)

Technology:

Information Technology > Artificial Intelligence > Robots (1.00)
Information Technology > Artificial Intelligence > Representation & Reasoning > Object-Oriented Architecture (0.68)
Information Technology > Artificial Intelligence > Natural Language > Text Processing (0.49)

Add feedback

Object Goal Navigation with Recursive Implicit Maps

Chen, Shizhe, Chabal, Thomas, Laptev, Ivan, Schmid, Cordelia

arXiv.org Artificial IntelligenceAug-10-2023

Object goal navigation aims to navigate an agent to locations of a given object category in unseen environments. Classical methods explicitly build maps of environments and require extensive engineering while lacking semantic information for object-oriented exploration. On the other hand, end-to-end learning methods alleviate manual map design and predict actions using implicit representations. Such methods, however, lack an explicit notion of geometry and may have limited ability to encode navigation history. In this work, we propose an implicit spatial map for object goal navigation. Our implicit map is recursively updated with new observations at each step using a transformer. To encourage spatial reasoning, we introduce auxiliary tasks and train our model to reconstruct explicit maps as well as to predict visual features, semantic labels and actions. Our method significantly outperforms the state of the art on the challenging MP3D dataset and generalizes well to the HM3D dataset. We successfully deploy our model on a real robot and achieve encouraging object goal navigation results in real scenes using only a few real-world demonstrations. Code, trained models and videos are available at \url{https://www.di.ens.fr/willow/research/onav_rim/}.

machine learning, navigation, object-oriented architecture, (20 more...)

arXiv.org Artificial Intelligence

2308.05602

Country: Europe > France > Île-de-France > Paris > Paris (0.04)

Genre: Research Report (0.82)

Technology:

Information Technology > Artificial Intelligence > Robots (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks (0.69)
Information Technology > Artificial Intelligence > Representation & Reasoning > Object-Oriented Architecture (0.55)

Add feedback

SC3K: Self-supervised and Coherent 3D Keypoints Estimation from Rotated, Noisy, and Decimated Point Cloud Data

Zohaib, Mohammad, Del Bue, Alessio

arXiv.org Artificial IntelligenceAug-10-2023

This paper proposes a new method to infer keypoints from arbitrary object categories in practical scenarios where point cloud data (PCD) are noisy, down-sampled and arbitrarily rotated. Our proposed model adheres to the following principles: i) keypoints inference is fully unsupervised (no annotation given), ii) keypoints position error should be low and resilient to PCD perturbations (robustness), iii) keypoints should not change their indexes for the intra-class objects (semantic coherence), iv) keypoints should be close to or proximal to PCD surface (compactness). We achieve these desiderata by proposing a new self-supervised training strategy for keypoints estimation that does not assume any a priori knowledge of the object class, and a model architecture with coupled auxiliary losses that promotes the desired keypoints properties. We compare the keypoints estimated by the proposed approach with those of the state-of-the-art unsupervised approaches. The experiments show that our approach outperforms by estimating keypoints with improved coverage (+9.41%) while being semantically consistent (+4.66%) that best characterizes the object's 3D shape for downstream tasks. Code and data are available at: https://github.com/IITPAVIS/SC3K

artificial intelligence, machine learning, object-oriented architecture, (18 more...)

arXiv.org Artificial Intelligence

2308.0541

Country:

North America > Canada > Newfoundland and Labrador > Labrador (0.04)
Europe > Italy > Liguria > Genoa (0.04)
Asia > Japan > Honshū > Chūbu > Ishikawa Prefecture > Kanazawa (0.04)

Genre: Research Report (1.00)

Technology:

Information Technology > Artificial Intelligence > Machine Learning (1.00)
Information Technology > Artificial Intelligence > Representation & Reasoning > Object-Oriented Architecture (0.88)

Add feedback

HANDAL: A Dataset of Real-World Manipulable Object Categories with Pose Annotations, Affordances, and Reconstructions

Guo, Andrew, Wen, Bowen, Yuan, Jianhe, Tremblay, Jonathan, Tyree, Stephen, Smith, Jeffrey, Birchfield, Stan

arXiv.org Artificial IntelligenceAug-2-2023

We present the HANDAL dataset for category-level object pose estimation and affordance prediction. Unlike previous datasets, ours is focused on robotics-ready manipulable objects that are of the proper size and shape for functional grasping by robot manipulators, such as pliers, utensils, and screwdrivers. Our annotation process is streamlined, requiring only a single off-the-shelf camera and semi-automated processing, allowing us to produce high-quality 3D annotations without crowd-sourcing. The dataset consists of 308k annotated image frames from 2.2k videos of 212 real-world objects in 17 categories. We focus on hardware and kitchen tool objects to facilitate research in practical scenarios in which a robot manipulator needs to interact with the environment beyond simple pushing or indiscriminate grasping. We outline the usefulness of our dataset for 6-DoF category-level pose+scale estimation and related tasks. We also provide 3D reconstructed meshes of all objects, and we outline some of the bottlenecks to be addressed for democratizing the collection of datasets like this one.

artificial intelligence, machine learning, object-oriented architecture, (19 more...)

arXiv.org Artificial Intelligence

2308.01477

Genre: Research Report (0.50)

Technology:

Information Technology > Artificial Intelligence > Robots (1.00)
Information Technology > Artificial Intelligence > Machine Learning (1.00)
Information Technology > Artificial Intelligence > Representation & Reasoning > Object-Oriented Architecture (0.41)
(2 more...)

Add feedback

Unsupervised Deep Graph Matching Based on Cycle Consistency

Tourani, Siddharth, Rother, Carsten, Khan, Muhammad Haris, Savchynskyy, Bogdan

arXiv.org Artificial IntelligenceAug-1-2023

We contribute to the sparsely populated area of unsupervised deep graph matching with application to keypoint matching in images. Contrary to the standard \emph{supervised} approach, our method does not require ground truth correspondences between keypoint pairs. Instead, it is self-supervised by enforcing consistency of matchings between images of the same object category. As the matching and the consistency loss are discrete, their derivatives cannot be straightforwardly used for learning. We address this issue in a principled way by building our method upon the recent results on black-box differentiation of combinatorial solvers. This makes our method exceptionally flexible, as it is compatible with arbitrary network architectures and combinatorial solvers. Our experimental evaluation suggests that our technique sets a new state-of-the-art for unsupervised graph matching.

artificial intelligence, machine learning, object-oriented architecture, (19 more...)

arXiv.org Artificial Intelligence

2307.0893

Country:

North America > United States > Florida > Broward County > Fort Lauderdale (0.04)
Europe > Russia (0.04)
Asia > Russia (0.04)

Genre: Research Report (0.82)

Technology:

Information Technology > Data Science (0.94)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks (0.47)
Information Technology > Artificial Intelligence > Representation & Reasoning > Object-Oriented Architecture (0.34)

Add feedback

Part-level Scene Reconstruction Affords Robot Interaction

Zhang, Zeyu, Zhang, Lexing, Wang, Zaijin, Jiao, Ziyuan, Han, Muzhi, Zhu, Yixin, Zhu, Song-Chun, Liu, Hangxin

arXiv.org Artificial IntelligenceJul-31-2023

Existing methods for reconstructing interactive scenes primarily focus on replacing reconstructed objects with CAD models retrieved from a limited database, resulting in significant discrepancies between the reconstructed and observed scenes. To address this issue, our work introduces a part-level reconstruction approach that reassembles objects using primitive shapes. This enables us to precisely replicate the observed physical scenes and simulate robot interactions with both rigid and articulated objects. By segmenting reconstructed objects into semantic parts and aligning primitive shapes to these parts, we assemble them as CAD models while estimating kinematic relations, including parent-child contact relations, joint types, and parameters. Specifically, we derive the optimal primitive alignment by solving a series of optimization problems, and estimate kinematic relations based on part semantics and geometry. Our experiments demonstrate that part-level scene reconstruction outperforms object-level reconstruction by accurately capturing finer details and improving precision. These reconstructed part-level interactive scenes provide valuable kinematic information for various robotic applications; we showcase the feasibility of certifying mobile manipulation planning in these interactive scenes before executing tasks in the physical world.

artificial intelligence, object-oriented architecture, relation, (18 more...)

arXiv.org Artificial Intelligence

2307.1642

Country: Asia > China > Beijing > Beijing (0.04)

Genre: Research Report (0.64)

Technology:

Information Technology > Artificial Intelligence > Vision (1.00)
Information Technology > Artificial Intelligence > Robots (1.00)
Information Technology > Artificial Intelligence > Representation & Reasoning > Object-Oriented Architecture (0.46)

Add feedback