POAM: Probabilistic Online Attentive Mapping for Efficient Robotic Information Gathering

arXiv.org Artificial Intelligence

Gaussian Process (GP) models are widely used for Robotic Information Gathering (RIG) in exploring unknown environments due to their ability to model complex phenomena with non-parametric flexibility and accurately quantify prediction uncertainty. Previous work has developed informative planners and adaptive GP models to enhance the data efficiency of RIG by improving the robot's sampling strategy to focus on informative regions in non-stationary environments. However, computational efficiency becomes a bottleneck when using GP models in large-scale environments with limited computational resources. We propose a framework -- Probabilistic Online Attentive Mapping (POAM) -- that leverages the modeling strengths of the non-stationary Attentive Kernel while achieving constant-time computational complexity for online decision-making. POAM guides the optimization process via variational Expectation Maximization, providing constant-time update rules for inducing inputs, variational parameters, and hyperparameters. Extensive experiments in active bathymetric mapping tasks demonstrate that POAM significantly improves computational efficiency, model accuracy, and uncertainty quantification capability compared to existing online sparse GP models.


Blockchain-based traffic management for Advanced Air Mobility

arXiv.org Artificial Intelligence

The large public interest in Advanced Air Mobility (AAM) will soon lead to congested skies over cities, analogous to what happened with other means of transportation, including commercial aviation. In the latter case, the combination of large distances and the number of flights demanded is such that a system with centralized control, with most of the decisions made by human operators, is safe. However, a much higher demand is expected for AAM, because it will be used for people's daily commutes. Thus, higher automation levels will become a requirement for coordinating this traffic, which might not be effectively managed by humans. The establishment of fixed air routes can reduce complexity, but at the cost of limiting capacity and decreasing efficiency. An alternative is the use of a powerful central system based on Artificial Intelligence (AI), which would allow flexible trajectories and higher efficiency. However, such a system would require concentrated investment, could contain Single-Points-of-Failure (SPoFs), would be a highly sought target of malicious attacks, and would be subject to periods of unavailability. This work proposes a new technology that solves the problem of managing the high complexity of the AAM traffic with a secure distributed approach, without the need for a proprietary centralized automation system. This technology enables distributed airspace allocation management and conflict resolution by means of trusted shared data structures and associated smart contracts running on a blockchain ecosystem. In this way, it greatly reduces the risk of system outages due to SPoFs by allowing peer-to-peer conflict resolution and by being more resilient to failures in the ground communication infrastructure. Furthermore, it provides priority-based balancing mechanisms that help to regulate fairness among participants in the utilization of the airspace.


Efficient Object-Level Visual Context Modeling for Multimodal Machine Translation: Masking Irrelevant Objects Helps Grounding

arXiv.org Artificial Intelligence

Visual context provides grounding information for multimodal machine translation (MMT). However, previous MMT models and probing studies on visual features suggest that visual information is less explored in MMT as it is often redundant to textual information. In this paper, we propose an object-level visual context modeling framework (OVC) to efficiently capture and explore visual information for multimodal machine translation. With detected objects, the proposed OVC encourages MMT to ground translation on desirable visual objects by masking irrelevant objects in the visual modality. We equip the proposed OVC with an additional object-masking loss to achieve this goal. The object-masking loss is estimated according to the similarity between masked objects and the source texts so as to encourage masking source-irrelevant objects. Additionally, in order to generate vision-consistent target words, we further propose a vision-weighted translation loss for OVC. Experiments on MMT datasets demonstrate that the proposed OVC model outperforms state-of-the-art MMT models, and analyses show that masking irrelevant objects helps grounding in MMT.


A Constraint Programming Approach to Simultaneous Task Allocation and Motion Scheduling for Industrial Dual-Arm Manipulation Tasks

arXiv.org Artificial Intelligence

Modern lightweight dual-arm robots bring the physical capabilities to quickly take over tasks at typical industrial workplaces designed for workers. In times of mass-customization, low setup times, including the instructing/specifying of new tasks, are crucial to stay competitive. We propose a constraint programming approach to simultaneous task allocation and motion scheduling for such industrial manipulation and assembly tasks. The proposed approach covers dual-arm and even multi-arm robots as well as connected machines. The key concept is Ordered Visiting Constraints, a descriptive and extensible model to specify such tasks with their spatiotemporal requirements and task-specific combinatorial or ordering constraints. Our solver integrates such task models and robot motion models into constraint optimization problems and solves them efficiently using various heuristics to produce makespan-optimized robot programs. The proposed task model is robot independent and thus can easily be deployed to other robotic platforms. The flexibility and portability of our proposed model are validated through several experiments on different simulated robot platforms. We benchmarked our search strategy against a general-purpose heuristic. For large manipulation tasks with 200 objects, our solver, implemented using Google's Operations Research tools and ROS, requires less than a minute to compute usable plans.