Pirri, Fiora
Removing supervision in semantic segmentation with local-global matching and area balancing
Rossetti, Simone, Samà, Nico, Pirri, Fiora
Removing supervision in semantic segmentation remains a difficult problem. Current approaches can deal with common categorical patterns, yet they resort to multi-stage architectures. We design a novel end-to-end model leveraging local-global patch matching to predict the categories, localization, area, and shape of objects for semantic segmentation. The local-global matching is, in turn, driven by optimal transport plans fulfilling area constraints, approaching a solution for exact shape prediction. Our model attains state-of-the-art results in Weakly Supervised Semantic Segmentation, using only image-level labels, with 75% mIoU on the PascalVOC2012 val set and 46% on the MS-COCO2014 val set. Dropping the image-level labels and clustering self-supervised learned features to yield pseudo-multi-level labels, we obtain an unsupervised model for semantic segmentation. We also attain state-of-the-art results in Unsupervised Semantic Segmentation, with 43.6% mIoU on the PascalVOC2012 val set and 19.4% on the MS-COCO2014 val set. The code is available at https://github.com/deepplants/PC2M.
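To make the matching step concrete, here is a minimal sketch of entropic optimal transport with per-class area budgets imposed as column marginals. The patch features, class prototypes, and budgets are illustrative assumptions, not the released PC2M code.

```python
# Hedged sketch: entropic optimal transport between patch features and
# class prototypes, with per-class area budgets as column marginals.
import numpy as np

def sinkhorn_area(cost, area_budget, eps=0.05, n_iter=100):
    """Compute an entropic OT plan whose column marginals match the
    per-class area budget (fractions summing to 1)."""
    n_patches, n_classes = cost.shape
    r = np.full(n_patches, 1.0 / n_patches)   # uniform mass per patch
    c = area_budget                            # desired area per class
    K = np.exp(-cost / eps)                    # Gibbs kernel
    u = np.ones(n_patches)
    for _ in range(n_iter):                    # Sinkhorn scaling
        v = c / (K.T @ u)
        u = r / (K @ v)
    return u[:, None] * K * v[None, :]         # transport plan

# Usage: cost = 1 - cosine similarity between patch embeddings and class
# prototypes; the plan's row-wise argmax gives a patch label.
cost = 1.0 - np.random.rand(196, 21)           # toy 14x14 patches, 21 classes
plan = sinkhorn_area(cost, np.full(21, 1.0 / 21))
labels = plan.argmax(axis=1)
```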
Spatio-Temporal SAR-Optical Data Fusion for Cloud Removal via a Deep Hierarchical Model
Sebastianelli, Alessandro, Nowakowski, Artur, Puglisi, Erika, Del Rosso, Maria Pia, Mifdal, Jamila, Pirri, Fiora, Mathieu, Pierre Philippe, Ullo, Silvia Liberata
The abundance of clouds, varying in both space and time, often makes remote sensing (RS) applications with optical images difficult or even impossible to perform. Traditional cloud-removal techniques have been studied for years, and recently Machine Learning (ML)-based approaches have also been considered. In this manuscript, we present a novel method for the restoration of cloud-corrupted optical images. It generates the whole optical scene of interest, not only the cloudy pixels, and is based on a Joint Data Fusion paradigm in which three deep neural networks are hierarchically combined. Spatio-temporal features are extracted separately by a conditional Generative Adversarial Network (cGAN) from Synthetic Aperture Radar (SAR) data and by a Convolutional Long Short-Term Memory (ConvLSTM) from optical time series, and are then combined with a U-shaped network. The use of time series of data has rarely been explored in the state of the art for this objective, and existing models do not combine both spatio-temporal domains and SAR-optical imagery. Quantitative and qualitative results show that the proposed method produces cloud-free images while preserving details, outperforming the cGAN and the ConvLSTM when used individually. Both the code and the dataset have been implemented from scratch and made available to interested researchers for further analysis and investigation.
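A minimal sketch of the hierarchical joint-fusion wiring, under stated assumptions: a convolutional branch stands in for the cGAN's SAR encoder, a small ConvLSTM cell summarizes the optical time series, and a shallow U-shaped head fuses both into a cloud-free estimate. Module names and sizes are illustrative, not the paper's architecture.

```python
# Hedged sketch (PyTorch) of the joint SAR-optical fusion idea.
import torch
import torch.nn as nn

class ConvLSTMCell(nn.Module):
    def __init__(self, in_ch, hid_ch):
        super().__init__()
        self.hid_ch = hid_ch
        self.gates = nn.Conv2d(in_ch + hid_ch, 4 * hid_ch, 3, padding=1)

    def forward(self, x, state):
        h, c = state
        i, f, o, g = self.gates(torch.cat([x, h], dim=1)).chunk(4, dim=1)
        c = torch.sigmoid(f) * c + torch.sigmoid(i) * torch.tanh(g)
        h = torch.sigmoid(o) * torch.tanh(c)
        return h, c

class FusionNet(nn.Module):
    def __init__(self, sar_ch=2, opt_ch=3, hid=16):
        super().__init__()
        # Spatial branch (stand-in for the cGAN's SAR encoder).
        self.sar_branch = nn.Sequential(nn.Conv2d(sar_ch, hid, 3, padding=1), nn.ReLU())
        # Temporal branch over the optical time series.
        self.opt_cell = ConvLSTMCell(opt_ch, hid)
        # Shallow U-shaped fusion head: downsample, upsample, skip.
        self.down = nn.Sequential(nn.Conv2d(2 * hid, 2 * hid, 3, stride=2, padding=1), nn.ReLU())
        self.up = nn.ConvTranspose2d(2 * hid, 2 * hid, 2, stride=2)
        self.out = nn.Conv2d(4 * hid, opt_ch, 1)

    def forward(self, sar, opt_seq):
        b, t, ch, hgt, w = opt_seq.shape
        h = torch.zeros(b, self.opt_cell.hid_ch, hgt, w)
        c = torch.zeros_like(h)
        for k in range(t):                       # summarize the time series
            h, c = self.opt_cell(opt_seq[:, k], (h, c))
        feats = torch.cat([self.sar_branch(sar), h], dim=1)
        x = self.up(self.down(feats))
        return self.out(torch.cat([x, feats], dim=1))

net = FusionNet()
pred = net(torch.randn(1, 2, 64, 64), torch.randn(1, 5, 3, 64, 64))
```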
3D Multi-Robot Patrolling with a Two-Level Coordination Strategy
Freda, Luigi, Gianni, Mario, Pirri, Fiora, Gawel, Abel, Dube, Renaud, Siegwart, Roland, Cadena, Cesar
Teams of UGVs patrolling harsh and complex 3D environments can experience interference and spatial conflicts with one another. Neglecting the occurrence of these events crucially hinders both the soundness and the reliability of a patrolling process. This work presents a distributed multi-robot patrolling technique that uses a two-level coordination strategy to minimize and explicitly manage the occurrence of conflicts and interference. The first level guides the agents to single out exclusive target nodes on a topological map. This target selection relies on a shared idleness representation and on a coordination mechanism preventing topological conflicts. The second level hosts coordination strategies based on a metric representation of space and is supported by a 3D SLAM system. Here, each robot path planner negotiates spatial conflicts by applying a multi-robot traversability function. Continuous interactions between these two levels ensure coordination and conflict resolution. Both simulations and real-world experiments are presented to validate the performance of the proposed patrolling strategy in 3D environments. Results show it is a promising solution for managing spatial conflicts and preventing deadlocks.
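The first coordination level can be illustrated with a hedged sketch: each robot greedily claims the reachable node with the highest shared idleness that no teammate has already claimed. The data structures below are illustrative assumptions, not the authors' implementation.

```python
# Hedged sketch: idleness-based, conflict-free target selection on a
# topological map.
import time

class TopologicalCoordinator:
    def __init__(self, graph):
        self.graph = graph                        # node -> list of neighbors
        self.last_visit = {n: 0.0 for n in graph} # shared visit timestamps
        self.claimed = {}                         # node -> claiming robot id

    def idleness(self, node, now):
        return now - self.last_visit[node]

    def select_target(self, robot_id, current_node):
        now = time.time()
        candidates = [n for n in self.graph[current_node]
                      if self.claimed.get(n) in (None, robot_id)]
        if not candidates:
            return None                           # all neighbors claimed
        target = max(candidates, key=lambda n: self.idleness(n, now))
        self.claimed[target] = robot_id           # exclusive claim
        return target

    def report_visit(self, robot_id, node):
        self.last_visit[node] = time.time()       # refresh shared idleness
        self.claimed.pop(node, None)              # release the claim

# Toy map: two robots starting at the same node claim distinct targets.
g = {"a": ["b", "c"], "b": ["a", "c"], "c": ["a", "b"]}
coord = TopologicalCoordinator(g)
print(coord.select_target(1, "a"), coord.select_target(2, "a"))
```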
Deep execution monitor for robot assistive tasks
Mauro, Lorenzo, Alati, Edoardo, Sanzari, Marta, Ntouskos, Valsamis, Massimiani, Gianluca, Pirri, Fiora
We consider a novel approach to high-level task execution for a robot assistive task. In this work we explore the problem of learning to predict the next subtask by introducing a deep model both for sequencing goals and for visually evaluating the state of a task. We show that deep learning for monitoring robot task execution effectively supports the interconnection between task-level planning and robot operations. These solutions can also cope with the natural non-determinism of the execution monitor. We show that a deep execution monitor improves robot performance, measuring the improvement on assistive tasks performed by a robot in a warehouse.
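As a hedged sketch of the monitoring idea, the toy model below scores the next subtask from the executed-subtask history together with a visual state embedding; the subtask vocabulary, sizes, and feature source are illustrative assumptions.

```python
# Hedged sketch (PyTorch): next-subtask prediction from history + vision.
import torch
import torch.nn as nn

class ExecutionMonitor(nn.Module):
    def __init__(self, n_subtasks=12, emb=32, vis_dim=128):
        super().__init__()
        self.embed = nn.Embedding(n_subtasks, emb)
        self.rnn = nn.GRU(emb, emb, batch_first=True)
        self.head = nn.Linear(emb + vis_dim, n_subtasks)

    def forward(self, history, vis_feat):
        _, h = self.rnn(self.embed(history))        # summarize the history
        return self.head(torch.cat([h[-1], vis_feat], dim=1))

monitor = ExecutionMonitor()
history = torch.tensor([[0, 3, 7]])                 # subtask ids executed so far
vis_feat = torch.randn(1, 128)                      # e.g. pooled CNN features
next_subtask = monitor(history, vis_feat).argmax(dim=1)
```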
Visual search and recognition for robot task execution and monitoring
Mauro, Lorenzo, Puja, Francesco, Grazioso, Simone, Ntouskos, Valsamis, Sanzari, Marta, Alati, Edoardo, Pirri, Fiora
Visual search for relevant targets in the environment is a crucial robot skill. We propose a preliminary framework for the execution monitor of a robot task, addressing the robot's ability to visually search the environment for the targets involved in the task. Visual search is also relevant for recovering from failures. The framework exploits deep reinforcement learning to acquire a "common sense" scene structure, and it takes advantage of a deep convolutional network to detect objects and the relevant relations holding between them. The framework builds on these methods to introduce vision-based execution monitoring, which uses classical planning as a backbone for task execution. Experiments show that with the proposed vision-based execution monitor the robot can complete simple tasks and can recover from failures autonomously.
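A bandit-style simplification of the visual-search component is sketched below, assuming a toy set of candidate regions and a stub detector: the agent learns from detection rewards which region to inspect first. This is far simpler than the deep reinforcement learning used in the paper and only shows the search-as-learning loop.

```python
# Hedged sketch: learn where to look via reward from a detector stub.
import random

ACTIONS = ["table", "shelf", "floor", "doorway"]     # candidate regions
Q = {a: 0.0 for a in ACTIONS}                        # value of looking there

def detector_finds_target(region):
    """Stand-in for a deep convolutional detector call."""
    return region == "shelf"                         # toy ground truth

alpha, eps = 0.2, 0.3
for episode in range(200):
    a = random.choice(ACTIONS) if random.random() < eps \
        else max(Q, key=Q.get)                       # epsilon-greedy choice
    r = 1.0 if detector_finds_target(a) else -0.1    # search cost
    Q[a] += alpha * (r - Q[a])                       # one-step update

print(max(Q, key=Q.get))                             # learned: look at shelf
```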
A Hybrid Approach for Trajectory Control Design
Freda, Luigi, Gianni, Mario, Pirri, Fiora
This work presents a methodology for designing trajectory-tracking feedback control laws that embed nonparametric statistical models, such as Gaussian Processes (GPs). The aim is to minimize unmodeled dynamics such as undesired slippages. The proposed approach has the benefit of avoiding complex terramechanics analysis by directly estimating the robot dynamics from data on a wide class of trajectories. Experiments in both real and simulated environments show that the proposed methodology is promising. In recent decades, increasing interest has been devoted to the design of high-performance path tracking. In the literature, three main approaches to this problem have emerged: (i) model-based and adaptive control [1]-[5]; (ii) Gaussian Processes or stochastic nonlinear models for reinforcement learning of control policies [6], [7]; and (iii) nominal models with data-driven estimation of the residual [8], [9].
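The nominal-model-plus-residual idea (approach (iii)) can be sketched as follows, under toy assumptions: a Gaussian Process is fit to the slip residual between commanded and observed velocity, and its posterior mean serves as a feedforward correction.

```python
# Hedged sketch: GP regression of slippage, used to correct commands.
import numpy as np
from sklearn.gaussian_process import GaussianProcessRegressor
from sklearn.gaussian_process.kernels import RBF, WhiteKernel

# Toy training data: commanded forward velocity vs. observed slip residual.
v_cmd = np.linspace(0.1, 1.0, 30)[:, None]
slip = 0.15 * v_cmd.ravel() ** 2 + 0.01 * np.random.randn(30)

gp = GaussianProcessRegressor(kernel=RBF() + WhiteKernel(), normalize_y=True)
gp.fit(v_cmd, slip)

def corrected_command(v_des):
    """Compensate the predicted slip so the executed velocity tracks v_des."""
    slip_mean = gp.predict(np.array([[v_des]]))[0]
    return v_des + slip_mean

print(corrected_command(0.8))
```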
Discovery and recognition of motion primitives in human activities
Sanzari, Marta, Ntouskos, Valsamis, Pirri, Fiora
We present a novel framework for the automatic discovery and recognition of motion primitives in videos of human activities. Given the 3D pose of a human in a video, human motion primitives are discovered by optimizing the 'motion flux', a quantity which captures the motion variation of a group of skeletal joints. A normalization of the primitives is proposed in order to make them invariant with respect to a subject's anatomical variations and to the data sampling rate. The discovered primitives are unknown and unlabeled, and they are collected, without supervision, into classes via a hierarchical non-parametric Bayes mixture model. Once the classes are determined and labeled, they are further analyzed to establish models for recognizing the discovered primitives. Each primitive model is defined by a set of learned parameters. Given new video data and the estimated pose of the subject appearing in the video, the motion is segmented into primitives, which are recognized with a probability determined by the parameters of the learned models. Using our framework, we build a publicly available dataset of human motion primitives from sequences taken from well-known motion capture datasets. We expect that our framework, by providing an objective way of discovering and categorizing human motion, will be a useful tool in numerous research fields, including video analysis, human-inspired motion generation, learning by demonstration, intuitive human-robot interaction, and human behavior analysis.
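As an illustrative proxy for the motion flux, the sketch below aggregates joint speeds over time and places primitive boundaries where this quantity crosses a threshold; it simplifies the paper's definition and is meant only to show the segmentation logic.

```python
# Hedged sketch: flux-like speed aggregate and threshold segmentation.
import numpy as np

def motion_flux(joints, dt=1.0 / 30):
    """joints: (T, J, 3) 3D positions; returns a per-frame flux (T-1,)."""
    vel = np.diff(joints, axis=0) / dt          # per-joint velocities
    return np.linalg.norm(vel, axis=2).sum(axis=1)

def segment_primitives(flux, thresh):
    """Split frames into maximal runs where flux exceeds thresh."""
    active = flux > thresh
    bounds, start = [], None
    for t, a in enumerate(active):
        if a and start is None:
            start = t
        elif not a and start is not None:
            bounds.append((start, t))
            start = None
    if start is not None:
        bounds.append((start, len(active)))
    return bounds

# Toy sequence: rest, motion, rest -> one discovered primitive.
T, J = 90, 15
joints = np.zeros((T, J, 3))
joints[30:60] = np.cumsum(np.random.randn(30, J, 3) * 0.01, axis=0)
print(segment_primitives(motion_flux(joints), thresh=0.05))
```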
Reports of the AAAI 2011 Fall Symposia
Blisard, Sam (Naval Research Laboratory) | Carmichael, Ted (University of North Carolina at Charlotte) | Ding, Li (University of Maryland, Baltimore County) | Finin, Tim (University of Maryland, Baltimore County) | Frost, Wende (Naval Research Laboratory) | Graesser, Arthur (University of Memphis) | Hadzikadic, Mirsad (University of North Carolina at Charlotte) | Kagal, Lalana (Massachusetts Institute of Technology) | Kruijff, Geert-Jan M. (German Research Center for Artificial Intelligence) | Langley, Pat (Arizona State University) | Lester, James (North Carolina State University) | McGuinness, Deborah L. (Rensselaer Polytechnic Institute) | Mostow, Jack (Carnegie Mellon University) | Papadakis, Panagiotis (University of Sapienza, Rome) | Pirri, Fiora (Sapienza University of Rome) | Prasad, Rashmi (University of Wisconsin-Milwaukee) | Stoyanchev, Svetlana (Columbia University) | Varakantham, Pradeep (Singapore Management University)
The Association for the Advancement of Artificial Intelligence was pleased to present the 2011 Fall Symposium Series, held Friday through Sunday, November 4–6, at the Westin Arlington Gateway in Arlington, Virginia. The titles of the seven symposia are as follows: (1) Advances in Cognitive Systems; (2) Building Representations of Common Ground with Intelligent Agents; (3) Complex Adaptive Systems: Energy, Information and Intelligence; (4) Multiagent Coordination under Uncertainty; (5) Open Government Knowledge: AI Opportunities and Challenges; (6) Question Generation; and (7) Robot-Human Teamwork in Dynamic Adverse Environment. The highlights of each symposium are presented in this report.
Designing Intelligent Robots for Human-Robot Teaming in Urban Search and Rescue
Kruijff, Geert-Jan M. (DFKI GmbH) | Colas, Francis (ETH Zurich) | Svoboda, Tomas (Czech Technical University) | Diggelen, Jurriaan van (TNO) | Balmer, Patrick (BlueBotics) | Pirri, Fiora (University La Sapienza) | Worst, Rainer (Fraunhofer IAIS)
The paper describes ongoing integrated research on designing intelligent robots that can assist humans in making a situation assessment during Urban Search & Rescue (USAR) missions. These robots (rover, microcopter) are deployed during the early phases of an emergency response. The aim is to explore those areas of the disaster hot zone which are too dangerous or too difficult for a human to enter at that point. This requires the robots to be "intelligent" in the sense of being capable of various degrees of autonomy in acting and perceiving in the environment. At the same time, their intelligence needs to go beyond mere taskwork. Robots and humans are interdependent. Human operators depend on these robots to provide information for a situation assessment, and robots depend on humans to help them operate (shared control) and perceive (shared assessment) in what are typically highly dynamic, largely unknown environments. Robots and humans need to form a team. The paper describes how various insights from robotics and Artificial Intelligence are combined to develop new approaches for modeling human-robot teaming. These approaches include new forms of modeling situation awareness (to model distributed acting in dynamic space), human-robot interaction (to model communication in teams), flexible planning (to model team coordination and joint action), and cognitive system design (to integrate different forms of functionality in a single system).