Moulin-Frier, Clément, Fischer, Tobias, Petit, Maxime, Pointeau, Grégoire, Puigbo, Jordi-Ysard, Pattacini, Ugo, Low, Sock Ching, Camilleri, Daniel, Nguyen, Phuong, Hoffmann, Matej, Chang, Hyung Jin, Zambelli, Martina, Mealier, Anne-Laure, Damianou, Andreas, Metta, Giorgio, Prescott, Tony J., Demiris, Yiannis, Dominey, Peter Ford, Verschure, Paul F. M. J.
This paper introduces a cognitive architecture for a humanoid robot to engage in a proactive, mixed-initiative exploration and manipulation of its environment, where the initiative can originate from both the human and the robot. The framework, based on a biologically-grounded theory of the brain and mind, integrates a reactive interaction engine, a number of state-of-the-art perceptual and motor learning algorithms, as well as planning abilities and an autobiographical memory. The architecture as a whole drives the robot behavior to solve the symbol grounding problem, acquire language capabilities, execute goal-oriented behavior, and express a verbal narrative of its own experience in the world. We validate our approach in human-robot interaction experiments with the iCub humanoid robot, showing that the proposed cognitive architecture can be applied in real time within a realistic scenario and that it can be used with naive users.
In this paper, we argue that the future of Artificial Intelligence research resides in two keywords: integration and embodiment. We support this claim by analyzing the recent advances of the field. Regarding integration, we note that the most impactful recent contributions have been made possible through the integration of recent Machine Learning methods (based in particular on Deep Learning and Recurrent Neural Networks) with more traditional ones (e.g. Monte-Carlo tree search, goal babbling exploration or addressable memory systems). Regarding embodiment, we note that the traditional benchmark tasks (e.g. visual classification or board games) are becoming obsolete as state-of-the-art learning algorithms approach or even surpass human performance in most of them, having recently encouraged the development of first-person 3D game platforms embedding realistic physics. Building upon this analysis, we first propose an embodied cognitive architecture integrating heterogenous sub-fields of Artificial Intelligence into a unified framework. We demonstrate the utility of our approach by showing how major contributions of the field can be expressed within the proposed framework. We then claim that benchmarking environments need to reproduce ecologically-valid conditions for bootstrapping the acquisition of increasingly complex cognitive skills through the concept of a cognitive arms race between embodied agents.
The sample-inefficiency problem in Artificial Intelligence refers to the inability of current Deep Reinforcement Learning models to optimize action policies within a small number of episodes. Recent studies have tried to overcome this limitation by adding memory systems and architectural biases to improve learning speed, such as in Episodic Reinforcement Learning. However, despite achieving incremental improvements, their performance is still not comparable to how humans learn behavioral policies. In this paper, we capitalize on the design principles of the Distributed Adaptive Control (DAC) theory of mind and brain to build a novel cognitive architecture (DAC-ML) that, by incorporating a hippocampus-inspired sequential memory system, can rapidly converge to effective action policies that maximize reward acquisition in a challenging foraging task.
Learning from Demonstration (LfD) constitutes one of the most robust methodologies for constructing efficient cognitive robotic systems. Current key challenges in the field include those of multi-agent learning and long-term autonomy. Towards this direction, a novel cognitive architecture for multi-agent LfD robotic learning is introduced in this paper, targeting to enable the reliable deployment of open, scalable and expandable robotic systems in large-scale and complex environments. In particular, the designed architecture capitalizes on the recent advances in the Artificial Intelligence (AI) (and especially the Deep Learning (DL)) field, by establishing a Federated Learning (FL)-based framework for incarnating a multi-human multi-robot collaborative learning environment. The fundamental conceptualization relies on employing multiple AI-empowered cognitive processes (implementing various robotic tasks) that operate at the edge nodes of a network of robotic platforms, while global AI models (underpinning the aforementioned robotic tasks) are collectively created and shared among the network, by elegantly combining information from a large number of human-robot interaction instances. Pivotal novelties of the designed cognitive architecture include: a) it introduces a new FL-based formalism that extends the conventional LfD learning paradigm to support large-scale multi-agent operational settings, b) it elaborates previous FL-based self-learning robotic schemes so as to incorporate the human in the learning loop, and c) it consolidates the fundamental principles of FL with additional sophisticated AI-enabled learning methodologies for modelling the multi-level inter-dependencies among the robotic tasks. The applicability of the proposed framework is explained using an example of a real-world industrial case study for agile production-based Critical Raw Materials (CRM) recovery.
In order to understand the formation of social conventions we need to know the specific role of control and learning in multi-agent systems. To advance in this direction, we propose, within the framework of the Distributed Adaptive Control (DAC) theory, a novel Control-based Reinforcement Learning architecture (CRL) that can account for the acquisition of social conventions in multi-agent populations that are solving a benchmark social decision-making problem. Our new CRL architecture, as a concrete realization of DAC multi-agent theory, implements a low-level sensorimotor control loop handling the agent's reactive behaviors (pre-wired reflexes), along with a layer based on model-free reinforcement learning that maximizes long-term reward. We apply CRL in a multi-agent game-theoretic task in which coordination must be achieved in order to find an optimal solution. We show that our CRL architecture is able to both find optimal solutions in discrete and continuous time and reproduce human experimental data on standard game-theoretic metrics such as efficiency in acquiring rewards, fairness in reward distribution and stability of convention formation.