Gbagbe, Koffivi Fidèle
Robots Can Feel: LLM-based Framework for Robot Ethical Reasoning
Lykov, Artem, Cabrera, Miguel Altamirano, Gbagbe, Koffivi Fidèle, Tsetserukou, Dzmitry
This paper presents the development of a novel ethical reasoning framework for robots. "Robots Can Feel" is the first system for robots that combines logic with human-like emotion simulation to make decisions in morally complex situations, akin to humans. The key feature of the approach is the Emotion Weight Coefficient, a customizable parameter that assigns the role of emotions in robot decision-making. The system aims to serve as a tool that can equip robots of any form and purpose with ethical behavior close to human standards. The system is independent of both the robot platform and the choice of the base model. During the evaluation, the system was tested on eight state-of-the-art Large Language Models (LLMs), including both commercial and open-source models developed by companies in various countries. The research demonstrated that, regardless of the model choice, the Emotion Weight Coefficient influences the robot's decisions in a similar way. According to ANOVA analysis, different Emotion Weight Coefficients significantly influenced the final decision across a range of situations, such as a request for a dietary violation (F(4, 35) = 11.2, p = 0.0001) and an animal compassion scenario (F(4, 35) = 8.5441, p = 0.0001). A demonstration code repository is provided at: https://github.com/TemaLykov/robots_can_feel
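The reported statistics are consistent with a one-way ANOVA over five Emotion Weight Coefficient levels with eight model observations per level, which yields the F(4, 35) degrees of freedom quoted above. The sketch below is a rough, hypothetical illustration of how such a comparison could be run; the coefficient grid and the decision scores are made-up placeholders, not the authors' data or code.

```python
# Hypothetical sketch: one-way ANOVA over robot decision scores collected at
# five Emotion Weight Coefficient levels, one score per tested LLM
# (8 models x 5 levels -> F(4, 35)). All numbers below are placeholders.
import numpy as np
from scipy.stats import f_oneway

rng = np.random.default_rng(0)
coefficient_levels = [0.0, 0.25, 0.5, 0.75, 1.0]  # assumed grid of Emotion Weight Coefficients

# One decision score (e.g. willingness to grant a dietary-violation request,
# scaled 0..1) per LLM at each coefficient level; 8 models per level.
scores_per_level = [rng.normal(loc=0.8 - 0.6 * w, scale=0.1, size=8)
                    for w in coefficient_levels]

f_stat, p_value = f_oneway(*scores_per_level)
df_between = len(coefficient_levels) - 1
df_within = sum(len(s) for s in scores_per_level) - len(coefficient_levels)
print(f"F({df_between}, {df_within}) = {f_stat:.2f}, p = {p_value:.4f}")
```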
Bi-VLA: Vision-Language-Action Model-Based System for Bimanual Robotic Dexterous Manipulations
Gbagbe, Koffivi Fidèle, Cabrera, Miguel Altamirano, Alabbas, Ali, Alyunes, Oussama, Lykov, Artem, Tsetserukou, Dzmitry
Abstract-- This research introduces the Bi-VLA (Vision-Language-Action) model, a novel system designed for bimanual robotic dexterous manipulation that seamlessly integrates vision for scene understanding, language comprehension for translating human instructions into executable code, and physical action generation. We evaluated the system's functionality through a series of household tasks, including the preparation of a desired salad upon human request. Bi-VLA demonstrates the ability to interpret complex human instructions, perceive and understand the visual context of ingredients, and execute precise bimanual actions to prepare the requested salad. We assessed the system's performance in terms of accuracy, efficiency, and adaptability to different salad recipes and human preferences through a series of experiments. Our results show a 100% success rate in generating correct executable code by the Language Module, a 96.06% success rate in detecting specific ingredients by the Vision Module, and an overall success rate of 83.4% in executing user-requested tasks. Recent advancements in language models have significantly impacted Human-Robot Interaction (HRI); however, despite their potential, the application of language models to synthesizing the bimanual skills of robots has not received significant attention.
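The abstract describes a three-stage pipeline: a Vision Module that grounds the scene, a Language Module that turns the human request into executable robot code, and an action stage that runs that code on the bimanual setup. The following is a minimal sketch of that flow under assumed, illustrative interfaces (the function names, the toy detector, and the toy "robot code" are not the paper's API).

```python
# Illustrative Bi-VLA-style pipeline sketch; all names and stubs are assumptions.

def detect_ingredients(image) -> list[str]:
    """Vision Module: locate the ingredients visible in the workspace image.
    A canned result stands in for an open-vocabulary detector."""
    return ["tomato", "cucumber", "lettuce"]

def instruction_to_code(instruction: str, ingredients: list[str]) -> str:
    """Language Module: translate the human request into executable robot code,
    grounded in the detected ingredients. A template stands in for an LLM call."""
    steps = [f"robot.pick_and_slice({item!r})" for item in ingredients]
    steps.append("robot.mix_and_serve()")
    return "\n".join(steps)

class ToyBimanualRobot:
    """Action stage: stand-in for the bimanual controller."""
    def pick_and_slice(self, item): print(f"slicing {item}")
    def mix_and_serve(self): print("mixing and serving")

if __name__ == "__main__":
    code = instruction_to_code("Please make me a salad", detect_ingredients(image=None))
    exec(code, {"robot": ToyBimanualRobot()})  # execute the generated plan
```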
CognitiveOS: Large Multimodal Model based System to Endow Any Type of Robot with Generative AI
Lykov, Artem, Konenkov, Mikhail, Gbagbe, Koffivi Fidèle, Litvinov, Mikhail, Peter, Robinroy, Davletshin, Denis, Fedoseev, Aleksey, Kobzarev, Oleg, Alabbas, Ali, Alyounes, Oussama, Cabrera, Miguel Altamirano, Tsetserukou, Dzmitry
In cognitive robotics, the scientific community has recognized the high generalization capability of these large models as a key to developing a robot that can perform new tasks based on generalized knowledge derived from familiar actions expressed in natural language. However, efforts to apply LLMs in robotics faced challenges, particularly in understanding and processing the external world. Previous attempts to convey the model's understanding of the world through text-only approaches [1], [20], [8] struggled with ambiguities and with the assumption that objects remain static unless interacted with. The introduction of multi-modal transformer-based models such as GPT-4 [16] and Gemini [18], capable of processing images, opened up new possibilities for robotics [5], allowing robots to comprehend their environment and enhancing their 'Embodied Experience' [15]. Cognitive robots have been developed on various platforms, ranging from mobile manipulators [5], [3] to bio-inspired humanoid robots [21] and quadrupedal robots [6]. In the latter, cognitive abilities were developed using an 'Inner Monologue' approach [10], with improvements inspired by the 'Autogen' concept [25]. The cognition of the robot is facilitated through internal communication between agent models, leveraging their strengths to provide different cognitive capabilities to the system.
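The last sentence describes cognition emerging from agent models exchanging messages internally, in the spirit of an 'Inner Monologue'. Below is a minimal, hypothetical sketch of such an internal loop; the agent roles, the callable interface, and the "PLAN:" convention are assumptions for illustration, not the CognitiveOS API.

```python
# Hypothetical sketch of inner-monologue-style communication between agent models:
# each agent reads the shared transcript and appends its contribution until one
# of them commits to a plan.
from typing import Callable

Agent = Callable[[str], str]  # maps the shared transcript to the agent's next message

def inner_monologue(agents: dict[str, Agent], task: str, max_turns: int = 6) -> str:
    transcript = f"TASK: {task}"
    for _ in range(max_turns):
        for name, agent in agents.items():
            reply = agent(transcript)          # e.g. a call to an LLM/VLM backend
            transcript += f"\n[{name}] {reply}"
            if reply.startswith("PLAN:"):      # a reasoning agent commits to a plan
                return reply
    return transcript  # no plan converged within the turn budget

# Usage with toy agents standing in for perception and reasoning models:
agents = {
    "vision": lambda t: "I see a closed door ahead.",
    "planner": lambda t: "PLAN: approach the door, open it, proceed.",
}
print(inner_monologue(agents, "Leave the room"))
```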