CognitiveOS: Large Multimodal Model based System to Endow Any Type of Robot with Generative AI

Artem Lykov, Mikhail Konenkov, Koffivi Fidèle Gbagbe, Mikhail Litvinov, Robinroy Peter, Denis Davletshin, Aleksey Fedoseev, Oleg Kobzarev, Ali Alabbas, Oussama Alyounes, Miguel Altamirano Cabrera, Dzmitry Tsetserukou

arXiv.org Artificial Intelligence 

In cognitive robotics, the scientific community has recognized the high generalization capability of large language models (LLMs) as key to developing a robot that can perform new tasks based on generalized knowledge derived from familiar actions expressed in natural language. However, efforts to apply LLMs in robotics have faced challenges, particularly in understanding and processing the external world. Previous attempts to convey the model's understanding of the world through text-only approaches [1], [20], [8] struggled with ambiguities and the assumption that objects remain static unless interacted with. The introduction of multimodal transformer-based models capable of processing images, such as GPT-4 [16] and Gemini [18], opened up new possibilities for robotics [5], allowing robots to comprehend their environment and enhancing their 'Embodied Experience' [15]. Cognitive robots have been developed on various platforms, ranging from mobile manipulators [5], [3] to bio-inspired humanoid robots [21] and quadrupedal robots [6]. In the latter, cognitive abilities were developed using an 'Inner Monologue' approach [10], with improvements inspired by the 'AutoGen' concept [25]. The robot's cognition is facilitated through internal communication between agent models, leveraging their individual strengths to provide different cognitive capabilities to the system.
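The internal communication between agent models described above can be sketched as a simple message-passing loop. This is a minimal illustrative sketch, not the paper's actual implementation: the agent names, roles, message structure, and routing order are all assumptions, and a real system would query an LLM or VLM where the stub reply is generated.

```python
from dataclasses import dataclass, field

@dataclass
class Message:
    sender: str
    content: str

@dataclass
class Agent:
    # Hypothetical agent wrapper; 'role' stands in for a specialized model
    # (e.g. perception, planning, execution) in the multi-agent system.
    name: str
    role: str
    log: list = field(default_factory=list)

    def respond(self, msg: Message) -> Message:
        # A real agent would call an LLM/VLM here; we return a stub reply
        # so the routing logic itself is runnable and testable.
        self.log.append(msg)
        return Message(self.name, f"{self.role} processed: {msg.content}")

def inner_monologue(agents, task, rounds=1):
    """Pass the evolving internal dialogue through each agent in turn."""
    transcript = [Message("user", task)]
    for _ in range(rounds):
        for agent in agents:
            transcript.append(agent.respond(transcript[-1]))
    return transcript

agents = [
    Agent("eyes", "perception"),
    Agent("brain", "planner"),
    Agent("body", "executor"),
]
transcript = inner_monologue(agents, "pick up the red cube")
```

The design choice here mirrors the 'Inner Monologue' idea of a shared evolving dialogue: each specialized agent reads the latest message and appends its contribution, so the transcript accumulates the system's reasoning across cognitive roles.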