AITopics | Myshlyaev, Artyom

Collaborating Authors

Myshlyaev, Artyom

CognitiveDrone: A VLA Model and Evaluation Benchmark for Real-Time Cognitive Task Solving and Reasoning in UAVs

Lykov, Artem, Serpiva, Valerii, Khan, Muhammad Haris, Sautenkov, Oleg, Myshlyaev, Artyom, Tadevosyan, Grik, Yaqoot, Yasheerah, Tsetserukou, Dzmitry

arXiv.org Artificial Intelligence, Mar-3-2025

This paper introduces CognitiveDrone, a novel Vision-Language-Action (VLA) model tailored for complex Unmanned Aerial Vehicle (UAV) tasks that demand advanced cognitive abilities. Trained on a dataset comprising over 8,000 simulated flight trajectories across three key categories (Human Recognition, Symbol Understanding, and Reasoning), the model generates real-time 4D action commands based on first-person visual inputs and textual instructions. To further enhance performance in intricate scenarios, we propose CognitiveDrone-R1, which integrates an additional Vision-Language Model (VLM) reasoning module to simplify task directives prior to high-frequency control. Experimental evaluations using our open-source benchmark, CognitiveDroneBench, reveal that while a racing-oriented model (RaceVLA) achieves an overall success rate of 31.3%, the base CognitiveDrone model reaches 59.6%, and CognitiveDrone-R1 attains a success rate of 77.2%. These results demonstrate improvements of up to 30% in critical cognitive tasks, underscoring the effectiveness of incorporating advanced reasoning capabilities into UAV control systems. Our contributions include the development of a state-of-the-art VLA model for UAV control and the introduction of the first dedicated benchmark for assessing cognitive tasks in drone operations.

available, large language model, machine learning, (17 more...)

arXiv: 2503.01378

Country: Europe > Netherlands (0.14)

Genre: Research Report > New Finding (0.34)

Industry:

Information Technology > Robotics & Automation (0.48)
Transportation > Air (0.34)
Health & Medicine > Therapeutic Area > Psychiatry/Psychology (0.34)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning (1.00)
Information Technology > Artificial Intelligence > Natural Language > Large Language Model (0.69)
Information Technology > Artificial Intelligence > Robots > Autonomous Vehicles > Drones (0.67)

Evolution 6.0: Evolving Robotic Capabilities Through Generative Design

Khan, Muhammad Haris, Myshlyaev, Artyom, Lykov, Artem, Cabrera, Miguel Altamirano, Tsetserukou, Dzmitry

arXiv.org Artificial Intelligence, Feb-25-2025

We propose a new concept, Evolution 6.0, which represents the evolution of robotics driven by Generative AI. When a robot lacks the necessary tools to accomplish a task requested by a human, it autonomously designs the required instruments and learns how to use them to achieve the goal. Evolution 6.0 is an autonomous robotic system powered by Vision-Language Models (VLMs), Vision-Language-Action (VLA) models, and Text-to-3D generative models for tool design and task execution. The system comprises two key modules: the Tool Generation Module, which fabricates task-specific tools from visual and textual data, and the Action Generation Module, which converts natural language instructions into robotic actions. It integrates QwenVLM for environmental understanding, OpenVLA for task execution, and Llama-Mesh for 3D tool generation. Evaluation results demonstrate a 90% success rate for tool generation with a 10-second inference time. Action generation achieves 83.5% in physical and visual generalization, 70% in motion generalization, and 37% in semantic generalization. Future improvements will focus on bimanual manipulation, expanded task capabilities, and enhanced environmental interpretation to improve real-world adaptability.

artificial intelligence, evolving robotic capability, natural language, (2 more...)

arXiv: 2502.17034

Genre: Research Report (0.69)

Technology:

Information Technology > Artificial Intelligence > Robots (1.00)
Information Technology > Artificial Intelligence > Natural Language > Generation (0.53)