MorphoNavi: Aerial-Ground Robot Navigation with Object Oriented Mapping in Digital Twin

Karaf, Sausar, Martynov, Mikhail, Sautenkov, Oleg, Darush, Zhanibek, Tsetserukou, Dzmitry

Apr-24-2025–arXiv.org Artificial Intelligence

This paper presents a novel mapping approach for a universal aerial-ground robotic system utilizing a single monocular camera. The proposed system is capable of detecting a diverse range of objects and estimating their positions without requiring fine-tuning for specific environments. The system's performance was evaluated through a simulated search-and-rescue scenario, where the MorphoGear robot successfully located a robotic dog while an operator monitored the process. This work contributes to the development of intelligent, multimodal robotic systems capable of operating in unstructured environments.

Robotics has experienced rapid advancements in recent years, with Vision-Language Models (VLMs) emerging as a powerful tool for mission execution based on RGB images. Since VLMs require only an image and a text prompt as input, they eliminate the need for expensive and specialized sensors such as LiDARs and depth cameras. This simplicity and cost-effectiveness suggest that vision-language-based control will play a crucial role in the future of robotics, with cameras becoming the primary sensor for most robotic systems. In this paper, we introduce a novel mapping approach designed for a universal air-ground robotic system using a single monocular camera.

large language model, machine learning, natural language

Country:
- Asia
  - Malaysia > Sarawak
    - Kuching (0.04)
  - Russia (0.04)
- Europe
  - France > Île-de-France
    - Paris > Paris (0.04)
  - Russia > Central Federal District
    - Moscow Oblast > Moscow (0.04)
- North America > United States (0.05)

Genre:
- Research Report (0.50)

Technology:
- Information Technology > Artificial Intelligence
  - Machine Learning > Neural Networks
    - Deep Learning (0.69)
  - Natural Language > Large Language Model (0.70)
  - Robots (1.00)