Evaluating Voice Command Pipelines for Drone Control: From STT and LLM to Direct Classification and Siamese Networks

Simões, Lucca Emmanuel Pineli; Rodrigues, Lucas Brandão; Silva, Rafaela Mota; da Silva, Gustavo Rodrigues

Jul-10-2024–arXiv.org Artificial Intelligence

The integration of automation and voice control in drone systems has received significant attention in recent research, driven by the need for more intuitive and efficient human-machine interaction [4, 1]. This project develops a voice command system for the Tello drone that uses speech recognition and deep learning models to translate spoken commands into specific drone actions. The primary challenge is making that translation both accurate and efficient, which is particularly crucial in scenarios where traditional control interfaces are impractical or where operators require hands-free operation [10, 5]. To address this challenge, we developed and evaluated three distinct pipelines. The first uses a traditional Speech-to-Text (STT) model followed by a Large Language Model (LLM) for command interpretation [11]. The second uses a direct mapping model that predicts drone commands from audio inputs without intermediate text conversion. The third employs a Siamese neural network to generalize to new commands by comparing audio inputs to pre-trained examples [8]. Each pipeline was designed to balance performance, flexibility, and ease of maintenance.
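The comparison step in the third pipeline can be illustrated with a minimal sketch. It assumes a trained Siamese encoder (not shown here) has already mapped each audio clip to a fixed-length embedding, and that a new clip is assigned the command of its most similar reference example. The use of cosine similarity, the 0.8 rejection threshold, the command labels, and the three-dimensional toy vectors are all illustrative assumptions, not details taken from the paper.

```python
import math
from typing import Dict, List, Optional, Sequence, Tuple


def cosine_similarity(a: Sequence[float], b: Sequence[float]) -> float:
    """Cosine similarity between two equal-length vectors (0.0 if either is all zeros)."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0.0 or norm_b == 0.0:
        return 0.0
    return dot / (norm_a * norm_b)


def match_command(
    query: Sequence[float],
    references: Dict[str, List[Sequence[float]]],
    threshold: float = 0.8,  # assumed rejection threshold, not from the paper
) -> Tuple[Optional[str], float]:
    """Return the label of the most similar reference embedding.

    If no reference reaches the threshold, the label is None so that an
    unrecognized utterance does not trigger a drone action.
    """
    best_label: Optional[str] = None
    best_score = -1.0
    for label, examples in references.items():
        for example in examples:
            score = cosine_similarity(query, example)
            if score > best_score:
                best_label, best_score = label, score
    if best_score < threshold:
        return None, best_score
    return best_label, best_score


# Hypothetical reference embeddings; in practice these would come from
# running the encoder on a few recorded examples of each command.
REFERENCES: Dict[str, List[Sequence[float]]] = {
    "takeoff": [[1.0, 0.0, 0.0]],
    "land": [[0.0, 1.0, 0.0]],
}

label, score = match_command([0.9, 0.1, 0.0], REFERENCES)
print(label, round(score, 3))
```

Under these assumptions, supporting a new command only requires adding reference embeddings for it to `REFERENCES`; the encoder itself is not retrained, which is the kind of flexibility the Siamese approach is meant to provide.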



