throttle
Towards Task-Oriented Flying: Framework, Infrastructure, and Principles
Huang, Kangyao, Wang, Hao, Chen, Jingyu, Chen, Jintao, Luo, Yu, Guo, Di, Zhang, Xiangkui, Ji, Xiangyang, Liu, Huaping
Deploying robot learning methods to aerial robots in unstructured environments remains both challenging and promising. While recent advances in deep reinforcement learning (DRL) have enabled end-to-end flight control, the field still lacks systematic design guidelines and a unified infrastructure to support reproducible training and real-world deployment. We present a task-oriented framework for end-to-end DRL in quadrotors that integrates design principles for complex task specification and reveals the interdependencies among simulated task definition, training design principles, and physical deployment. Our framework involves software infrastructure, hardware platforms, and open-source firmware to support a full-stack learning infrastructure and workflow. Extensive empirical results demonstrate robust flight and sim-to-real generalization under real-world disturbances. By reducing the entry barrier for deploying learning-based controllers on aerial robots, our work lays a practical foundation for advancing autonomous flight in dynamic and unstructured environments.
- Asia > Middle East > Republic of Türkiye > Karaman Province > Karaman (0.04)
- Asia > China > Liaoning Province > Dalian (0.04)
- Asia > China > Beijing > Beijing (0.04)
DroneAudioset: An Audio Dataset for Drone-based Search and Rescue
Gupta, Chitralekha, Ramesh, Soundarya, Sasikumar, Praveen, Yeo, Kian Peen, Nanayakkara, Suranga
Unmanned Aerial Vehicles (UAVs), or drones, are increasingly used in search and rescue missions to detect human presence. Existing systems primarily leverage vision-based methods, which are prone to fail under low visibility or occlusion. Drone-based audio perception offers promise but suffers from extreme ego-noise that masks sounds indicating human presence. Existing datasets are either limited in diversity or synthetic, lacking real acoustic interactions, and there are no standardized setups for drone audition. To this end, we present DroneAudioset, a comprehensive drone audition dataset featuring 23.5 hours of annotated recordings, covering a wide range of signal-to-noise ratios (SNRs) from -57.2 dB to -2.5 dB, across various drone types, throttles, microphone configurations, and environments. The dataset is publicly available at https://huggingface.co/datasets/ahlab-drone-project/DroneAudioSet/ under the MIT license. The dataset enables development and systematic evaluation of noise suppression and classification methods for human-presence detection under challenging conditions, while also informing practical design considerations for drone audition systems, such as microphone placement trade-offs and the development of drone noise-aware audio processing. This dataset is an important step towards enabling the design and deployment of drone-audition systems.
- Law (1.00)
- Information Technology > Security & Privacy (1.00)
- Government (1.00)
- Health & Medicine (0.87)
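To make the SNR range quoted in the DroneAudioset abstract concrete, the following is a minimal sketch of the standard power-ratio definition of SNR in decibels. The function name `snr_db` and the toy sample values are illustrative assumptions; the dataset authors' own measurement procedure is not described in this listing and may differ.

```python
import math


def snr_db(signal, noise):
    """Signal-to-noise ratio in dB from two equal-rate sample sequences.

    Hypothetical helper for illustration only: it uses the textbook
    definition 10 * log10(P_signal / P_noise) with mean-square power.
    """
    p_signal = sum(s * s for s in signal) / len(signal)
    p_noise = sum(n * n for n in noise) / len(noise)
    return 10.0 * math.log10(p_signal / p_noise)


# Toy example: the noise amplitude is 10x the signal amplitude,
# so the noise power is 100x the signal power.
quiet_voice = [1.0, -1.0, 1.0, -1.0]
rotor_noise = [10.0, -10.0, 10.0, -10.0]
print(snr_db(quiet_voice, rotor_noise))
```

A negative value means the noise carries more power than the target sound, which is the regime the abstract describes for drone ego-noise.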
See, Point, Fly: A Learning-Free VLM Framework for Universal Unmanned Aerial Navigation
Hu, Chih Yao, Lin, Yang-Sen, Lee, Yuna, Su, Chih-Hai, Lee, Jie-Ying, Tsai, Shr-Ruei, Lin, Chin-Yang, Chen, Kuan-Wen, Ke, Tsung-Wei, Liu, Yu-Lun
We present See, Point, Fly (SPF), a training-free aerial vision-and-language navigation (AVLN) framework built atop vision-language models (VLMs). SPF is capable of navigating to any goal based on any type of free-form instructions in any kind of environment. In contrast to existing VLM-based approaches that treat action prediction as a text generation task, our key insight is to consider action prediction for AVLN as a 2D spatial grounding task. SPF harnesses VLMs to decompose vague language instructions into iterative annotation of 2D waypoints on the input image. Along with the predicted traveling distance, SPF transforms predicted 2D waypoints into 3D displacement vectors as action commands for UAVs. Moreover, SPF also adaptively adjusts the traveling distance to facilitate more efficient navigation. Notably, SPF performs navigation in a closed-loop control manner, enabling UAVs to follow dynamic targets in dynamic environments. SPF sets a new state of the art on the DRL simulation benchmark, outperforming the previous best method by an absolute margin of 63%. In extensive real-world evaluations, SPF outperforms strong baselines by a large margin. We also conduct comprehensive ablation studies to highlight the effectiveness of our design choices. Lastly, SPF shows remarkable generalization to different VLMs. Project page: https://spf-web.pages.dev
- Information Technology > Robotics & Automation (0.68)
- Transportation > Air (0.46)
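The See, Point, Fly abstract describes turning a predicted 2D waypoint plus a travel distance into a 3D displacement vector. One common way to do this is pinhole-camera unprojection, sketched below. This is an assumption for illustration, not SPF's published implementation: the function name, the camera-frame convention (x right, y down, z forward), and the field-of-view parameter are all hypothetical.

```python
import math


def waypoint_to_displacement(u, v, width, height, hfov_deg, distance):
    """Unproject pixel (u, v) and scale the ray to `distance` metres.

    Assumes an ideal pinhole camera with the principal point at the image
    centre. Returns (right, down, forward) in the camera frame.
    """
    # Focal length in pixels, derived from the horizontal field of view.
    focal = (width / 2.0) / math.tan(math.radians(hfov_deg) / 2.0)
    # Direction of the ray through the pixel, relative to the image centre.
    x = u - width / 2.0
    y = v - height / 2.0
    z = focal
    norm = math.sqrt(x * x + y * y + z * z)
    # Unit ray scaled by the predicted travel distance.
    return (distance * x / norm, distance * y / norm, distance * z / norm)


# A waypoint at the image centre maps to pure forward motion.
print(waypoint_to_displacement(320, 240, 640, 480, 90.0, 2.0))
```

Re-running this conversion on every new frame gives closed-loop behaviour of the kind the abstract mentions, since each command is recomputed from the latest image.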
Minimalistic Autonomous Stack for High-Speed Time-Trial Racing
Ali, Mahmoud, Jardali, Hassan, Yu, Youwei, Pushp, Durgakant, Liu, Lantao
Autonomous racing has seen significant advancements, driven by competitions such as the Indy Autonomous Challenge (IAC) and the Abu Dhabi Autonomous Racing League (A2RL). However, developing an autonomous racing stack for a full-scale car is often constrained by limited access to dedicated test tracks, restricting opportunities for real-world validation. While previous work typically requires extended development cycles and significant track time, this paper introduces a minimalistic autonomous racing stack for high-speed time-trial racing that emphasizes rapid deployment and efficient system integration with minimal on-track testing. The proposed stack was validated on real speedways, achieving a top speed of 206 km/h after just 11 hours of practice runs on the track, covering 325 km in total. Additionally, we present the system performance analysis, including tracking accuracy, vehicle dynamics, and safety considerations, offering insights for teams seeking to rapidly develop and deploy an autonomous racing stack with limited track access.
- Asia > Middle East > UAE > Abu Dhabi Emirate > Abu Dhabi (0.24)
- North America > United States > Indiana > Monroe County > Bloomington (0.04)
- North America > United States > Indiana > Marion County > Indianapolis (0.04)
SimCoachCorpus: A naturalistic dataset with language and trajectories for embodied teaching
Sumner, Emily, Gopinath, Deepak E., Dees, Laporsha, Gomez, Patricio Reyes, Cui, Xiongyi, Silva, Andrew, Costa, Jean, Morgan, Allison, Schrum, Mariah, Chen, Tiffany L., Balachandran, Avinash, Rosman, Guy
Curated datasets are essential for training and evaluating AI approaches, but are often lacking in domains where language and physical action are deeply intertwined. In particular, few datasets capture how people acquire embodied skills through verbal instruction over time. To address this gap, we introduce SimCoachCorpus: a unique dataset of race car simulator driving that allows for the investigation of rich interactive phenomena during guided and unguided motor skill acquisition. In this dataset, 29 humans were asked to drive in a simulator around a race track for approximately ninety minutes. Fifteen participants were given personalized one-on-one instruction from a professional performance driving coach, and 14 participants drove without coaching. SimCoachCorpus includes embodied features such as vehicle state and inputs, map (track boundaries and raceline), and cone landmarks. These are synchronized with concurrent verbal coaching from a professional coach and additional feedback at the end of each lap. We further provide annotations of coaching categories for each concurrent feedback utterance, ratings on students' compliance with coaching advice, and self-reported cognitive load and emotional state of participants (gathered from surveys during the study). The dataset includes over 20,000 concurrent feedback utterances, over 400 terminal feedback utterances, and over 40 hours of vehicle driving data. Our naturalistic dataset can be used for investigating motor learning dynamics, exploring linguistic phenomena, and training computational models of teaching. We demonstrate applications of this dataset for in-context learning, imitation learning, and topic modeling. The dataset introduced in this work will be released publicly upon publication of the peer-reviewed version of this paper. Researchers interested in early access may register at https://tinyurl.com/SimCoachCorpusForm.
- North America > United States > Massachusetts > Middlesex County > Cambridge (0.04)
- North America > United States > California > Santa Clara County > Los Altos (0.04)
- South America > Uruguay > Maldonado > Maldonado (0.04)
- Research Report > Experimental Study (1.00)
- Overview (1.00)
- Leisure & Entertainment > Sports > Motorsports (1.00)
- Education > Educational Setting (1.00)
- Information Technology > Artificial Intelligence > Robots (1.00)
- Information Technology > Artificial Intelligence > Natural Language (1.00)
- Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (0.67)
- Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.46)
Large Language Models as Autonomous Spacecraft Operators in Kerbal Space Program
Carrasco, Alejandro, Rodriguez-Fernandez, Victor, Linares, Richard
Recent trends are emerging in the use of Large Language Models (LLMs) as autonomous agents that take actions based on the content of the user text prompts. We intend to apply these concepts to the field of control in space, enabling LLMs to play a significant role in the decision-making process for autonomous satellite operations. As a first step towards this goal, we have developed a pure LLM-based solution for the Kerbal Space Program Differential Games (KSPDG) challenge, a public software design competition where participants create autonomous agents for maneuvering satellites involved in non-cooperative space operations, running on the KSP game engine. Our approach leverages prompt engineering, few-shot prompting, and fine-tuning techniques to create an effective LLM-based agent that ranked 2nd in the competition. To the best of our knowledge, this work pioneers the integration of LLM agents into space research. The project comprises several open repositories to facilitate replication and further research. The codebase is accessible on GitHub (https://github.com/ARCLab-MIT/kspdg), while the trained models and datasets are available on Hugging Face (https://huggingface.co/OhhTuRnz). Additionally, experiment tracking and detailed results can be reviewed on Weights & Biases (https://wandb.ai/carrusk/huggingface).
- Europe > Spain > Galicia > Madrid (0.05)
- North America > United States > Massachusetts > Middlesex County > Cambridge (0.04)
- North America > United States > California > Santa Clara County > Palo Alto (0.04)
- Asia > Thailand > Bangkok > Bangkok (0.04)
- Leisure & Entertainment > Games > Computer Games (0.48)
- Information Technology (0.48)
- Government (0.46)
Action Space Reduction Strategies for Reinforcement Learning in Autonomous Driving
Delavari, Elahe, Khanzada, Feeza Khan, Kwon, Jaerock
Reinforcement Learning (RL) offers a promising framework for autonomous driving by enabling agents to learn control policies through interaction with environments. However, large and high-dimensional action spaces, often used to support fine-grained control, can impede training efficiency and increase exploration costs. In this study, we introduce and evaluate two novel structured action space modification strategies for RL in autonomous driving: dynamic masking and relative action space reduction. These approaches are systematically compared against fixed reduction schemes and full action space baselines to assess their impact on policy learning and performance. Our framework leverages a multimodal Proximal Policy Optimization agent that processes both semantic image sequences and scalar vehicle states. The proposed dynamic and relative strategies incorporate real-time action masking based on context and state transitions, preserving action consistency while eliminating invalid or suboptimal choices. Through comprehensive experiments across diverse driving routes, we show that action space reduction significantly improves training stability and policy performance. The dynamic and relative schemes, in particular, achieve a favorable balance between learning speed, control precision, and generalization. The development of Autonomous Vehicles (AVs) has accelerated in recent years, offering the potential to improve road safety, reduce traffic congestion, and enhance mobility. However, building reliable and efficient self-driving systems remains a formidable challenge due to the complexity of real-world driving. These environments involve dynamic interactions with multiple agents, unpredictable traffic behaviors, and rare but critical edge cases that demand robust decision-making.
- North America > United States > Michigan > Wayne County > Dearborn (0.14)
- South America > Chile > Santiago Metropolitan Region > Santiago Province > Santiago (0.04)
- Transportation > Ground > Road (1.00)
- Automobiles & Trucks (1.00)
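The action-space-reduction abstract relies on masking invalid actions before the policy samples from a discrete action set. A generic way to do this is a masked softmax, sketched below in plain Python. This is a minimal illustration of the general technique, not the paper's dynamic or relative scheme: the function name and the example mask are assumptions, and how the mask is derived from driving context is not specified in this listing.

```python
import math


def masked_softmax(logits, valid_mask):
    """Softmax over `logits` with masked-out actions given probability 0.

    `valid_mask[i]` is True when action i is currently allowed.
    Hypothetical helper for illustration only.
    """
    if not any(valid_mask):
        raise ValueError("at least one action must be valid")
    # Invalid actions get a logit of -inf so that exp() maps them to 0.0.
    masked = [l if ok else -math.inf for l, ok in zip(logits, valid_mask)]
    peak = max(masked)  # subtract the max for numerical stability
    exps = [math.exp(m - peak) for m in masked]
    total = sum(exps)
    return [e / total for e in exps]


# Three discrete actions; the second is masked out for this state.
probs = masked_softmax([1.0, 2.0, 3.0], [True, False, True])
print(probs)
```

Because the masked entries receive exactly zero probability, the agent never samples them, and the remaining probabilities are renormalized over the valid actions only.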
A Small-Scale Robot for Autonomous Driving: Design, Challenges, and Best Practices
Maghsoumi, Hossein, Fallah, Yaser
Small-scale autonomous vehicle platforms provide a cost-effective environment for developing and testing advanced driving systems. However, specific configurations within this scale are underrepresented, limiting full awareness of their potential. This paper focuses on a one-sixth-scale setup, offering a high-level overview of its design, hardware and software integration, and typical challenges encountered during development. We discuss methods for addressing mechanical and electronic issues common to this scale and propose guidelines for improving reliability and performance. By sharing these insights, we aim to expand the utility of small-scale vehicles for testing autonomous driving algorithms and to encourage further research in this domain.
- North America > United States > Florida > Orange County > Orlando (0.04)
- Asia > Middle East > Israel (0.04)
- Automobiles & Trucks (1.00)
- Transportation > Ground > Road (0.63)
- Information Technology > Robotics & Automation (0.63)
The 2025 Chevrolet Corvette ZR1 is a stunning piece of engineering
At 95 degrees, the heat rising off the track at the Circuit of the Americas in Austin, Texas, makes it impossible to see the 40-mph left turn at the end of the 170-mph straight before you need to brake for the turn. This makes every lap a leap of faith of sorts as you brake at the appointed spot and pray to Brembo, the patron saint of deceleration, that you'll slow in time to make the turn you know is coming but cannot see clearly through shimmering heat waves. The Brembo-supplied carbon ceramic brakes feature six-piston monobloc front calipers gripping 15.7-inch rotors and four-piston monobloc rear calipers squeezing 15.4-inch rotors. Pounding around COTA for lap after lap, the brakes continue to deliver, with no fade or hair-raising long pedal as exhibited by the Aston Martin Vantage during last year's track test.
- Transportation > Passenger (1.00)
- Transportation > Ground > Road (1.00)
- Automobiles & Trucks > Manufacturer (1.00)
- Leisure & Entertainment > Sports > Motorsports > Formula One (0.35)
MOMAV: A highly symmetrical fully-actuated multirotor drone using optimizing control allocation
MOMAV (Marco's Omnidirectional Micro Aerial Vehicle) is a multirotor drone that is fully actuated, meaning it can control its orientation independently of its position. MOMAV is also highly symmetrical, making its flight efficiency largely unaffected by its current orientation. These characteristics are achieved by a novel drone design where six rotor arms align with the vertices of an octahedron, and where each arm can actively rotate along its long axis. Various standout features of MOMAV are presented: the high flight efficiency compared to the arm configurations of other fully-actuated drones, the design of an original rotating arm assembly featuring slip-rings used to enable continuous arm rotation, and a novel control allocation algorithm based on sequential quadratic programming (SQP) used to calculate throttle and arm-angle setpoints in flight. Flight tests have shown that MOMAV is able to achieve remarkably low mean position/orientation errors of 6.6 mm, 2.1° (σ: 3.0 mm, 1.0°) when sweeping position setpoints, and 11.8 mm, 3.3° (σ: 8.6 mm, 2.0°) when sweeping orientation setpoints.
- North America > United States (0.04)
- Europe > Switzerland (0.04)
- Transportation > Air (1.00)
- Aerospace & Defense > Aircraft (0.89)