 Drones


Track Anything Rapter (TAR)

arXiv.org Artificial Intelligence

Object tracking is a fundamental task in computer vision with broad practical applications across various domains, including traffic monitoring, robotics, and autonomous vehicle tracking. In this project, we aim to develop a sophisticated aerial vehicle system known as Track Anything Rapter (TAR), designed to detect, segment, and track objects of interest based on user-provided multimodal queries, such as text, images, and clicks. TAR utilizes cutting-edge pre-trained models like DINO, CLIP, and SAM to estimate the relative pose of the queried object. The tracking problem is approached as a Visual Servoing task, enabling the UAV to consistently focus on the object through advanced motion planning and control algorithms. We showcase how the integration of these foundational models with a custom high-level control algorithm results in a highly stable and precise tracking system deployed on a custom-built PX4 Autopilot-enabled Voxl2 M500 drone. To validate the tracking algorithm's performance, we compare it against Vicon-based ground truth. Additionally, we evaluate the reliability of the foundational models in aiding tracking in scenarios involving occlusions. Finally, we test and validate the model's ability to work seamlessly with multiple modalities, such as click, bounding box, and image templates.
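To make the visual-servoing idea concrete, here is a minimal sketch of an image-based proportional controller that turns a tracked bounding box into velocity commands. It is an illustration only, not the TAR controller: the `Command` fields, the gains, the sign conventions, and the `target_area_frac` setpoint are all assumptions introduced for this example.

```python
from dataclasses import dataclass


@dataclass
class Command:
    yaw_rate: float       # rad/s, positive turns toward the right of the image (assumed convention)
    climb_rate: float     # m/s, positive is up (assumed convention)
    forward_speed: float  # m/s, positive moves toward the object


def servo_command(bbox, image_size, target_area_frac=0.05,
                  k_yaw=1.0, k_climb=0.5, k_fwd=2.0):
    """Map a bounding box (x_min, y_min, x_max, y_max) in pixels to a Command.

    The controller drives three image-space errors to zero:
    horizontal offset, vertical offset, and apparent size.
    All gains are hypothetical and would need tuning on a real vehicle.
    """
    x_min, y_min, x_max, y_max = bbox
    width, height = image_size

    # Box center in pixels.
    cx = (x_min + x_max) / 2.0
    cy = (y_min + y_max) / 2.0

    # Normalized offsets from the image center, each in [-1, 1].
    ex = (cx - width / 2.0) / (width / 2.0)
    ey = (cy - height / 2.0) / (height / 2.0)

    # Fraction of the image covered by the box, used as a crude range proxy.
    area_frac = ((x_max - x_min) * (y_max - y_min)) / (width * height)
    e_area = target_area_frac - area_frac

    return Command(
        yaw_rate=k_yaw * ex,            # object right of center: yaw right
        climb_rate=-k_climb * ey,       # image y grows downward, so negate
        forward_speed=k_fwd * e_area,   # box too small: approach; too large: back off
    )
```

Calling `servo_command((500, 190, 600, 290), (640, 480))` yields a positive yaw rate because the box sits right of the image center. A real system would add rate limits, filtering, and a loss-of-track fallback.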


Model Agnostic Defense against Adversarial Patch Attacks on Object Detection in Unmanned Aerial Vehicles

arXiv.org Artificial Intelligence

Object detection forms a key component in Unmanned Aerial Vehicles (UAVs) for completing high-level tasks that depend on the awareness of objects on the ground from an aerial perspective. In that scenario, adversarial patch attacks on an onboard object detector can severely impair the performance of upstream tasks. This paper proposes a novel model-agnostic defense mechanism against the threat of adversarial patch attacks in the context of UAV-based object detection. We formulate adversarial patch defense as an occlusion removal task. The proposed defense method can neutralize adversarial patches located on objects of interest, without exposure to adversarial patches during training. Our lightweight single-stage defense approach remains model-agnostic: once deployed, it does not need to be updated in response to changes in the object detection pipeline. The evaluations in digital and physical domains show the feasibility of our method for deployment in UAV object detection pipelines, by significantly decreasing the Attack Success Ratio without incurring significant processing costs. As a result, the proposed defense solution can improve the reliability of object detection for UAVs.


Facing Global Outrage, Netanyahu Calls Civilian Deaths in Rafah Strike 'Tragic Accident'

NYT > Middle East

Hamas, in a statement, described the Israeli strike on Rafah as "a horrific war crime" and demanded the "immediate and urgent implementation" of the World Court's decision. The group did not refer to the Israeli military's assertions that two Hamas officials had been killed in the strike. The Israeli military said it had taken a number of steps before the strike to reduce the risk of harm to civilians, including conducting aerial surveillance and using munitions characterized as precise. "Based on these measures, it was assessed that there would be no expected harm to uninvolved civilians," it said. But an Israeli official, speaking on the condition of anonymity to discuss a sensitive matter, said on Monday that an initial investigation by the military had concluded that the strike, or shrapnel from it, may have unexpectedly ignited a flammable substance at the camp.


Cooperative Relative Localization in MAV Swarms with Ultra-wideband Ranging

arXiv.org Artificial Intelligence

Relative localization (RL) is essential for the successful operation of micro air vehicle (MAV) swarms. Achieving accurate 3-D RL in infrastructure-free and GPS-denied environments with only distance information is a challenging problem that has not been satisfactorily solved. In this work, based on the range-based peer-to-peer RL using the ultra-wideband (UWB) ranging technique, we develop a novel UWB-based cooperative relative localization (CRL) solution that integrates the relative motion dynamics of each host-neighbor pair to build a unified dynamic model and takes the distances between the neighbors as "bonus information". Observability analysis using differential geometry shows that the proposed CRL scheme can expand the observable subspace compared to other alternatives using only direct distances between the host agent and its neighbors. In addition, we apply the kernel-induced extended Kalman filter (EKF) to the CRL state estimation problem with the newly designed Logarithmic-Versoria (LV) kernel to tackle heavy-tailed UWB noise. Sufficient conditions for the convergence of the fixed-point iteration involved in the estimation algorithm are also derived. Comparative Monte Carlo simulations demonstrate that the proposed CRL scheme combined with the LV-kernel EKF significantly improves the estimation accuracy owing to its robustness against both measurement outliers and incorrect measurement covariance matrix initialization. Moreover, with the LV kernel, the estimation is still satisfactory when performing the fixed-point iteration only once for reduced computational complexity.


Survey of Graph Neural Network for Internet of Things and NextG Networks

arXiv.org Artificial Intelligence

The exponential increase in Internet of Things (IoT) devices, coupled with 6G pushing towards higher data rates and more connected devices, has sparked a surge in data. Consequently, harnessing the full potential of data-driven machine learning has become one of the important thrusts. In addition to the advancement in wireless technology, it is important to efficiently use the resources available and meet the users' requirements. Graph Neural Networks (GNNs) have emerged as a promising paradigm for effectively modeling and extracting insights from data that inherently exhibit complex network structures, owing to their high performance and accuracy, scalability, adaptability, and resource efficiency. There is a lack of a comprehensive survey that focuses on the applications and advances GNNs have made in the context of IoT and Next Generation (NextG) networks. To bridge that gap, this survey starts by providing a detailed description of GNN terminology, architecture, and the different types of GNNs. Then we provide a comprehensive survey of the advancements in applying GNNs for IoT from the perspective of data fusion and intrusion detection. Thereafter, we survey the impact GNNs have made in improving spectrum awareness. Next, we provide a detailed account of how GNNs have been leveraged for networking and tactical systems. Through this survey, we aim to provide a comprehensive resource for researchers to learn more about GNNs in the context of wireless networks, and understand their state-of-the-art use cases while contrasting them with other machine learning approaches. Finally, we discuss the challenges and a wide range of future research directions to further motivate the use of GNNs for IoT and NextG networks.


Clustering-based Learning for UAV Tracking and Pose Estimation

arXiv.org Artificial Intelligence

UAV tracking and pose estimation play an essential role in various UAV-related missions, such as formation control and anti-UAV measures. Accurately detecting and tracking UAVs in a 3D space remains a particularly challenging problem, as it requires extracting sparse features of micro UAVs from different flight environments and continuously matching correspondences, especially during agile flight. Generally, cameras and LiDARs are the two main types of sensors used to capture UAV trajectories in flight. However, both sensors have limitations in UAV classification and pose estimation. This technical report briefly introduces the method proposed by our team "NTU-ICG" for the CVPR 2024 UG2+ Challenge Track 5. This work develops a clustering-based learning detection approach, CL-Det, for UAV tracking and pose estimation using two types of LiDARs, namely Livox Avia and LiDAR 360. We combine the information from the two data sources to locate drones in 3D. We first align the timestamps of Livox Avia data and LiDAR 360 data and then separate the point cloud of objects of interest (OOIs) from the environment. The point cloud of OOIs is clustered using the DBSCAN method, with the midpoint of the largest cluster assumed to be the UAV position. Furthermore, we utilize historical estimations to fill in missing data. The proposed method shows competitive pose estimation performance and ranks 5th on the final leaderboard of the CVPR 2024 UG2+ Challenge.
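The clustering step described above can be sketched as follows. This is a small pure-Python DBSCAN written for illustration, not the CL-Det code: the function names, the `eps` and `min_pts` values, and the reading of "midpoint" as the centroid (mean) of the largest cluster are assumptions made for this example.

```python
import math


def region_query(points, i, eps):
    """Indices of all points within eps of points[i] (including i itself)."""
    return [j for j, q in enumerate(points) if math.dist(points[i], q) <= eps]


def dbscan(points, eps, min_pts):
    """Return one label per point: a cluster id (0, 1, ...) or -1 for noise."""
    labels = [None] * len(points)  # None means "not visited yet"
    cluster = -1
    for i in range(len(points)):
        if labels[i] is not None:
            continue
        neighbors = region_query(points, i, eps)
        if len(neighbors) < min_pts:
            labels[i] = -1  # provisional noise; may become a border point later
            continue
        cluster += 1
        labels[i] = cluster
        queue = list(neighbors)
        while queue:
            j = queue.pop()
            if labels[j] == -1:
                labels[j] = cluster  # noise reachable from a core point: border point
            if labels[j] is not None:
                continue
            labels[j] = cluster
            j_neighbors = region_query(points, j, eps)
            if len(j_neighbors) >= min_pts:
                queue.extend(j_neighbors)  # j is a core point: keep expanding
    return labels


def largest_cluster_center(points, eps=0.5, min_pts=3):
    """Centroid of the largest DBSCAN cluster, or None if every point is noise."""
    labels = dbscan(points, eps, min_pts)
    clusters = {}
    for p, lab in zip(points, labels):
        if lab != -1:
            clusters.setdefault(lab, []).append(p)
    if not clusters:
        return None
    biggest = max(clusters.values(), key=len)
    n = len(biggest)
    return tuple(sum(p[k] for p in biggest) / n for k in range(3))
```

When `largest_cluster_center` returns `None` for a frame, a caller could fall back to the previous estimate, which is one simple way to fill in missing data from historical estimations. The brute-force neighbor search is quadratic in the number of points, so a spatial index would be preferable for dense point clouds.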


Multi-Modal UAV Detection, Classification and Tracking Algorithm -- Technical Report for CVPR 2024 UG2 Challenge

arXiv.org Artificial Intelligence

This technical report presents the first-place model for UG2+, a task in the CVPR 2024 UAV Tracking and Pose-Estimation Challenge. The challenge involves drone detection, UAV-type classification, and 2D/3D trajectory estimation in extreme weather conditions with multi-modal sensor information, including stereo vision, various Lidars, Radars, and audio arrays. Leveraging this information, we propose a multi-modal UAV detection, classification, and 3D tracking method for accurate UAV classification and tracking. A novel classification pipeline which incorporates sequence fusion, region of interest (ROI) cropping, and keyframe selection is proposed. Our system integrates cutting-edge classification techniques and sophisticated post-processing steps to boost accuracy and robustness. The designed pose estimation pipeline incorporates three modules: dynamic points analysis, a multi-object tracker, and trajectory completion techniques. Extensive experiments have validated the effectiveness and precision of our approach. In addition, we also propose a novel dataset pre-processing method and conduct a comprehensive ablation study for our design. We finally achieved the best performance in classification and tracking on the MMUAD dataset. The code and configuration of our method are available at https://github.com/dtc111111/Multi-Modal-UAV.


Agile Robotics: Optimal Control, Reinforcement Learning, and Differentiable Simulation

arXiv.org Artificial Intelligence

Control systems are at the core of every real-world robot. They are deployed in an ever-increasing number of applications, ranging from autonomous racing and search-and-rescue missions to industrial inspections and space exploration. To achieve peak performance, certain tasks require pushing the robot to its maximum agility. How can we design control algorithms that enhance the agility of autonomous robots and maintain robustness against unforeseen disturbances? This paper addresses this question by leveraging fundamental principles in optimal control, reinforcement learning, and differentiable simulation.


UniSaT: Unified-Objective Belief Model and Planner to Search for and Track Multiple Objects

arXiv.org Artificial Intelligence

The problem of path planning for autonomously searching and tracking multiple objects is important to reconnaissance, surveillance, and many other data-gathering applications. Due to the inherent competing objectives of searching for new objects while maintaining tracks for found objects, most current approaches rely on multi-objective planning methods, leaving it up to the user to tune parameters to balance between the two objectives, usually based on heuristics or trial and error. In this paper, we introduce UniSaT (Unified Search and Track), a unified-objective formulation for the search and track problem based on Random Finite Sets (RFS). This is done by modeling both the unknown and known objects through a combined generalized labeled multi-Bernoulli (GLMB) filter. For the unseen objects, we can leverage both cardinality and spatial prior distributions, which means UniSaT does not rely on knowing the exact count of the expected number of objects in the space. The planner maximizes the mutual information of this unified belief model, creating balanced search and tracking behaviors. We demonstrate our work in a simulated environment and show both qualitative results as well as quantitative improvements over a multi-objective method.
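As a rough illustration of planning by mutual information, the sketch below scores grid cells by how much a single binary detection would reduce uncertainty about whether an object is present. It is a deliberately simplified stand-in, not the GLMB-based belief model from the abstract: the independent per-cell Bernoulli prior, the sensor parameters `p_detect` and `p_false`, and all function names are assumptions made for this example.

```python
import math


def binary_entropy(p):
    """Entropy in bits of a Bernoulli(p) variable."""
    if p <= 0.0 or p >= 1.0:
        return 0.0
    return -p * math.log2(p) - (1.0 - p) * math.log2(1.0 - p)


def mutual_information(prior, p_detect=0.9, p_false=0.1):
    """I(X; Z) in bits for presence X ~ Bernoulli(prior) and a binary detection Z.

    p_detect = P(Z=1 | X=1) and p_false = P(Z=1 | X=0) are hypothetical
    sensor parameters. Uses I(X; Z) = H(Z) - H(Z | X).
    """
    p_z = prior * p_detect + (1.0 - prior) * p_false
    h_z = binary_entropy(p_z)
    h_z_given_x = (prior * binary_entropy(p_detect)
                   + (1.0 - prior) * binary_entropy(p_false))
    return h_z - h_z_given_x


def best_cell(priors, p_detect=0.9, p_false=0.1):
    """Index of the cell whose observation is expected to be most informative."""
    return max(range(len(priors)),
               key=lambda i: mutual_information(priors[i], p_detect, p_false))
```

With this scoring, cells where presence is nearly certain or nearly ruled out are worth little, while uncertain cells score highest. That captures in miniature why one information-theoretic objective can trade off searching against re-observing tracked objects without hand-tuned weights.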


Will these drones 'revolutionize' 911 response? L.A. suburb will be first to test

Los Angeles Times

A black-and-white drone about the size of a sofa cushion took off with a gentle whir at the Hawthorne Police Department earlier this month, hovering and darting back and forth a few times before landing on a podium to a round of applause. A small audience and local TV news crews had gathered to see the unveiling of "Responder," marketed as the first drone built specifically to respond to 911 calls by quickly arriving at scenes, beaming a live video feed and, if necessary, dropping off medical supplies. The company behind the new drone, Seattle-based Brinc -- a tech startup with a 24-year-old chief executive -- has boasted it will "revolutionize the public safety landscape." But law enforcement agencies across Southern California and the country already employ drones for a variety of purposes, including 911 response, and skeptics warn about the risk of "mission creep" when the technology is weaponized or used for surveillance. Some Los Angeles activists have fought to limit police drone use, but Hawthorne's adoption of Brinc's Responder is a sign some local authorities are continuing to embrace unmanned aerial vehicles despite the pushback and price tag.