LogisticsVLN: Vision-Language Navigation For Low-Altitude Terminal Delivery Based on Agentic UAVs

arXiv.org Artificial Intelligence

Xinyuan Zhang, Yonglin Tian, Fei Lin, Yue Liu, Jing Ma, Kornelia Sara Szatmary, Fei-Yue Wang. Abstract: The growing demand for intelligent logistics, particularly fine-grained terminal delivery, underscores the need for autonomous UAV (Unmanned Aerial Vehicle)-based delivery systems. However, most existing last-mile delivery studies rely on ground robots, while current UAV-based Vision-Language Navigation (VLN) tasks primarily focus on coarse-grained, long-range goals, making them unsuitable for precise terminal delivery. To bridge this gap, we propose LogisticsVLN, a scalable aerial delivery system built on multimodal large language models (MLLMs) for autonomous terminal delivery. LogisticsVLN integrates lightweight Large Language Models (LLMs) and Visual-Language Models (VLMs) in a modular pipeline for request understanding, floor localization, object detection, and action-decision making. To support research and evaluation in this new setting, we construct the Vision-Language Delivery (VLD) dataset within the CARLA simulator. In addition, we conduct subtask-level evaluations of each module of our system, offering valuable insights for improving the robustness and real-world deployment of foundation model-based vision-language delivery systems. Introduction: Driven by the rapid growth of e-commerce and urbanization, logistics has become an increasingly critical component of modern society [1]. In particular, there is a growing demand for stable, efficient, and user-centric terminal delivery, This work is partly supported by the Science and Technology Development Fund, Macao SAR (File no. Xinyuan Zhang is with the School of Artificial Intelligence, University of Chinese Academy of Sciences, Beijing 100049, China (e-mail: zhangxinyuan23@mails.ucas.ac.cn).


Abnormality Forecasting: Time Series Anomaly Prediction via Future Context Modeling

arXiv.org Artificial Intelligence

Identifying anomalies from time series data plays an important role in various fields such as infrastructure security, intelligent operation and maintenance, and space exploration. Current research focuses on detecting anomalies after they occur, which can lead to significant financial or reputational loss or infrastructure damage. In this work we instead study a more practical yet very challenging problem, time series anomaly prediction, which aims to provide early warnings of abnormal events before they occur. To tackle this problem, we introduce a novel principled approach, namely future context modeling (FCM). Its key insight is that future abnormal events in a target window can be accurately predicted if the preceding observation window exhibits any subtle difference from normal data. To effectively capture such differences, FCM first leverages long-term forecasting models to generate a discriminative future context based on the observation data, aiming to amplify those subtle but unusual differences. It then models a normality correlation between the observation data and the forecast future context to complement the normality modeling of the observation data in foreseeing possible abnormality in the target window. A joint variate-time attention learning is also introduced in FCM to leverage both temporal signals and features of the time series data for more discriminative normality modeling in the aforementioned two views. Comprehensive experiments on five datasets demonstrate that FCM achieves a good recall rate (70%+) on multiple datasets and significantly outperforms all baselines in F1 score. Code is available at https://github.com/mala-lab/FCM.
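
The observation-window and target-window setup described in this abstract can be illustrated with a small sketch. This is not the FCM code from the linked repository; it only shows, under assumed names (`make_prediction_pairs`, `obs_len`, `target_len` are all hypothetical), how a labeled series might be cut into pairs of an observation window and a flag saying whether the window that follows it contains an anomaly.

```python
from typing import List, Tuple


def make_prediction_pairs(
    series: List[float],
    labels: List[int],
    obs_len: int,
    target_len: int,
) -> List[Tuple[List[float], int]]:
    """Pair each observation window with a flag for the target window after it.

    Hypothetical helper for illustration only: the flag is 1 if any point in
    the target window is labeled anomalous, else 0.
    """
    if len(series) != len(labels):
        raise ValueError("series and labels must have the same length")
    pairs = []
    # Last start index that still leaves room for a full target window.
    last_start = len(series) - obs_len - target_len
    for start in range(last_start + 1):
        obs_end = start + obs_len
        observation = series[start:obs_end]
        target_labels = labels[obs_end:obs_end + target_len]
        pairs.append((observation, int(any(target_labels))))
    return pairs


# Toy data (made up): a single anomalous point at index 5.
series = [0.1, 0.2, 0.1, 0.3, 0.2, 5.0, 0.2, 0.1]
labels = [0, 0, 0, 0, 0, 1, 0, 0]
pairs = make_prediction_pairs(series, labels, obs_len=3, target_len=2)
flags = [flag for _, flag in pairs]
```

A predictor in this setting sees only the observation window, so the anomalous value itself is never part of the input for the windows flagged 1 before it occurs.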


Spreading Factor assisted LoRa Localization with Deep Reinforcement Learning

arXiv.org Artificial Intelligence

Most existing localization solutions rely on RSSI fingerprinting. However, in LoRa networks, the spreading factor (SF) used in the network setting means that traditional fingerprinting may yield a radio map that lacks representativeness, leading to inaccurate position estimates. In this work, we therefore propose a novel LoRa RSSI fingerprinting approach that takes the SF into account. The performance evaluation shows the merit of the proposed approach: it improves localization accuracy by up to 6.67% compared to state-of-the-art methods. The evaluation was done using a fully connected deep neural network (DNN) as the baseline. To further improve the localization accuracy, we propose a deep reinforcement learning model that captures the ever-growing complexity of LoRa networks and copes with their scalability. The obtained results show an improvement of 48.10% in localization accuracy compared to the baseline DNN model.
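
To make the idea of SF-aware fingerprinting concrete, here is a minimal sketch. It deliberately swaps in a plain nearest-neighbor lookup rather than the DNN or deep reinforcement learning models the abstract describes, and every name and number in it (`locate`, `radio_map`, the RSSI values and positions) is a made-up assumption. It only shows one way a radio map could be keyed by spreading factor so that a measurement is compared against fingerprints recorded at the same SF.

```python
import math
from typing import Dict, List, Tuple

# One fingerprint: RSSI readings (dBm) from each gateway, plus the (x, y)
# position in meters where they were recorded. Hypothetical layout.
Fingerprint = Tuple[List[float], Tuple[float, float]]


def locate(
    radio_map: Dict[int, List[Fingerprint]],
    sf: int,
    rssi: List[float],
) -> Tuple[float, float]:
    """Return the position of the closest fingerprint recorded at the same SF."""
    candidates = radio_map.get(sf)
    if not candidates:
        raise KeyError(f"no fingerprints recorded for SF{sf}")
    # Euclidean distance in RSSI space; math.dist requires Python 3.8+.
    best = min(candidates, key=lambda fp: math.dist(fp[0], rssi))
    return best[1]


# Toy radio map (made-up values): two reference points, two gateways, two SFs.
radio_map = {
    7: [([-70.0, -85.0], (0.0, 0.0)), ([-90.0, -60.0], (50.0, 20.0))],
    12: [([-95.0, -100.0], (0.0, 0.0)), ([-110.0, -88.0], (50.0, 20.0))],
}

position_sf7 = locate(radio_map, 7, [-88.0, -63.0])
position_sf12 = locate(radio_map, 12, [-96.0, -99.0])
```

Keying the map by SF is just one possible design; the SF could equally be fed to a learned model as an extra input feature.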


Microsoft's Windows Dev Kit 2023 lets developers tap AI processors on laptops

#artificialintelligence

At its Build conference in May, Microsoft debuted Project Volterra, a device powered by Qualcomm's Snapdragon platform designed to let developers explore "AI scenarios" via Qualcomm's Neural Processing SDK for Windows toolkit. Today, Volterra -- now called Windows Dev Kit 2023 -- officially goes on sale, priced at $599 and available from the Microsoft Store in Australia, Canada, China, France, Germany, Japan, the U.K. and the U.S. Here's how Microsoft describes it: With Windows Dev Kit 2023, developers will be able to bring their entire app development process onto one compact device, giving them everything they need to build Windows apps for Arm, on Arm. As previously announced, the Windows Dev Kit 2023 contains a dedicated AI processor, called the Hexagon processor, complemented by an Arm-based chip -- the Snapdragon 8cx Gen 3 -- both supplied by Qualcomm. It enables developers to build Arm-native and AI-powered apps with tools such as Visual Studio (version 17.4 runs natively on Arm), .NET 7 (which has Arm-specific performance improvements), VSCode, Microsoft Office and Teams, and machine learning frameworks including PyTorch and TensorFlow. The Windows Dev Kit 2023 arrives alongside support in Windows for neural processing units (NPUs), or dedicated chips tailored for AI- and machine learning-specific workloads.


Guided Generative Models using Weak Supervision for Detecting Object Spatial Arrangement in Overhead Images

arXiv.org Artificial Intelligence

The increasing availability and accessibility of overhead images allows us to estimate and assess the spatial arrangement of groups of geospatial target objects, which can benefit many applications, such as traffic monitoring and agricultural monitoring. Spatial arrangement estimation is the process of identifying the areas that contain the desired objects in overhead images. Traditional supervised object detection approaches can estimate spatial arrangement accurately but require large amounts of bounding box annotations. Recent semi-supervised clustering approaches can reduce manual labeling but still require annotations for all object categories in the image. This paper presents the target-guided generative model (TGGM), under the Variational Auto-encoder (VAE) framework, which uses Gaussian Mixture Models (GMM) to estimate the distributions of both hidden and decoder variables in the VAE. Modeling both hidden and decoder variables with GMMs significantly reduces the manual annotations required for spatial arrangement estimation. Unlike existing approaches, in which the training process can only update the GMM as a whole in each optimization iteration (e.g., a "minibatch"), TGGM allows individual GMM components to be updated separately in the same optimization iteration. Optimizing GMM components separately allows TGGM to exploit the semantic relationships in spatial data and requires only a few labels to initiate and guide the generative process. Our experiments show that TGGM achieves results comparable to state-of-the-art semi-supervised methods and outperforms unsupervised methods by 10% in $F_{1}$ score, while requiring significantly less labeled data.


Early Anomaly Detection in Time Series: A Hierarchical Approach for Predicting Critical Health Episodes

arXiv.org Machine Learning

The early detection of anomalous events in time series data is essential in many application domains. In this paper we deal with critical health events, which represent a significant cause of mortality in hospital intensive care units. The timely prediction of these events is crucial for mitigating their consequences and improving healthcare. One of the most common ways to tackle early anomaly detection problems is to use standard classification methods. In this paper we propose a novel method that uses a layered learning architecture to address these tasks. One key contribution of our work is the idea of pre-conditional events, which denote arbitrary but computable relaxed versions of the event of interest. We leverage this idea to break the original problem into two hierarchical layers, which we hypothesize are easier to solve. The results suggest that the proposed approach leads to better performance than state-of-the-art approaches for critical health episode prediction.


Instance-Privacy Preserving Crowdsourcing

AAAI Conferences

Crowdsourcing is a technique to outsource tasks to a number of workers. Although crowdsourcing has many advantages, it carries the risk that sensitive information may be leaked, which has limited its adoption. Task instances (the data workers receive in order to process tasks) often contain sensitive information, which can be extracted by workers. For example, in an audio transcription task, an audio file corresponds to an instance, and the content of the audio (e.g., the abstract of a meeting) can be sensitive information. In this paper, we propose a quantitative analysis framework for the instance privacy problem. The proposed framework provides performance measures for instance-privacy-preserving protocols. As a case study, we apply the proposed framework to an instance clipping protocol and analyze the properties of the protocol. The protocol preserves privacy by clipping instances to limit the amount of information workers obtain. The results show that the protocol can balance task performance and instance privacy preservation. They also show that the proposed measure is consistent with standard measures, which validates it.