localization model
Radiance-Field Reinforced Pretraining: Scaling Localization Models with Unlabeled Wireless Signals
Wang, Guosheng, Wang, Shen, Yang, Lei
Radio frequency (RF)-based indoor localization offers significant promise for applications such as indoor navigation, augmented reality, and pervasive computing. While deep learning has greatly enhanced localization accuracy and robustness, existing localization models still face major challenges in cross-scene generalization due to their reliance on scene-specific labeled data. To address this, we introduce Radiance-Field Reinforced Pretraining (RFRP). This novel self-supervised pretraining framework couples a large localization model (LM) with a neural radio-frequency radiance field (RF-NeRF) in an asymmetrical autoencoder architecture. In this design, the LM encodes received RF spectra into latent, position-relevant representations, while the RF-NeRF decodes them to reconstruct the original spectra. This alignment between input and output enables effective representation learning using large-scale, unlabeled RF data, which can be collected continuously with minimal effort. To this end, we collected RF samples at 7,327,321 positions across 100 diverse scenes using four common wireless technologies--RFID, BLE, WiFi, and IIoT. Data from 75 scenes were used for training, and the remaining 25 for evaluation. Experimental results show that the RFRP-pretrained LM reduces localization error by over 40% compared to non-pretrained models and by 21% compared to those pretrained using supervised learning.
- North America > United States (0.04)
- Asia > China > Hong Kong (0.04)
- North America > United States > California (0.04)
- North America > Canada (0.04)
LiGen: GAN-Augmented Spectral Fingerprinting for Indoor Positioning
Lin, Jie, Lee, Hsun-Yu, Li, Ho-Ming, Wu, Fang-Jing
Accurate and robust indoor localization is critical for smart building applications, yet existing Wi-Fi-based systems are often vulnerable to environmental conditions. This work presents a novel indoor localization system, called LiGen, that leverages the spectral intensity patterns of ambient light as fingerprints, offering a more stable and infrastructure-free alternative to radio signals. To address the limited spectral data, we design a data augmentation framework based on generative adversarial networks (GANs), featuring two variants: PointGAN, which generates fingerprints conditioned on coordinates, and FreeGAN, which uses a weak localization model to label unconditioned samples. Our positioning model, leveraging a Multi-Layer Perceptron (MLP) architecture to train on synthesized data, achieves submeter-level accuracy, outperforming Wi-Fi-based baselines by over 50\%. LiGen also demonstrates strong robustness in cluttered environments. To the best of our knowledge, this is the first system to combine spectral fingerprints with GAN-based data augmentation for indoor localization.
- Asia > Taiwan (0.15)
- North America > United States > Florida > Miami-Dade County > Miami (0.04)
- Asia > Middle East > Israel > Haifa District > Haifa (0.04)
- Asia > China > Guangdong Province > Shenzhen (0.04)
- North America > United States > California (0.04)
- North America > Canada > British Columbia > Metro Vancouver Regional District > Vancouver (0.04)
- Information Technology > Artificial Intelligence > Vision (1.00)
- Information Technology > Artificial Intelligence > Natural Language (0.91)
- Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (0.70)
- Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.46)
Developing an AI-Guided Assistant Device for the Deaf and Hearing Impaired
This study aims to develop a deep learning system for an accessibility device for the deaf or hearing impaired. The device will accurately localize and identify sound sources in real time. This study will fill an important gap in current research by leveraging machine learning techniques to target the underprivileged community. The system includes three main components. 1. JerryNet: A custom designed CNN architecture that determines the direction of arrival (DoA) for nine possible directions. 2. Audio Classification: This model is based on fine-tuning the Contrastive Language-Audio Pretraining (CLAP) model to identify the exact sound classes only based on audio. 3. Multimodal integration model: This is an accurate sound localization model that combines audio, visual, and text data to locate the exact sound sources in the images. The part consists of two modules, one object detection using Yolov9 to generate all the bounding boxes of the objects, and an audio visual localization model to identify the optimal bounding box using complete Intersection over Union (CIoU). The hardware consists of a four-microphone rectangular formation and a camera mounted on glasses with a wristband for displaying necessary information like direction. On a custom collected data set, JerryNet achieved a precision of 91. 1% for the sound direction, outperforming all the baseline models. The CLAP model achieved 98.5% and 95% accuracy on custom and AudioSet datasets, respectively. The audio-visual localization model within component 3 yielded a cIoU of 0.892 and an AUC of 0.658, surpassing other similar models. There are many future potentials to this study, paving the way to creating a new generation of accessibility devices.
AI-Driven Automated Tool for Abdominal CT Body Composition Analysis in Gastrointestinal Cancer Management
Nan, Xinyu, He, Meng, Chen, Zifan, Dong, Bin, Tang, Lei, Zhang, Li
The incidence of gastrointestinal cancers remains significantly high, particularly in China, emphasizing the importance of accurate prognostic assessments and effective treatment strategies. Research shows a strong correlation between abdominal muscle and fat tissue composition and patient outcomes. However, existing manual methods for analyzing abdominal tissue composition are time-consuming and costly, limiting clinical research scalability. To address these challenges, we developed an AI-driven tool for automated analysis of abdominal CT scans to effectively identify and segment muscle, subcutaneous fat, and visceral fat. Our tool integrates a multi-view localization model and a high-precision 2D nnUNet-based segmentation model, demonstrating a localization accuracy of 90% and a Dice Score Coefficient of 0.967 for segmentation. Furthermore, it features an interactive interface that allows clinicians to refine the segmentation results, ensuring high-quality outcomes effectively. Our tool offers a standardized method for effectively extracting critical abdominal tissues, potentially enhancing the management and treatment for gastrointestinal cancers. The code is available at https://github.com/NanXinyu/AI-Tool4Abdominal-Seg.git}{https://github.com/NanXinyu/AI-Tool4Abdominal-Seg.git.
- Research Report > Experimental Study (0.34)
- Research Report > New Finding (0.34)
- Health & Medicine > Therapeutic Area > Gastroenterology (1.00)
- Health & Medicine > Diagnostic Medicine > Imaging (1.00)
- Health & Medicine > Therapeutic Area > Oncology > Gastric Cancer (0.93)
On Adversarial Attacks In Acoustic Drone Localization
Shor, Tamir, Baskin, Chaim, Bronstein, Alex
Multi-rotor aerial autonomous vehicles (MAVs, more widely known as "drones") have been generating increased interest in recent years due to their growing applicability in a vast and diverse range of fields (e.g., agriculture, commercial delivery, search and rescue). The sensitivity of visual-based methods to lighting conditions and occlusions had prompted growing study of navigation reliant on other modalities, such as acoustic sensing. A major concern in using drones in scale for tasks in non-controlled environments is the potential threat of adversarial attacks over their navigational systems, exposing users to mission-critical failures, security breaches, and compromised safety outcomes that can endanger operators and bystanders. While previous work shows impressive progress in acoustic-based drone localization, prior research in adversarial attacks over drone navigation only addresses visual sensing-based systems. In this work, we aim to compensate for this gap by supplying a comprehensive analysis of the effect of PGD adversarial attacks over acoustic drone localization. We furthermore develop an algorithm for adversarial perturbation recovery, capable of markedly diminishing the affect of such attacks in our setting. The code for reproducing all experiments will be released upon publication.
- North America > United States > Kansas > Sheridan County (0.04)
- Europe > Netherlands > North Holland > Amsterdam (0.04)
- Europe > Austria (0.04)
- (2 more...)
- Information Technology > Security & Privacy (1.00)
- Government > Military (1.00)
- Information Technology > Security & Privacy (1.00)
- Information Technology > Artificial Intelligence > Robots > Autonomous Vehicles > Drones (1.00)
- Information Technology > Artificial Intelligence > Representation & Reasoning (1.00)
- Information Technology > Artificial Intelligence > Machine Learning (1.00)
SGP-RI: A Real-Time-Trainable and Decentralized IoT Indoor Localization Model Based on Sparse Gaussian Process with Reduced-Dimensional Inputs
Tang, Zhe, Li, Sihao, Huang, Zichen, Yang, Guandong, Kim, Kyeong Soo, Smith, Jeremy S.
Internet of Things (IoT) devices are deployed in the filed, there is an enormous amount of untapped potential in local computing on those IoT devices. Harnessing this potential for indoor localization, therefore, becomes an exciting research area. Conventionally, the training and deployment of indoor localization models are based on centralized servers with substantial computational resources. This centralized approach faces several challenges, including the database's inability to accommodate the dynamic and unpredictable nature of the indoor electromagnetic environment, the model retraining costs, and the susceptibility of centralized servers to security breaches. To mitigate these challenges we aim to amalgamate the offline and online phases of traditional indoor localization methods using a real-time-trainable and decentralized IoT indoor localization model based on Sparse Gaussian Process with Reduced-dimensional Inputs (SGP-RI), where the number and dimension of the input data are reduced through reference point and wireless access point filtering, respectively. The experimental results based on a multi-building and multi-floor static database as well as a single-building and single-floor dynamic database, demonstrate that the proposed SGP-RI model with less than half the training samples as inducing inputs can produce comparable localization performance to the standard Gaussian Process model with the whole training samples. The SGP-RI model enables the decentralization of indoor localization, facilitating its deployment to resource-constrained IoT devices, and thereby could provide enhanced security and privacy, reduced costs, and network dependency. Also, the model's capability of real-time training makes it possible to quickly adapt to the time-varying indoor electromagnetic environment.
- Asia > China > Shaanxi Province > Xi'an (0.04)
- Europe > United Kingdom (0.04)
- Europe > Spain (0.04)
- Asia > Japan (0.04)
MambaLoc: Efficient Camera Localisation via State Space Model
Wang, Jialu, Zhou, Kaichen, Markham, Andrew, Trigoni, Niki
Location information is pivotal for the automation and intelligence of terminal devices and edge-cloud IoT systems, such as autonomous vehicles and augmented reality. However, achieving reliable positioning across diverse IoT applications remains challenging due to significant training costs and the necessity of densely collected data. To tackle these issues, we have innovatively applied the selective state space (SSM) model to visual localization, introducing a new model named MambaLoc. The proposed model demonstrates exceptional training efficiency by capitalizing on the SSM model's strengths in efficient feature extraction, rapid computation, and memory optimization, and it further ensures robustness in sparse data environments due to its parameter sparsity. Additionally, we propose the Global Information Selector (GIS), which leverages selective SSM to implicitly achieve the efficient global feature extraction capabilities of Non-local Neural Networks. This design leverages the computational efficiency of the SSM model alongside the Non-local Neural Networks' capacity to capture long-range dependencies with minimal layers. Consequently, the GIS enables effective global information capture while significantly accelerating convergence. Our extensive experimental validation using public indoor and outdoor datasets first demonstrates our model's effectiveness, followed by evidence of its versatility with various existing localization models. Our code and models are publicly available to support further research and development in this area.
- Europe > United Kingdom > England > Oxfordshire > Oxford (0.14)
- North America > United States > New York > New York County > New York City (0.04)
- North America > United States > Illinois > Cook County > Chicago (0.04)
- Asia > Middle East > Republic of Türkiye > Karaman Province > Karaman (0.04)
Aligning Programming Language and Natural Language: Exploring Design Choices in Multi-Modal Transformer-Based Embedding for Bug Localization
Chakraborty, Partha, Arumugam, Venkatraman, Nagappan, Meiyappan
Bug localization refers to the identification of source code files which is in a programming language and also responsible for the unexpected behavior of software using the bug report, which is a natural language. As bug localization is labor-intensive, bug localization models are employed to assist software developers. Due to the domain difference between source code files and bug reports, modern bug-localization systems, based on deep learning models, rely heavily on embedding techniques that project bug reports and source code files into a shared vector space. The creation of an embedding involves several design choices, but the impact of these choices on the quality of embedding and the performance of bug localization models remains unexplained in current research. To address this gap, our study evaluated 14 distinct embedding models to gain insights into the effects of various design choices. Subsequently, we developed bug localization models utilizing these embedding models to assess the influence of these choices on the performance of the localization models. Our findings indicate that the pre-training strategies significantly affect the quality of the embedding. Moreover, we discovered that the familiarity of the embedding models with the data has a notable impact on the bug localization model's performance. Notably, when the training and testing data are collected from different projects, the performance of the bug localization models exhibits substantial fluctuations.