Goto

Collaborating Authors

 sensing


Towards Trustworthy Wi-Fi Sensing: Systematic Evaluation of Deep Learning Model Robustness to Adversarial Attacks

Gopalakrishnan, Shreevanth Krishnaa, Hailes, Stephen

arXiv.org Artificial Intelligence

Machine learning has become integral to Channel State Information (CSI)-based human sensing systems and is expected to power applications such as device-free activity recognition and identity detection in future cellular and Wi-Fi generations. However, these systems rely on models whose decisions can be subtly perturbed, raising concerns for security and reliability in ubiquitous sensing. Quantifying and understanding the robustness of such models, defined as their ability to maintain accurate predictions under adversarial perturbations, is therefore critical before wireless sensing can be safely deployed in real-world environments. This work presents a systematic evaluation of the robustness of CSI deep learning models under diverse threat models (white-box, black-box/transfer, and universal perturbations) and varying degrees of attack realism. We establish a framework to compare compact temporal autoencoder models with larger deep architectures across three public datasets, quantifying how model scale, training regime, and physical constraints influence robustness. Our experiments show that smaller models, while efficient and equally performant on clean data, are markedly less robust. We further confirm that physically realizable signal-space perturbations, designed to be feasible in real wireless channels, significantly reduce attack success compared to unconstrained feature-space attacks. Adversarial training mitigates these vulnerabilities, improving mean robust accuracy with only moderate degradation in clean performance across both model classes. As wireless sensing advances towards reliable, cross-domain operation, these findings provide quantitative baselines for robustness estimation and inform design principles for secure and trustworthy human-centered sensing systems.


Deep Network for the Integrated 3D Sensing of Multiple People in Natural Images

Neural Information Processing Systems

We present MubyNet -- a feed-forward, multitask, bottom up system for the integrated localization, as well as 3d pose and shape estimation, of multiple people in monocular images. The challenge is the formal modeling of the problem that intrinsically requires discrete and continuous computation, e.g.


SpikeATac: A Multimodal Tactile Finger with Taxelized Dynamic Sensing for Dexterous Manipulation

Chang, Eric T., Ballentine, Peter, He, Zhanpeng, Kim, Do-Gon, Jiang, Kai, Liang, Hua-Hsuan, Palacios, Joaquin, Wang, William, Piacenza, Pedro, Kymissis, Ioannis, Ciocarlie, Matei

arXiv.org Artificial Intelligence

In this work, we introduce SpikeATac, a multimodal tactile finger combining a taxelized and highly sensitive dynamic response (PVDF) with a static transduction method (capacitive) for multimodal touch sensing. Named for its `spiky' response, SpikeATac's 16-taxel PVDF film sampled at 4 kHz provides fast, sensitive dynamic signals to the very onset and breaking of contact. We characterize the sensitivity of the different modalities, and show that SpikeATac provides the ability to stop quickly and delicately when grasping fragile, deformable objects. Beyond parallel grasping, we show that SpikeATac can be used in a learning-based framework to achieve new capabilities on a dexterous multifingered robot hand. We use a learning recipe that combines reinforcement learning from human feedback with tactile-based rewards to fine-tune the behavior of a policy to modulate force. Our hardware platform and learning pipeline together enable a difficult dexterous and contact-rich task that has not previously been achieved: in-hand manipulation of fragile objects. Videos are available at \href{https://roamlab.github.io/spikeatac/}{roamlab.github.io/spikeatac}.


Early Detection of Branched Broomrape (Phelipanche ramosa) Infestation in Tomato Crops Using Leaf Spectral Analysis and Machine Learning

Narimani, Mohammadreza, Pourreza, Alireza, Moghimi, Ali, Farajpoor, Parastoo, Jafarbiglu, Hamid, Mesgaran, Mohsen B.

arXiv.org Artificial Intelligence

Branched broomrape (Phelipanche ramosa) is a chlorophyll-deficient parasitic weed that threatens tomato production by extracting nutrients from the host. We investigate early detection using leaf-level spectral reflectance (400-2500 nm) and ensemble machine learning. In a field experiment in Woodland, California, we tracked 300 tomato plants across growth stages defined by growing degree days (GDD). Leaf reflectance was acquired with a portable spectrometer and preprocessed (band denoising, 1 nm interpolation, Savitzky-Golay smoothing, correlation-based band reduction). Clear class differences were observed near 1500 nm and 2000 nm water absorption features, consistent with reduced leaf water content in infected plants at early stages. An ensemble combining Random Forest, XGBoost, SVM with RBF kernel, and Naive Bayes achieved 89% accuracy at 585 GDD, with recalls of 0.86 (infected) and 0.93 (noninfected). Accuracy declined at later stages (e.g., 69% at 1568 GDD), likely due to senescence and weed interference. Despite the small number of infected plants and environmental confounders, results show that proximal sensing with ensemble learning enables timely detection of broomrape before canopy symptoms are visible, supporting targeted interventions and reduced yield losses.


Cloud-Aware SAR Fusion for Enhanced Optical Sensing in Space Missions

Bui, Trong-An, Le, Thanh-Thoai

arXiv.org Artificial Intelligence

This research presents a Cloud-Attentive Reconstruction Framework that integrates SAR-optical feature fusion with deep learning-based image reconstruction to generate cloud-free optical imagery. The proposed framework employs an attention-driven feature fusion mechanism to align complementary structural information from Synthetic Aperture Radar (SAR) with spectral characteristics from optical data. Furthermore, a cloud-aware model update strategy introduces adaptive loss weighting to prioritize cloud-occluded regions, enhancing reconstruction accuracy. Experimental results demonstrate that the proposed method outperforms existing approaches, achieving a PSNR of 31.01 dB, SSIM of 0.918, and MAE of 0.017.


Cross-Channel Unlabeled Sensing over a Union of Signal Subspaces

Koka, Taulant, Tsakiris, Manolis C., Haro, Benjamín Béjar, Muma, Michael

arXiv.org Artificial Intelligence

Cross-channel unlabeled sensing addresses the problem of recovering a multi-channel signal from measurements that were shuffled across channels. This work expands the cross-channel unlabeled sensing framework to signals that lie in a union of subspaces. The extension allows for handling more complex signal structures and broadens the framework to tasks like compressed sensing. These mismatches between samples and channels often arise in applications such as whole-brain calcium imaging of freely moving organisms or multi-target tracking. We improve over previous models by deriving tighter bounds on the required number of samples for unique reconstruction, while supporting more general signal types. The approach is validated through an application in whole-brain calcium imaging, where organism movements disrupt sample-to-neuron mappings. This demonstrates the utility of our framework in real-world settings with imprecise sample-channel associations, achieving accurate signal reconstruction.


Transformers Meet Hyperspectral Imaging: A Comprehensive Study of Models, Challenges and Open Problems

Zhang, Guyang, Abdulla, Waleed

arXiv.org Artificial Intelligence

Transformers have become the architecture of choice for learning long-range dependencies, yet their adoption in hyperspectral imaging (HSI) is still emerging. We reviewed more than 300 papers published up to 2025 and present the first end-to-end survey dedicated to Transformer-based HSI classification. The study categorizes every stage of a typical pipeline-pre-processing, patch or pixel tokenization, positional encoding, spatial-spectral feature extraction, multi-head self-attention variants, skip connections, and loss design-and contrasts alternative design choices with the unique spatial-spectral properties of HSI. We map the field's progress against persistent obstacles: scarce labeled data, extreme spectral dimensionality, computational overhead, and limited model explainability. Finally, we outline a research agenda prioritizing valuable public data sets, lightweight on-edge models, illumination and sensor shifts robustness, and intrinsically interpretable attention mechanisms. Our goal is to guide researchers in selecting, combining, or extending Transformer components that are truly fit for purpose for next-generation HSI applications.


Towards Efficient Benchmarking of Foundation Models in Remote Sensing: A Capabilities Encoding Approach

Adorni, Pierre, Pham, Minh-Tan, May, Stéphane, Lefèvre, Sébastien

arXiv.org Artificial Intelligence

F oundation models constitute a significant advancement in computer vision: after a single, albeit costly, training phase, they can address a wide array of tasks. In the field of Earth observation, over 75 remote sensing vision foundation models have been developed in the past four years. However, none has consistently outperformed the others across all available downstream tasks. T o facilitate their comparison, we propose a cost-effective method for predicting a model's performance on multiple downstream tasks without the need for fine-tuning on each one. This method is based on what we call "capabilities encoding. " The utility of this novel approach is twofold: we demonstrate its potential to simplify the selection of a foundation model for a given new task, and we employ it to offer a fresh perspective on the existing literature, suggesting avenues for future research.


VibeCheck: Using Active Acoustic Tactile Sensing for Contact-Rich Manipulation

Zhang, Kaidi, Kim, Do-Gon, Chang, Eric T., Liang, Hua-Hsuan, He, Zhanpeng, Lampo, Kathryn, Wu, Philippe, Kymissis, Ioannis, Ciocarlie, Matei

arXiv.org Artificial Intelligence

The acoustic response of an object can reveal a lot about its global state, for example its material properties or the extrinsic contacts it is making with the world. In this work, we build an active acoustic sensing gripper equipped with two piezoelectric fingers: one for generating signals, the other for receiving them. By sending an acoustic vibration from one finger to the other through an object, we gain insight into an object's acoustic properties and contact state. We use this system to classify objects, estimate grasping position, estimate poses of internal structures, and classify the types of extrinsic contacts an object is making with the environment. Using our contact type classification model, we tackle a standard long-horizon manipulation problem: peg insertion. We use a simple simulated transition model based on the performance of our sensor to train an imitation learning policy that is robust to imperfect predictions from the classifier. We finally demonstrate the policy on a UR5 robot with active acoustic sensing as the only feedback.


JL1-CD: A New Benchmark for Remote Sensing Change Detection and a Robust Multi-Teacher Knowledge Distillation Framework

Liu, Ziyuan, Zhu, Ruifei, Gao, Long, Zhou, Yuanxiu, Ma, Jingyu, Gu, Yuantao

arXiv.org Artificial Intelligence

Xu et al. [24] introduce a semi-supervised label and embedding consistency network (SS-LEC) for ORSI scene classification, which strategically enforces consistency across augmentations and stages of training. Li et al. [25] propose SemiCD-VL, a VLM-guided semi-supervised change detection method that synthesizes pseudo labels via a mixed change event generation strategy, achieving significant performance gains over FixMatch and SOT A unsupervised methods. However, DL-based CD methods generally face two major challenges: the scarcity of high-quality, high-resolution, all-inclusive CD datasets and limitations in handling highly dynamic change areas. Although numerous CD datasets have been constructed and proposed, they are often tailored to specific scenarios, which restricts the generalization capabilities of the algorithms. For instance, models trained on datasets focused on human-induced changes often fail to perform effectively when confronted with natural change scenarios.