Performance Analysis
Network-Based Detection of Autism Spectrum Disorder Using Sustainable and Non-invasive Salivary Biomarkers
Fernandes, Janayna M., Sabino-Silva, Robinson, Carneiro, Murillo G.
Autism Spectrum Disorder (ASD) lacks reliable biological markers, delaying early diagnosis. Using 159 salivary samples analyzed by ATR-FTIR spectroscopy, we developed GANet, a genetic algorithm-based network optimization framework leveraging PageRank and Degree for importance-based feature characterization. GANet systematically optimizes network structure to extract meaningful patterns from high-dimensional spectral data. It achieved superior performance compared to linear discriminant analysis, support vector machines, and deep learning models, reaching 0.78 accuracy, 0.61 sensitivity, 0.90 specificity, and a 0.74 harmonic mean. These results demonstrate GANet's potential as a robust, bio-inspired, non-invasive tool for precise ASD detection and broader spectral-based health applications.
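The two importance measures named in the abstract, Degree and PageRank, can be illustrated on a toy undirected graph. This is only a sketch of the measures themselves: how GANet builds its network from the spectra and how the genetic algorithm uses these scores is not described in the abstract, so the adjacency dictionary, the damping factor of 0.85, and the iteration count below are all assumptions.

```python
def degree(adj):
    """Number of neighbours of each node in an adjacency dictionary."""
    return {u: len(vs) for u, vs in adj.items()}

def pagerank(adj, damping=0.85, iters=100):
    """PageRank by power iteration; damping and iters are assumed values."""
    nodes = list(adj)
    n = len(nodes)
    rank = {u: 1.0 / n for u in nodes}
    for _ in range(iters):
        # Rank held by nodes with no neighbours is spread uniformly.
        dangling = sum(rank[u] for u in nodes if not adj[u])
        new = {u: (1.0 - damping) / n + damping * dangling / n for u in nodes}
        for u in nodes:
            for v in adj[u]:
                new[v] += damping * rank[u] / len(adj[u])
        rank = new
    return rank

# Hypothetical star graph: node "c" is connected to every other node.
star = {"c": ["a", "b", "d"], "a": ["c"], "b": ["c"], "d": ["c"]}
print(degree(star))
print(pagerank(star))
```

On this toy graph both measures rank the hub node "c" highest, which is the sense in which they characterize node importance.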
Efficient Detection of Objects Near a Robot Manipulator via Miniature Time-of-Flight Sensors
Sifferman, Carter, Gupta, Mohit, Gleicher, Michael
We provide a method for detecting and localizing objects near a robot arm using arm-mounted miniature time-of-flight sensors. A key challenge when using arm-mounted sensors is differentiating between the robot itself and external objects in sensor measurements. To address this challenge, we propose a computationally lightweight method which utilizes the raw time-of-flight information captured by many off-the-shelf, low-resolution time-of-flight sensors. We build an empirical model of expected sensor measurements in the presence of the robot alone, and use this model at runtime to detect objects in proximity to the robot. In addition to avoiding robot self-detections in common sensor configurations, the proposed method enables extra flexibility in sensor placement, unlocking configurations which achieve more efficient coverage of a radius around the robot arm. Our method can detect small objects near the arm and localize the position of objects along the length of a robot link to reasonable precision. We evaluate the performance of the method with respect to object type, location, and ambient light level, and identify limiting factors on performance inherent in the measurement principle. The proposed method has potential applications in collision avoidance and in facilitating safe human-robot interaction. Detection of objects near a robot arm is useful for tasks such as collision avoidance [1], [2] or to enable proximity-based human-robot interactions [3]. Externally mounted cameras are one way of detecting such objects, but they suffer from occlusion and require the robot to remain in view of the cameras, limiting their practicality when used with mobile manipulators. Therefore, we seek a solution which uses sensors mounted on the robot.
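The idea of an empirical robot-only model can be sketched as follows. The sketch assumes each raw measurement is a histogram of photon counts per time bin, that the model is a per-bin mean and standard deviation, and that detection is a simple k-sigma test; none of these specifics are given in the abstract, and the function names and thresholds are hypothetical.

```python
from statistics import mean, stdev

def fit_baseline(calibration_frames):
    """Per-bin mean and standard deviation over frames recorded with the robot alone."""
    bins = list(zip(*calibration_frames))
    return [mean(b) for b in bins], [stdev(b) for b in bins]

def detect_object(frame, baseline, k=4.0, min_sigma=1.0):
    """Indices of bins whose count exceeds the robot-only expectation by more than k sigma.

    k and min_sigma are assumed tuning values, not taken from the paper.
    """
    mu, sigma = baseline
    return [i for i, (x, m, s) in enumerate(zip(frame, mu, sigma))
            if x - m > k * max(s, min_sigma)]

# Hypothetical 8-bin histograms: the robot's own link always returns light in bin 2.
calibration = [
    [0, 1, 50, 2, 0, 1, 0, 0],
    [1, 0, 52, 1, 0, 0, 1, 0],
    [0, 1, 49, 2, 1, 1, 0, 0],
    [0, 0, 51, 1, 0, 0, 0, 1],
]
baseline = fit_baseline(calibration)
print(detect_object([0, 1, 51, 1, 0, 30, 0, 0], baseline))  # external return in bin 5
```

Because the robot's own return in bin 2 is part of the baseline, it is not flagged, while the extra return in bin 5 is.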
Predicting the descent into extremism and terrorism
Lane, R. O., Holmes, W. J., Taylor, C. J., State-Davey, H. M., Wragge, A. J.
This paper proposes an approach for automatically analysing and tracking statements in material gathered online and detecting whether the authors of the statements are likely to be involved in extremism or terrorism. The proposed system comprises: online collation of statements that are then encoded in a form amenable to machine learning (ML), an ML component to classify the encoded text, a tracker, and a visualisation system for analysis of results. The detection and tracking concept has been tested using quotes made by terrorists, extremists, campaigners, and politicians, obtained from wikiquote.org. A set of features was extracted for each quote using the state-of-the-art Universal Sentence Encoder (Cer et al. 2018), which produces 512-dimensional vectors. The data were used to train and test a support vector machine (SVM) classifier using 10-fold cross-validation. The system was able to correctly detect intentions and attitudes associated with extremism 81% of the time and terrorism 97% of the time, using a dataset of 839 quotes. This accuracy was higher than that which was achieved for a simple baseline system based on n-gram text features. Tracking techniques were also used to perform a temporal analysis of the data, with each quote considered to be a noisy measurement of a person's state of mind. It was demonstrated that the tracking algorithms were able to detect both trends over time and sharp changes in attitude that could be attributed to major events.
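Treating each quote as a noisy measurement of an underlying state can be sketched with a scalar Kalman filter. The abstract does not name the tracking algorithm, so the random-walk state model, the variance values, and the use of a single classifier score per quote are assumptions made for illustration.

```python
def kalman_track(scores, process_var=0.01, meas_var=0.25, x0=0.0, p0=1.0):
    """Smooth a sequence of noisy per-quote scores with a 1-D Kalman filter.

    process_var: assumed drift of the hidden state between quotes.
    meas_var:    assumed noise of a single quote's classifier score.
    """
    x, p = x0, p0
    estimates = []
    for z in scores:
        p += process_var              # predict: state follows a random walk
        gain = p / (p + meas_var)     # how much to trust the new measurement
        x += gain * (z - x)           # update the state estimate
        p *= 1.0 - gain               # update the estimate's variance
        estimates.append(x)
    return estimates

# Hypothetical scores: mostly neutral, then a sustained shift.
scores = [0.1, -0.2, 0.0, 0.1, 0.9, 1.1, 0.8, 1.0]
print(kalman_track(scores))
```

A larger process_var makes the tracker react faster to sharp changes at the cost of smoothing trends less.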
The Missing Piece: A Case for Pre-Training in 3D Medical Object Detection
Eckstein, Katharina, Ulrich, Constantin, Baumgartner, Michael, Kächele, Jessica, Bounias, Dimitrios, Wald, Tassilo, Floca, Ralf, Maier-Hein, Klaus H.
Large-scale pre-training holds the promise to advance 3D medical object detection, a crucial component of accurate computer-aided diagnosis. Yet, it remains underexplored compared to segmentation, where pre-training has already demonstrated significant benefits. Existing pre-training approaches for 3D object detection rely on 2D medical data or natural image pre-training, failing to fully leverage 3D volumetric information. In this work, we present the first systematic study of how existing pre-training methods can be integrated into state-of-the-art detection architectures, covering both CNNs and Transformers. Our results show that pre-training consistently improves detection performance across various tasks and datasets. Notably, reconstruction-based self-supervised pre-training outperforms supervised pre-training, while contrastive pre-training provides no clear benefit for 3D medical object detection.
From Data to Diagnosis: A Large, Comprehensive Bone Marrow Dataset and AI Methods for Childhood Leukemia Prediction
Höfener, Henning, Kock, Farina, Pontones, Martina, Ghete, Tabita, Pfrang, David, Dickel, Nicholas, Kunz, Meik, Schacherer, Daniela P., Clunie, David A., Fedorov, Andrey, Westphal, Max, Metzler, Markus
Leukemia diagnosis primarily relies on manual microscopic analysis of bone marrow morphology supported by additional laboratory parameters, making it complex and time-consuming. While artificial intelligence (AI) solutions have been proposed, most utilize private datasets and only cover parts of the diagnostic pipeline. Therefore, we present a large, high-quality, publicly available leukemia bone marrow dataset spanning the entire diagnostic process, from cell detection to diagnosis. Using this dataset, we further propose methods for cell detection, cell classification, and diagnosis prediction. The dataset comprises 246 pediatric patients with diagnostic, clinical and laboratory information, over 40 000 cells with bounding box annotations and more than 28 000 of these with high-quality class labels, making it the most comprehensive dataset publicly available. Evaluation of the AI models yielded an average precision of 0.96 for the cell detection, an area under the curve of 0.98 and an F1-score of 0.61 for the 33-class cell classification, and a mean F1-score of 0.90 for the diagnosis prediction using predicted cell counts. While the proposed approaches demonstrate their usefulness for AI-assisted diagnostics, the dataset will foster further research and development in the field, ultimately contributing to more precise diagnoses and improved patient outcomes.
Contrastive Learning with Spectrum Information Augmentation in Abnormal Sound Detection
Meng, Xinxin, Guo, Jiangtao, Zhang, Yunxiang, Huang, Shun
The outlier exposure method is an effective approach to the unsupervised anomalous sound detection problem. The key focus of this method is how to make the model learn the distribution space of normal data. Based on biological perception and data analysis, it is found that anomalous audio and noise often have higher frequencies. Therefore, we propose a data augmentation method for high-frequency information in contrastive learning. This enables the model to pay more attention to the low-frequency information of the audio, which represents the normal operational mode of the machine. We evaluated the proposed method on the DCASE 2020 Task 2 dataset. The results showed that our method outperformed other contrastive learning methods used on this dataset. We also evaluated the generalizability of our method on the DCASE 2022 Task 2 dataset.
Hybrid Deep Learning-Federated Learning Powered Intrusion Detection System for IoT/5G Advanced Edge Computing Network
Baidar, Rasil, Maric, Sasa, Abbas, Robert
The exponential expansion of IoT and 5G-Advanced applications has enlarged the attack surface for DDoS, malware, and zero-day intrusions. We propose an intrusion detection system that fuses a convolutional neural network (CNN), a bidirectional LSTM (BiLSTM), and an autoencoder (AE) bottleneck within a privacy-preserving federated learning (FL) framework. The CNN-BiLSTM branch captures local and gated cross-feature interactions, while the AE emphasizes reconstruction-based anomaly sensitivity. Training occurs across edge devices without sharing raw data. On UNSW-NB15 (binary), the fused model attains AUC 99.59 percent and F1 97.36 percent; confusion-matrix analysis shows balanced error rates with high precision and recall. Average inference time is approximately 0.0476 ms per sample on our test hardware, which is well within the less than 10 ms URLLC budget, supporting edge deployment. We also discuss explainability, drift tolerance, and FL considerations for compliant, scalable 5G-Advanced IoT security.
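Training across edge devices without sharing raw data can be sketched as one aggregation round in which only model parameters leave each device. The abstract does not state the aggregation rule, so the sample-weighted averaging below (in the style of FedAvg) is an assumption, and flat lists of floats stand in for the real model weights.

```python
def federated_average(client_weights, client_sizes):
    """Average client parameter vectors, weighting each client by its sample count.

    client_weights: one flat list of floats per client (hypothetical stand-in
                    for the CNN-BiLSTM-AE parameters).
    client_sizes:   number of local training samples per client.
    """
    total = sum(client_sizes)
    n_params = len(client_weights[0])
    return [
        sum(w[j] * n for w, n in zip(client_weights, client_sizes)) / total
        for j in range(n_params)
    ]

# Two hypothetical edge devices; the second holds three times as much data.
global_weights = federated_average([[1.0, 2.0], [3.0, 6.0]], [1, 3])
print(global_weights)
```

The server only ever sees the parameter vectors and the sample counts, never the traffic records themselves.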
Dual-Mode Visual System for Brain-Computer Interfaces: Integrating SSVEP and P300 Responses
Kasawala, Ekgari, Mouli, Surej
In brain-computer interface (BCI) systems, steady-state visual evoked potentials (SSVEP) and P300 responses have achieved widespread implementation owing to their superior information transfer rates (ITR) and minimal training requirements. These neurophysiological signals have exhibited robust efficacy and versatility in external device control, demonstrating enhanced precision and scalability. However, conventional implementations predominantly utilise liquid crystal display (LCD)-based visual stimulation paradigms, which present limitations in practical deployment scenarios. This investigation presents the development and evaluation of a novel light-emitting diode (LED)-based dual stimulation apparatus designed to enhance SSVEP classification accuracy through the integration of both SSVEP and P300 paradigms. The system employs four distinct frequencies, 7 Hz, 8 Hz, 9 Hz, and 10 Hz, corresponding to forward, backward, right, and left directional controls, respectively. Oscilloscopic verification confirmed the precision of these stimulation frequencies. Real-time feature extraction was accomplished through the concurrent analysis of maximum Fast Fourier Transform (FFT) amplitude and P300 peak detection to ascertain user intent. Directional control was determined by the frequency exhibiting maximal amplitude characteristics. The visual stimulation hardware demonstrated minimal frequency deviation, with error differentials ranging from 0.15% to 0.20% across all frequencies. The implemented signal processing algorithm successfully discriminated all four stimulus frequencies whilst correlating them with their respective P300 event markers. Classification accuracy was evaluated based on correct task intention recognition. The proposed hybrid system achieved a mean classification accuracy of 86.25%, coupled with an average ITR of 42.08 bits per minute (bpm).
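The frequency-to-command rule and the ITR figure can be sketched as below. The sketch evaluates a single DFT component at each of the four stimulus frequencies instead of a full FFT, omits the P300 step entirely, and uses the widely used Wolpaw formula for bits per selection; the sampling rate, window length, and choice of ITR formula are assumptions, since the abstract does not state them.

```python
import math

# Mapping taken from the abstract: stimulus frequency (Hz) to directional command.
COMMANDS = {7.0: "forward", 8.0: "backward", 9.0: "right", 10.0: "left"}

def dft_amplitude(signal, fs, freq):
    """Amplitude of the DFT component of `signal` at `freq` Hz (sampling rate fs)."""
    n = len(signal)
    re = sum(s * math.cos(2 * math.pi * freq * i / fs) for i, s in enumerate(signal))
    im = sum(s * math.sin(2 * math.pi * freq * i / fs) for i, s in enumerate(signal))
    return 2.0 * math.hypot(re, im) / n

def pick_command(signal, fs):
    """Return the command whose stimulus frequency has the largest amplitude."""
    best = max(COMMANDS, key=lambda f: dft_amplitude(signal, fs, f))
    return COMMANDS[best]

def bits_per_selection(n_targets, accuracy):
    """Wolpaw information transfer per selection, in bits."""
    bits = math.log2(n_targets)
    if accuracy > 0.0:
        bits += accuracy * math.log2(accuracy)
    if accuracy < 1.0:
        bits += (1.0 - accuracy) * math.log2((1.0 - accuracy) / (n_targets - 1))
    return bits

def synthetic_ssvep(freq, fs=250, seconds=2.0, interference=0.3):
    """Hypothetical test signal: a sine at `freq` plus a weaker 7 Hz component."""
    n = int(fs * seconds)
    return [math.sin(2 * math.pi * freq * i / fs)
            + interference * math.sin(2 * math.pi * 7.0 * i / fs)
            for i in range(n)]

print(pick_command(synthetic_ssvep(9.0), fs=250))
print(bits_per_selection(4, 0.8625))
```

With four targets and 86.25% accuracy, the Wolpaw formula gives about 1.20 bits per selection. Converting that to bits per minute requires the selection time, which the abstract does not report.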
Real, Fake, or Manipulated? Detecting Machine-Influenced Text
Wang, Yitong, Zhang, Zhongping, Piana, Margherita, Zhou, Zheng, Gerstoft, Peter, Plummer, Bryan A.
Large Language Models (LLMs) can be used to write or modify documents, presenting a challenge for understanding the intent behind their use. For example, benign uses may involve using an LLM on a human-written document to improve its grammar or to translate it into another language. However, a document entirely produced by an LLM may be more likely to be used to spread misinformation than simple translation (e.g., from use by malicious actors or simply by hallucinating). Prior works in Machine Generated Text (MGT) detection mostly focus on simply identifying whether a document was human or machine written, ignoring these fine-grained uses. In this paper, we introduce a HiErarchical, length-RObust machine-influenced text detector (HERO), which learns to separate text samples of varying lengths from four primary types: human-written, machine-generated, machine-polished, and machine-translated. HERO accomplishes this by combining predictions from length-specialist models that have been trained with Subcategory Guidance. Specifically, for categories that are easily confused (e.g., different source languages), our Subcategory Guidance module encourages separation of the fine-grained categories, boosting performance. Extensive experiments across five LLMs and six domains demonstrate the benefits of HERO, outperforming the state-of-the-art by 2.5-3 mAP on average.
A Weak Supervision Approach for Monitoring Recreational Drug Use Effects in Social Media
Prieto-Santamaría, Lucía, Iglesias, Alba Cortés, Giné, Claudio Vidal, Calderón, Fermín Fernández, Lozano, Óscar M., Rodríguez-González, Alejandro
Understanding the real-world effects of recreational drug use remains a critical challenge in public health and biomedical research, especially as traditional surveillance systems often underrepresent user experiences. In this study, we leverage social media (specifically Twitter) as a rich and unfiltered source of user-reported effects associated with three emerging psychoactive substances: ecstasy, GHB, and 2C-B. By combining a curated list of slang terms with biomedical concept extraction via MetaMap, we identified and weakly annotated over 92,000 tweets mentioning these substances. Each tweet was labeled with a polarity reflecting whether it reported a positive or negative effect, following an expert-guided heuristic process. We then performed descriptive and comparative analyses of the reported phenotypic outcomes across substances and trained multiple machine learning classifiers to predict polarity from tweet content, accounting for strong class imbalance using techniques such as cost-sensitive learning and synthetic oversampling. The top performance on the test set was obtained from eXtreme Gradient Boosting with cost-sensitive learning (F1 = 0.885, AUPRC = 0.934). Our findings reveal that Twitter enables the detection of substance-specific phenotypic effects, and that polarity classification models can support real-time pharmacovigilance and drug effect characterization with high accuracy.
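Cost-sensitive learning under class imbalance is often implemented by giving each class a weight inversely proportional to its frequency. The sketch below computes such weights with the common "balanced" heuristic, n_samples / (n_classes * class_count); the abstract does not say which weighting scheme was used, and the label counts here are made up for illustration.

```python
from collections import Counter

def balanced_class_weights(labels):
    """Inverse-frequency class weights: n_samples / (n_classes * class_count)."""
    counts = Counter(labels)
    n_samples = len(labels)
    n_classes = len(counts)
    return {cls: n_samples / (n_classes * c) for cls, c in counts.items()}

# Hypothetical polarity labels with a 9:1 imbalance (0 = positive, 1 = negative effect).
labels = [0] * 9 + [1] * 1
weights = balanced_class_weights(labels)
print(weights)
```

Errors on the rare class then cost more during training, which is one way a gradient-boosting model can be made cost-sensitive without resampling the tweets.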