Goto

Collaborating Authors

 Performance Analysis


Adaptation of Foundation Models for Medical Image Analysis: Strategies, Challenges, and Future Directions

arXiv.org Artificial Intelligence

Foundation models (FMs) have emerged as a transformative paradigm in medical image analysis, offering the potential to provide generalizable, task-agnostic solutions across a wide range of clinical tasks and imaging modalities. Their capacity to learn transferable representations from large-scale data has the potential to address the limitations of conventional task-specific models. However, adaptation of FMs to real-world clinical practice remains constrained by key challenges, including domain shifts, limited availability of high-quality annotated data, substantial computational demands, and strict privacy requirements. This review presents a comprehensive assessment of strategies for adapting FMs to the specific demands of medical imaging. We examine approaches such as supervised fine-tuning, domain-specific pretraining, parameter-efficient fine-tuning, self-supervised learning, hybrid methods, and multimodal or cross-modal frameworks. For each, we evaluate reported performance gains, clinical applicability, and limitations, while identifying trade-offs and unresolved challenges that prior reviews have often overlooked. Beyond these established techniques, we also highlight emerging directions aimed at addressing current gaps. These include continual learning to enable dynamic deployment, federated and privacy-preserving approaches to safeguard sensitive data, hybrid self-supervised learning to enhance data efficiency, data-centric pipelines that combine synthetic generation with human-in-the-loop validation, and systematic benchmarking to assess robust generalization under real-world clinical variability. By outlining these strategies and associated research gaps, this review provides a roadmap for developing adaptive, trustworthy, and clinically integrated FMs capable of meeting the demands of real-world medical imaging.


AI for pRedicting Exacerbations in KIDs with aSthma (AIRE-KIDS)

arXiv.org Artificial Intelligence

Recurrent exacerbations remain a common yet preventable outcome for many children with asthma. Machine learning (ML) algorithms using electronic medical records (EMR) could allow accurate identification of children at risk for exacerbations and facilitate referral for preventative comprehensive care to avoid this morbidity. We developed ML algorithms to predict repeat severe exacerbations (i.e. asthma-related emergency department (ED) visits or future hospital admissions) for children with a prior asthma ED visit at a tertiary care children's hospital. Retrospective pre-COVID19 (Feb 2017 - Feb 2019, N=2716) Epic EMR data from the Children's Hospital of Eastern Ontario (CHEO) linked with environmental pollutant exposure and neighbourhood marginalization information was used to train various ML models. We used boosted trees (LGBM, XGB) and 3 open-source large language model (LLM) approaches (DistilGPT2, Llama 3.2 1B and Llama-8b-UltraMedical). Models were tuned and calibrated then validated in a second retrospective post-COVID19 dataset (Jul 2022 - Apr 2023, N=1237) from CHEO. Models were compared using the area under the curve (AUC) and F1 scores, with SHAP values used to determine the most predictive features. The LGBM ML model performed best with the most predictive features in the final AIRE-KIDS_ED model including prior asthma ED visit, the Canadian triage acuity scale, medical complexity, food allergy, prior ED visits for non-asthma respiratory diagnoses, and age for an AUC of 0.712, and F1 score of 0.51. This is a nontrivial improvement over the current decision rule which has F1=0.334. While the most predictive features in the AIRE-KIDS_HOSP model included medical complexity, prior asthma ED visit, average wait time in the ED, the pediatric respiratory assessment measure score at triage and food allergy.


Integrating Visual and X-Ray Machine Learning Features in the Study of Paintings by Goya

arXiv.org Artificial Intelligence

Art authentication of Francisco Goya's works presents complex computational challenges due to his heterogeneous stylistic evolution and extensive historical patterns of forgery. We introduce a novel multimodal machine learning framework that applies identical feature extraction techniques to both visual and X-ray radiographic images of Goya paintings. The unified feature extraction pipeline incorporates Grey-Level Co-occurrence Matrix descriptors, Local Binary Patterns, entropy measures, energy calculations, and colour distribution analysis applied consistently across both imaging modalities. The extracted features from both visual and X-ray images are processed through an optimised One-Class Support Vector Machine with hyperparameter tuning. Using a dataset of 24 authenticated Goya paintings with corresponding X-ray images, split into an 80/20 train-test configuration with 10-fold cross-validation, the framework achieves 97.8% classification accuracy with a 0.022 false positive rate. Case study analysis of ``Un Gigante'' demonstrates the practical efficacy of our pipeline, achieving 92.3% authentication confidence through unified multimodal feature analysis. Our results indicate substantial performance improvement over single-modal approaches, establishing the effectiveness of applying identical computational methods to both visual and radiographic imagery in art authentication applications.


Advancing Machine-Generated Text Detection from an Easy to Hard Supervision Perspective

arXiv.org Artificial Intelligence

Existing machine-generated text (MGT) detection methods implicitly assume labels as the "golden standard". However, we reveal boundary ambiguity in MGT detection, implying that traditional training paradigms are inexact. Moreover, limitations of human cognition and the superintelligence of detectors make inexact learning widespread and inevitable. To this end, we propose an easy-to-hard enhancement framework to provide reliable supervision under such inexact conditions. Distinct from knowledge distillation, our framework employs an easy supervisor targeting relatively simple longer-text detection tasks (despite weaker capabilities), to enhance the more challenging target detector. Firstly, longer texts targeted by supervisors theoretically alleviate the impact of inexact labels, laying the foundation for reliable supervision. Secondly, by structurally incorporating the detector into the supervisor, we theoretically model the supervisor as a lower performance bound for the detector. Thus, optimizing the supervisor indirectly optimizes the detector, ultimately approximating the underlying "golden" labels. Extensive experiments across diverse practical scenarios, including cross-LLM, cross-domain, mixed text, and paraphrase attacks, demonstrate the framework's significant detection effectiveness. The code is available at: https://github.com/tmlr-group/Easy2Hard.


Using Synthetic Data to estimate the True Error is theoretically and practically doable

arXiv.org Artificial Intelligence

Accurately evaluating model performance is crucial for deploying machine learning systems in real-world applications. Traditional methods often require a sufficiently large labeled test set to ensure a reliable evaluation. However, in many contexts, a large labeled dataset is costly and labor-intensive. Therefore, we sometimes have to do evaluation by a few labeled samples, which is theoretically challenging. Recent advances in generative models offer a promising alternative by enabling the synthesis of high-quality data. In this work, we make a systematic investigation about the use of synthetic data to estimate the test error of a trained model under limited labeled data conditions. To this end, we develop novel generalization bounds that take synthetic data into account. Those bounds suggest novel ways to optimize synthetic samples for evaluation and theoretically reveal the significant role of the generator's quality. Inspired by those bounds, we propose a theoretically grounded method to generate optimized synthetic data for model evaluation. Experimental results on simulation and tabular datasets demonstrate that, compared to existing baselines, our method achieves accurate and more reliable estimates of the test error.


Android Malware Detection: A Machine Leaning Approach

arXiv.org Artificial Intelligence

-- This study examines machine learning techniques like Decision Trees, Support V ector Machines, Logistic Regression, Neural Networks, and ensemble methods to detect Android malware. The study evaluates these models on a dataset of Android applications and analyzes their accuracy, efficiency, and real-world applicability. Key findings show that ensemble methods demonstrate superior performance, but there are trade-offs between model interpretability, efficiency, and accuracy. Given its increasing threat, the insights guide future research and practical use of ML to combat Android malware. I. INTRODUCTION Smartphones have brought in a new era of connectivity, convenience, and innovation, with Android being the most widely used mobile operating system [1], [2]. However, this ubiquity has come with challenges. The background of Android's ecosystem makes clear that the characteristics that make Android popular also leave it vulnerable to malicious activities. Specifically, Android's open-source nature, vast user base, and easy application distribution and installation have created an environment where cybercriminals can thrive. Thus, it is essential to understand the Android ecosystem's unique landscape to address the severe threat of Android malware. The following section sets the stage for exploring advanced malware detection techniques for Android devices in later sections. A. Background The extensive adoption of Android operating systems, with their open-source nature and customization capabilities, has led to them becoming a primary target for cybercriminals. Android's vast and diverse application ecosystem presents significant security challenges, as malicious applications can masquerade as legitimate ones, exploiting vulnerabilities and employing social engineering tactics [1]-[3]. These malicious activities include stealing sensitive information, sending premium-rate SMS messages, and installing additional payloads [4]-[5].


Metadata-Aligned 3D MRI Representations for Contrast Understanding and Quality Control

arXiv.org Artificial Intelligence

Magnetic Resonance Imaging suffers from substantial data heterogeneity and the absence of standardized contrast labels across scanners, protocols, and institutions, which severely limits large-scale automated analysis. A unified representation of MRI contrast would enable a wide range of downstream utilities, from automatic sequence recognition to harmonization and quality control, without relying on manual annotations. To this end, we introduce MR-CLIP, a metadata-guided framework that learns MRI contrast representations by aligning volumetric images with their DICOM acquisition parameters. The resulting embeddings shows distinct clusters of MRI sequences and outperform supervised 3D baselines under data scarcity in few-shot sequence classification. Moreover, MR-CLIP enables unsupervised data quality control by identifying corrupted or inconsistent metadata through image-metadata embedding distances. By transforming routinely available acquisition metadata into a supervisory signal, MR-CLIP provides a scalable foundation for label-efficient MRI analysis across diverse clinical datasets.


Proactive DDoS Detection and Mitigation in Decentralized Software-Defined Networking via Port-Level Monitoring and Zero-Training Large Language Models

arXiv.org Artificial Intelligence

Centralized Software-Defined Networking (cSDN) offers flexible and programmable control of networks but suffers from scalability and reliability issues due to its reliance on centralized controllers. Decentralized SDN (dSDN) alleviates these concerns by distributing control across multiple local controllers, yet this architecture remains highly vulnerable to Distributed Denial-of-Service (DDoS) attacks. In this paper, we propose a novel detection and mitigation framework tailored for dSDN environments. The framework leverages lightweight port-level statistics combined with prompt engineering and in-context learning, enabling the DeepSeek-v3 Large Language Model (LLM) to classify traffic as benign or malicious without requiring fine-tuning or retraining. Once an anomaly is detected, mitigation is enforced directly at the attacker's port, ensuring that malicious traffic is blocked at their origin while normal traffic remains unaffected. An automatic recovery mechanism restores normal operation after the attack inactivity, ensuring both security and availability. Experimental evaluation under diverse DDoS attack scenarios demonstrates that the proposed approach achieves near-perfect detection, with 99.99% accuracy, 99.97% precision, 100% recall, 99.98% F1-score, and an AUC of 1.0. These results highlight the effectiveness of combining distributed monitoring with zero-training LLM inference, providing a proactive and scalable defense mechanism for securing dSDN infrastructures against DDoS threats.


Towards Reliable Pediatric Brain Tumor Segmentation: Task-Specific nnU-Net Enhancements

arXiv.org Artificial Intelligence

Accurate segmentation of pediatric brain tumors in multi-parametric magnetic resonance imaging (mpMRI) is critical for diagnosis, treatment planning, and monitoring, yet faces unique challenges due to limited data, high anatomical variability, and heterogeneous imaging across institutions. In this work, we present an advanced nnU-Net framework tailored for BraTS 2025 Task-6 (PED), the largest public dataset of pre-treatment pediatric high-grade gliomas. Our contributions include: (1) a widened residual encoder with squeeze-and-excitation (SE) attention; (2) 3D depthwise separable convolutions; (3) a specificity-driven regularization term; and (4) small-scale Gaussian weight initialization. We further refine predictions with two postprocessing steps. Our models achieved first place on the Task-6 validation leaderboard, attaining lesion-wise Dice scores of 0.759 (CC), 0.967 (ED), 0.826 (ET), 0.910 (NET), 0.928 (TC) and 0.928 (WT).


PADBen: A Comprehensive Benchmark for Evaluating AI Text Detectors Against Paraphrase Attacks

arXiv.org Artificial Intelligence

While AI-generated text (AIGT) detectors achieve over 90\% accuracy on direct LLM outputs, they fail catastrophically against iteratively-paraphrased content. We investigate why iteratively-paraphrased text -- itself AI-generated -- evades detection systems designed for AIGT identification. Through intrinsic mechanism analysis, we reveal that iterative paraphrasing creates an intermediate laundering region characterized by semantic displacement with preserved generation patterns, which brings up two attack categories: paraphrasing human-authored text (authorship obfuscation) and paraphrasing LLM-generated text (plagiarism evasion). To address these vulnerabilities, we introduce PADBen, the first benchmark systematically evaluating detector robustness against both paraphrase attack scenarios. PADBen comprises a five-type text taxonomy capturing the full trajectory from original content to deeply laundered text, and five progressive detection tasks across sentence-pair and single-sentence challenges. We evaluate 11 state-of-the-art detectors, revealing critical asymmetry: detectors successfully identify the plagiarism evasion problem but fail for the case of authorship obfuscation. Our findings demonstrate that current detection approaches cannot effectively handle the intermediate laundering region, necessitating fundamental advances in detection architectures beyond existing semantic and stylistic discrimination methods. For detailed code implementation, please see https://github.com/JonathanZha47/PadBen-Paraphrase-Attack-Benchmark.