Performance Analysis
SecureFixAgent: A Hybrid LLM Agent for Automated Python Static Vulnerability Repair
Gajjar, Jugal, Subramaniakuppusamy, Kamalasankari, Puthal, Relsy, Ranaware, Kaustik
Modern software development pipelines face growing challenges in securing large codebases with extensive dependencies. Static analysis tools like Bandit are effective at vulnerability detection but suffer from high false positives and lack repair capabilities. Large Language Models (LLMs), in contrast, can suggest fixes but often hallucinate changes and lack self-validation. We present SecureFixAgent, a hybrid repair framework integrating Bandit with lightweight local LLMs (<8B parameters) in an iterative detect-repair-validate loop. To improve precision, we apply parameter-efficient LoRA-based fine-tuning on a diverse, curated dataset spanning multiple Python project domains, mitigating dataset bias and reducing unnecessary edits. SecureFixAgent uses Bandit for detection, the LLM for candidate fixes with explanations, and Bandit re-validation for verification, all executed locally to preserve privacy and reduce cloud reliance. Experiments show SecureFixAgent reduces false positives by 10.8% over static analysis, improves fix accuracy by 13.51%, and lowers false positives by 5.46% compared to pre-trained LLMs, typically converging within three iterations. Beyond metrics, developer studies rate explanation quality 4.5/5, highlighting its value for human trust and adoption. By combining verifiable security improvements with transparent rationale in a resource-efficient local framework, SecureFixAgent advances trustworthy, automated vulnerability remediation for modern pipelines.
Imaging Modalities-Based Classification for Lung Cancer Detection
Ahmed, Sajim, Chaudhary, Muhammad Zain, Chaudhary, Muhammad Zohaib, Abbass, Mahmoud, Sherif, Ahmed, Mamun, Mohammad Mahbubur Rahman Khan
Abstract--Lung cancer continues to be the predominant cause of cancer-related mortality globally. This review analyzes various approaches, including advanced image processing methods, focusing on their efficacy in interpreting CT scans, chest radiographs, and biological markers. Notably, we identify critical gaps in the previous surveys, including the need for robust models that can generalize across diverse populations and imaging modalities. This comprehensive synthesis aims to serve as a foundational resource for researchers and clinicians, guiding future efforts toward more accurate and efficient lung cancer detection. Key findings reveal that 3D CNN architectures integrated with CT scans achieve the most superior performances, yet challenges such as high false positives, dataset variability, and computational complexity persist across modalities.
R-Net: A Reliable and Resource-Efficient CNN for Colorectal Cancer Detection with XAI Integration
Ayon, Rokonozzaman, Ahad, Md Taimur, Song, Bo, Li, Yan
State - of - the - art (SOTA) Convolutional Neural Networks (CNNs) are criticized for their extensive computational power, long training times, and large datasets . To overcome this limitation, we propose a reasonable network (R - Net), a lightweight CNN only to detect and classify colorectal cancer (CRC) using the Enteroscope Biopsy Histopathological Hematoxylin and Eosin Image Dataset (EBHI) . Furthermore, six SOTA CNNs, including Multipath - based CNNs (DenseNet121, ResNet50), Depth - based CNNs (InceptionV3), width - based multi - connection CNNs (Xception), depth - wise separable convolutions (MobileNetV2), spatial exploitation - based CNNs (VGG16), Transfer learning, and two ensemble models are also tested on the same dataset . The ensemble models are a multipath - depth - width combination (DenseNet121 - InceptionV3 - Xception) and a multipath - depth - spatial combination ( ResNet18 - InceptionV3 - VGG16) . However, the proposed R - Net lightweight achieved 99.37% accuracy, outperforming MobileNet ( 95.83%) and ResNet50 ( 96.94%). Most importantly, to understand the decision - making of R - Net, Explainable AI such as SHAP, LIME, and Grad - CAM are integrated to visualize which parts of the EBHI image contribute to the detection and classification process of R - Net . The main novelty of this research lies in building a reliable, lightweight CNN R - Net that requires fewer computing resources yet maintains strong prediction results. SOTA CNN s, transfer learning, and ensemble models also extend our knowledge on CRC classification and detection. XAI functionality and the impact of pixel intensity on correct and incorrect classification images are also some novelties in CRC detection and classification.
A study on Deep Convolutional Neural Networks, transfer learning, and Mnet model for Cervical Cancer Detection
Sagor, Saifuddin, Ahad, Md Taimur, Ahmed, Faruk, Ayon, Rokonozzaman, Parvin, Sanzida
Early and accurate detection through Pap smear analysis is critical to improving patient outcomes and reducing mortality of Cervical cancer. State-of-the-art (SOTA) Convolutional Neural Networks (CNNs) require substantial computational resources, extended training time, and large datasets. In this study, a lightweight CNN model, S-Net (Simple Net), is developed specifically for cervical cancer detection and classification using Pap smear images to address these limitations. Alongside S-Net, six SOTA CNNs were evaluated using transfer learning, including multi-path (DenseNet201, ResNet152), depth-based (Serasnet152), width-based multi-connection (Xception), depth-wise separable convolutions (MobileNetV2), and spatial exploitation-based (VGG19). All models, including S-Net, achieved comparable accuracy, with S-Net reaching 99.99%. However, S-Net significantly outperforms the SOTA CNNs in terms of computational efficiency and inference time, making it a more practical choice for real-time and resource-constrained applications. A major limitation in CNN-based medical diagnosis remains the lack of transparency in the decision-making process. To address this, Explainable AI (XAI) techniques, such as SHAP, LIME, and Grad-CAM, were employed to visualize and interpret the key image regions influencing model predictions. The novelty of this study lies in the development of a highly accurate yet computationally lightweight model (S-Net) caPable of rapid inference while maintaining interpretability through XAI integration. Furthermore, this work analyzes the behavior of SOTA CNNs, investigates the effects of negative transfer learning on Pap smear images, and examines pixel intensity patterns in correctly and incorrectly classified samples.
A Data-Driven Discretized CS:GO Simulation Environment to Facilitate Strategic Multi-Agent Planning Research
Wang, Yunzhe, Ustun, Volkan, McGroarty, Chris
Using Counter-Strike: Global Offensive (CS:GO) as a testbed, our framework accurately simulates gameplay using only movement decisions as tactical positioning--without explicitly modeling low-level mechanics such as aiming and shooting. Central to our approach is a waypoint system that simplifies and discretizes continuous states and actions, paired with neural predictive and generative models trained on real CS:GO tournament data to reconstruct event outcomes. Extensive evaluations show that replays generated from human data in DECOY closely match those observed in the original game. Our publicly available simulation environment provides a valuable tool for advancing research in strategic multi-agent planning and behavior generation. 1 INTRODUCTION Team-based multiplayer strategy games have emerged as grand challenge domains for multi-agent learning and long-horizon planning. Breakthroughs in complex games such as StarCraft II and Dota 2--where AI agents have achieved human-expert or even superhuman performance through large-scale self-play training (Baker et al. 2019; Vinyals et al. 2019; Berner et al. 2019; Open Ended Learning Team et al. 2021)--demonstrate that, given sufficient simulation and training, sophisticated strategies can be discovered even in environments characterized by long time horizons, imperfect information, and high-dimensional state-action spaces. However, these successes come at the expense of enormous computational costs, and a prevailing trend in the field is to scale performance by increasing model size and training on larger datasets with more simulation steps (Neumann and Gros 2022; Obando-Ceron et al. 2024; Kaplan et al. 2020).
Medical priority fusion: achieving dual optimization of sensitivity and interpretability in nipt anomaly detection
Ge, Xiuqi, Yao, Zhibo, Du, Yaosong
Clinical machine learning faces a critical dilemma in high-stakes medical applications: algorithms achieving optimal diagnostic performance typically sacrifice the interpretability essential for physician decision-making, while interpretable methods compromise sensitivity in complex scenarios. This paradox becomes particularly acute in non-invasive prenatal testing (NIPT), where missed chromosomal abnormalities carry profound clinical consequences yet regulatory frameworks mandate explainable AI systems. We introduce Medical Priority Fusion (MPF), a constrained multi-objective optimization framework that resolves this fundamental trade-off by systematically integrating Naive Bayes probabilistic reasoning with Decision Tree rule-based logic through mathematically-principled weighted fusion under explicit medical constraints. Rigorous validation on 1,687 real-world NIPT samples characterized by extreme class imbalance (43.4:1 normal-to-abnormal ratio) employed stratified 5-fold cross-validation with comprehensive ablation studies and statistical hypothesis testing using McNemar's paired comparisons. MPF achieved simultaneous optimization of dual objectives: 89.3% sensitivity (95% CI: 83.9-94.7%) with 80% interpretability score, significantly outperforming individual algorithms (McNemar's test, p < 0.001). The optimal fusion configuration achieved Grade A clinical deployment criteria with large effect size (d = 1.24), establishing the first clinically-deployable solution that maintains both diagnostic accuracy and decision transparency essential for prenatal care. This work demonstrates that medical-constrained algorithm fusion can resolve the interpretability-performance trade-off, providing a mathematical framework for developing high-stakes medical decision support systems that meet both clinical efficacy and explainability requirements.
DIVERS-Bench: Evaluating Language Identification Across Domain Shifts and Code-Switching
Ojo, Jessica, Kamel, Zina, Adelani, David Ifeoluwa
Language Identification (LID) is a core task in multilingual NLP, yet current systems often overfit to clean, monolingual data. This work introduces DIVERS-BENCH, a comprehensive evaluation of state-of-the-art LID models across diverse domains, including speech transcripts, web text, social media texts, children's stories, and code-switched text. Our findings reveal that while models achieve high accuracy on curated datasets, performance degrades sharply on noisy and informal inputs. We also introduce DIVERS-CS, a diverse code-switching benchmark dataset spanning 10 language pairs, and show that existing models struggle to detect multiple languages within the same sentence. These results highlight the need for more robust and inclusive LID systems in real-world settings.
WISE: Weak-Supervision-Guided Step-by-Step Explanations for Multimodal LLMs in Image Classification
Jiang, Yiwen, Mehta, Deval, Yan, Siyuan, Shen, Yaling, Wang, Zimu, Ge, Zongyuan
Multimodal Large Language Models (MLLMs) have shown promise in visual-textual reasoning, with Multimodal Chain-of-Thought (MCoT) prompting significantly enhancing interpretability. However, existing MCoT methods rely on rationale-rich datasets and largely focus on inter-object reasoning, overlooking the intra-object understanding crucial for image classification. To address this gap, we propose WISE, a Weak-supervision-guided Step-by-step Explanation method that augments any image classification dataset with MCoTs by reformulating the concept-based representations from Concept Bottleneck Models (CBMs) into concise, interpretable reasoning chains under weak supervision. Experiments across ten datasets show that our generated MCoTs not only improve interpretability by 37% but also lead to gains in classification accuracy when used to fine-tune MLLMs. Our work bridges concept-based interpretability and generative MCoT reasoning, providing a generalizable framework for enhancing MLLMs in fine-grained visual understanding.
Predicting Chest Radiograph Findings from Electrocardiograms Using Interpretable Machine Learning
Matejas, Julia, ลปurawski, Olaf, Strodthoff, Nils, Alcaraz, Juan Miguel Lopez
Purpose: Chest X-rays are essential for diagnosing pulmonary conditions, but limited access in resource-constrained settings can delay timely diagnosis. Electrocardiograms (ECGs), in contrast, are widely available, non-invasive, and often acquired earlier in clinical workflows. This study aims to assess whether ECG features and patient demographics can predict chest radiograph findings using an interpretable machine learning approach. Methods: Using the MIMIC-IV database, Extreme Gradient Boosting (XGBoost) classifiers were trained to predict diverse chest radiograph findings from ECG-derived features and demographic variables. Recursive feature elimination was performed independently for each target to identify the most predictive features. Model performance was evaluated using the area under the receiver operating characteristic curve (AUROC) with bootstrapped 95% confidence intervals. Shapley Additive Explanations (SHAP) were applied to interpret feature contributions. Results: Models successfully predicted multiple chest radiograph findings with varying accuracy. Feature selection tailored predictors to each target, and including demographic variables consistently improved performance. SHAP analysis revealed clinically meaningful contributions from ECG features to radiographic predictions. Conclusion: ECG-derived features combined with patient demographics can serve as a proxy for certain chest radiograph findings, enabling early triage or pre-screening in settings where radiographic imaging is limited. Interpretable machine learning demonstrates potential to support radiology workflows and improve patient care.
PRNU-Bench: A Novel Benchmark and Model for PRNU-Based Camera Identification
Croitoru, Florinel Alin, Hondru, Vlad, Ionescu, Radu Tudor
We propose a novel benchmark for camera identification via Photo Response Non-Uniformity (PRNU) estimation. The benchmark comprises 13K photos taken with 120+ cameras, where the training and test photos are taken in different scenarios, enabling ``in-the-wild'' evaluation. In addition, we propose a novel PRNU-based camera identification model that employs a hybrid architecture, comprising a denoising autoencoder to estimate the PRNU signal and a convolutional network that can perform 1:N verification of camera devices. Instead of using a conventional approach based on contrastive learning, our method takes the Hadamard product between reference and query PRNU signals as input. This novel design leads to significantly better results compared with state-of-the-art models based on denoising autoencoders and contrastive learning. We release our dataset and code at: https://github.com/CroitoruAlin/PRNU-Bench.