Performance Analysis
Multi-pathology Chest X-ray Classification with Rejection Mechanisms
Aperstein, Yehudit, Tzahar, Amit, Gottlib, Alon, Verber, Tal, Damti, Ravit Shagan, Apartsin, Alexander
Overconfidence in deep learning models poses a significant risk in high-stakes medical imaging tasks, particularly in multi-label classification of chest X-rays, where multiple co-occurring pathologies must be detected simultaneously. This study introduces an uncertainty-aware framework for chest X-ray diagnosis based on a DenseNet-121 backbone, enhanced with two selective prediction mechanisms: entropy-based rejection and confidence interval-based rejection. Both methods enable the model to abstain from uncertain predictions, improving reliability by deferring ambiguous cases to clinical experts. A quantile-based calibration procedure is employed to tune rejection thresholds using either global or class-specific strategies. Experiments conducted on three large public datasets (PadChest, NIH ChestX-ray14, and MIMIC-CXR) demonstrate that selective rejection improves the trade-off between diagnostic accuracy and coverage, with entropy-based rejection yielding the highest average AUC across all pathologies. These results support the integration of selective prediction into AI-assisted diagnostic workflows, providing a practical step toward safer, uncertainty-aware deployment of deep learning in clinical settings. Automating medical diagnosis with deep learning has shown great potential, particularly in medical imaging domains such as chest X-ray analysis. Convolutional neural networks, including architectures like DenseNet-121, have demonstrated strong performance in detecting a range of thoracic pathologies [1], [2]. However, successfully integrating such models into clinical workflows requires more than high classification accuracy. It demands robust mechanisms for managing uncertainty and ensuring patient safety.
Figure 1: Overview of the proposed selective chest X-ray classification framework.
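As a rough illustration of entropy-based rejection with a quantile-calibrated threshold, the sketch below uses a single global threshold on the mean per-label binary entropy. The abstract does not give the paper's exact formulation, so the scoring rule, the nearest-rank quantile, the 0.5 decision cutoff, and all probabilities here are assumptions made for illustration only.

```python
import math

def binary_entropy(p, eps=1e-12):
    """Entropy (in nats) of a Bernoulli distribution with probability p."""
    p = min(max(p, eps), 1.0 - eps)  # clamp to avoid log(0)
    return -(p * math.log(p) + (1.0 - p) * math.log(1.0 - p))

def mean_label_entropy(probs):
    """Average the per-label binary entropies of one multi-label prediction."""
    return sum(binary_entropy(p) for p in probs) / len(probs)

def quantile_threshold(scores, q):
    """Empirical q-quantile of the scores, using the nearest-rank method."""
    ordered = sorted(scores)
    rank = max(1, math.ceil(q * len(ordered)))
    return ordered[rank - 1]

def selective_predict(probs, threshold):
    """Return per-label decisions, or None to abstain when entropy is too high."""
    if mean_label_entropy(probs) > threshold:
        return None  # defer this case to a clinical expert
    return [p >= 0.5 for p in probs]

# Hypothetical calibration-set probabilities for three pathologies.
calibration = [
    [0.02, 0.97, 0.01],
    [0.10, 0.90, 0.05],
    [0.45, 0.55, 0.50],
    [0.01, 0.03, 0.99],
    [0.30, 0.70, 0.60],
]
cal_entropy = [mean_label_entropy(p) for p in calibration]
threshold = quantile_threshold(cal_entropy, q=0.8)  # global strategy (assumed)

confident = selective_predict([0.03, 0.95, 0.02], threshold)
ambiguous = selective_predict([0.48, 0.52, 0.50], threshold)
```

A class-specific variant would calibrate one threshold per pathology from that label's own entropy distribution instead of pooling all labels.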
Data-Driven Energy Estimation for Virtual Servers Using Combined System Metrics and Machine Learning
This paper presents a machine learning-based approach to estimate the energy consumption of virtual servers without access to physical power measurement interfaces. Using resource utilization metrics collected from guest virtual machines, we train a Gradient Boosting Regressor to predict energy consumption measured via RAPL on the host. We demonstrate, for the first time, guest-only, resource-based energy estimation without privileged host access. Experiments across diverse workloads achieve high predictive accuracy and variance explained ($0.90 \leq R^2 \leq 0.97$), indicating the feasibility of guest-side energy estimation. This approach can enable energy-aware scheduling, cost optimization, and physical-host-independent energy estimates in virtualized environments. It addresses a critical gap in virtualized environments (e.g., cloud) where direct energy measurement is infeasible.
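The reported $R^2$ range is the share of variance in the measured energy that the estimates explain. The sketch below computes the standard coefficient of determination; the energy values are made up for illustration and are not taken from the paper.

```python
def r_squared(y_true, y_pred):
    """Coefficient of determination: 1 - SS_res / SS_tot."""
    mean_true = sum(y_true) / len(y_true)
    ss_res = sum((t - p) ** 2 for t, p in zip(y_true, y_pred))
    ss_tot = sum((t - mean_true) ** 2 for t in y_true)
    return 1.0 - ss_res / ss_tot

# Hypothetical values: host-side energy readings vs. guest-side estimates.
measured = [10.0, 12.0, 15.0, 20.0, 23.0]
estimated = [11.0, 12.0, 14.0, 21.0, 22.0]

score = r_squared(measured, estimated)
```

A score of 1.0 means the estimates match the measurements exactly, while a score near 0 means they do no better than always predicting the mean.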
Detection of Anomalous Behavior in Robot Systems Based on Machine Learning
Nissan, Mahfuzul I., Aktar, Sharmin
Ensuring the safe and reliable operation of robotic systems is paramount to prevent potential disasters and safeguard human well-being. Despite rigorous design and engineering practices, these systems can still experience malfunctions, leading to safety risks. In this study, we present a machine learning-based approach for detecting anomalies in system logs to enhance the safety and reliability of robotic systems. We collected logs from two distinct scenarios using CoppeliaSim and comparatively evaluated several machine learning models, including Logistic Regression (LR), Support Vector Machine (SVM), and an Autoencoder. Our system was evaluated in a quadcopter context (Context 1) and a Pioneer robot context (Context 2). Results showed that while LR demonstrated superior performance in Context 1, the Autoencoder model proved to be the most effective in Context 2. This highlights that the optimal model choice is context-dependent, likely due to the varying complexity of anomalies across different robotic platforms. This research underscores the value of a comparative approach and demonstrates the particular strengths of autoencoders for detecting complex anomalies in robotic systems.
HGEN: Heterogeneous Graph Ensemble Networks
Shen, Jiajun, Jin, Yufei, He, Yi, Zhu, Xingquan
This paper presents HGEN, which pioneers ensemble learning for heterogeneous graphs. We argue that the heterogeneity in node types, nodal features, and local neighborhood topology poses significant challenges for ensemble learning, particularly in accommodating diverse graph learners. Our HGEN framework ensembles multiple learners through a meta-path and transformation-based optimization pipeline to uplift classification accuracy. Specifically, HGEN uses meta-paths combined with random dropping to create Allele Graph Neural Networks (GNNs), whereby the base graph learners are trained and aligned for later ensembling. To ensure effective ensemble learning, HGEN presents two key components: 1) a residual-attention mechanism to calibrate allele GNNs of different meta-paths, thereby enforcing node embeddings to focus on more informative graphs to improve base learner accuracy, and 2) a correlation-regularization term to enlarge the disparity among embedding matrices generated from different meta-paths, thereby enriching base learner diversity. We analyze the convergence of HGEN and attest its higher regularization magnitude over simple voting. Experiments on five heterogeneous networks validate that HGEN consistently outperforms its state-of-the-art competitors by a substantial margin.
A Co-Training Semi-Supervised Framework Using Faster R-CNN and YOLO Networks for Object Detection in Densely Packed Retail Images
Yazdanjouei, Hossein, Mansouri, Arash, Shokouhifar, Mohammad
Abstract: This study proposes a semi-supervised co-training framework for object detection in densely packed retail environments, where limited labeled data and complex conditions pose major challenges. The framework combines Faster R-CNN (utilizing a ResNet backbone) for precise localization with YOLO (employing a Darknet backbone) for global context, enabling mutual pseudo-label exchange that improves accuracy in scenes with occlusion and overlapping objects. To strengthen classification, it employs an ensemble of XGBoost, Random Forest, and SVM, utilizing diverse feature representations for higher robustness. Hyperparameters are optimized using a metaheuristic-driven algorithm, enhancing precision and efficiency across models. By minimizing reliance on manual labeling, the approach reduces annotation costs and adapts effectively to frequent product and layout changes common in retail. Experiments on the SKU-110k dataset demonstrate strong performance, highlighting the scalability and practicality of the proposed framework for real-world retail applications such as automated inventory tracking, product monitoring, and checkout systems. Keywords: Retail object detection; Densely packed scenes; Semi-supervised learning; Co-training method; Faster R-CNN; Metaheuristic optimization; YOLO integration. Detecting objects in densely packed retail environments has become essential due to the increasing demand for automation in inventory management, product recognition, and efficient checkout processes in modern retail.
SPECS: Specificity-Enhanced CLIP-Score for Long Image Caption Evaluation
Chen, Xiaofu, Salazar, Israfel, Kementchedjhieva, Yova
As interest grows in generating long, detailed image captions, standard evaluation metrics become increasingly unreliable. N-gram-based metrics, though efficient, fail to capture semantic correctness. Representational Similarity (RS) metrics, designed to address this, initially saw limited use due to high computational costs; today, despite advances in hardware, they remain unpopular due to low correlation with human judgments. Meanwhile, metrics based on large language models (LLMs) show strong correlation with human judgments, but remain too expensive for iterative use during model development. We introduce SPECS (Specificity-Enhanced CLIPScore), a reference-free RS metric tailored to long image captioning. SPECS modifies CLIP with a new objective that emphasizes specificity: rewarding correct details and penalizing incorrect ones. We show that SPECS matches the performance of open-source LLM-based metrics in correlation with human judgments, while being far more efficient. This makes it a practical alternative for iterative checkpoint evaluation during image captioning model development. Our code can be found at https://github.com/mbzuai-nlp/SPECS.
Identifying Key Features for Establishing Sustainable Agro-Tourism Centre: A Data Driven Approach
Gadakh, Alka, Kumbhar, Vidya, Khosla, Sonal, Karunendra, Kumar
Agro-tourism serves as a strategic economic model designed to facilitate rural development by diversifying income streams for local communities, such as farmers, while promoting the conservation of indigenous cultural heritage and traditional agricultural practices. Because agro-tourism is a rapidly growing subdomain of tourism, strategies for its growth need to be studied in detail. The current study identifies the important indicators for the growth and enhancement of agro-tourism. The study is conducted in two phases: in the first phase, the important indicators are identified through a comprehensive literature review; in the second phase, state-of-the-art techniques are used to identify the important indicators for the growth of agro-tourism. Treating the indicators as features, the study applies machine learning models for feature selection: the Least Absolute Shrinkage and Selection Operator (LASSO) method is combined with machine learning classifiers, namely Logistic Regression (LR), Decision Trees (DT), Random Forest (RF), and Extreme Gradient Boosting (XGBoost), to model the growth of agro-tourism. The results show that, with the LASSO method, the LR model gives the highest classification accuracy of 98% with a 70-30 train-test split, followed by RF with 95% accuracy. Similarly, with an 80-20 train-test split, LR maintains the highest accuracy at 99%, while DT and XGBoost follow with 97% accuracy.
Anti-Money Laundering Machine Learning Pipelines; A Technical Analysis on Identifying High-risk Bank Clients with Supervised Learning
Namdar, Khashayar, Wang, Pin-Chien, Raju, Tushar, Zheng, Steven, Li, Fiona, Khan, Safwat Tahmin
Anti-money laundering (AML) actions and measurements are among the priorities of financial institutions, for which machine learning (ML) has shown high potential. In this paper, we propose a comprehensive and systematic approach for developing ML pipelines to identify high-risk bank clients in a dataset curated for Task 1 of the University of Toronto 2023-2024 Institute for Management and Innovation (IMI) Big Data and Artificial Intelligence Competition. The dataset included 195,789 customer IDs, and we employed a 16-step design and statistical analysis to ensure the final pipeline was robust. We also framed the data in a SQLite database, developed SQL-based feature engineering algorithms, connected our pre-trained model to the database and made it inference-ready, and provided explainable artificial intelligence (XAI) modules to derive feature importance. Our pipeline achieved a mean area under the receiver operating characteristic curve (AUROC) of 0.961 with a standard deviation (SD) of 0.005. The proposed pipeline achieved second place in the competition.
Introduction
In the contemporary financial landscape, money laundering represents a formidable challenge, compelling both financial institutions and regulatory bodies to seek innovative solutions. The integration of machine learning (ML) into anti-money laundering (AML) efforts has emerged as a promising avenue to enhance the detection and prevention of illicit financial activities. This paper investigates the technical considerations in employing supervised learning techniques to accurately identify high-risk bank clients, a critical component in the battle against money laundering. The utilization of ML for detecting money laundering transactions has shown significant promise. Jullum et al. developed an ML model that outperforms traditional systems by prioritizing transactions for manual investigation, using historic data from Norway's largest bank, DNB [1].
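The SQL-based feature engineering on a SQLite database mentioned in the abstract can be pictured with a minimal sketch like the one below. The table name, columns, rows, and aggregate features are all hypothetical; the competition dataset's actual schema and the authors' features are not described here.

```python
import sqlite3

# In-memory database with a hypothetical transactions table.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE transactions (customer_id TEXT, amount REAL, is_cash INTEGER)"
)
conn.executemany(
    "INSERT INTO transactions VALUES (?, ?, ?)",
    [
        ("C1", 100.0, 0),
        ("C1", 9500.0, 1),
        ("C1", 9800.0, 1),
        ("C2", 40.0, 0),
        ("C2", 60.0, 0),
    ],
)

# Aggregate per-customer features directly in SQL (illustrative choices).
FEATURE_QUERY = """
SELECT customer_id,
       COUNT(*)     AS n_transactions,
       SUM(amount)  AS total_amount,
       MAX(amount)  AS max_amount,
       AVG(is_cash) AS cash_share
FROM transactions
GROUP BY customer_id
ORDER BY customer_id
"""

# Map each customer ID to its feature tuple, ready to feed a trained model.
features = {row[0]: row[1:] for row in conn.execute(FEATURE_QUERY)}
conn.close()
```

Keeping the aggregation in SQL means the same query can serve both training and inference, which is one plausible reading of making the model "inference-ready" against the database.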
FoundationalECGNet: A Lightweight Foundational Model for ECG-based Multitask Cardiac Analysis
Sk., Md. Sajeebul Islam, Jobayer, Md, Shawon, Md Mehedi Hasan, Alam, Md. Golam Raibul
Cardiovascular diseases (CVDs) remain a leading cause of mortality worldwide, underscoring the importance of accurate and scalable diagnostic systems. Electrocardiogram (ECG) analysis is central to detecting cardiac abnormalities, yet challenges such as noise, class imbalance, and dataset heterogeneity limit current methods. To address these issues, we propose FoundationalECGNet, a foundational framework for automated ECG classification. The model integrates dual-stage denoising with Morlet and Daubechies wavelet transforms, a Convolutional Block Attention Module (CBAM), Graph Attention Networks (GAT), and Time Series Transformers (TST) to jointly capture spatial and temporal dependencies in multi-channel ECG signals. FoundationalECGNet first distinguishes between Normal and Abnormal ECG signals, and then classifies the Abnormal signals into one of five cardiac conditions: Arrhythmias, Conduction Disorders, Myocardial Infarction, QT Abnormalities, or Hypertrophy. Across multiple datasets, the model achieves a 99% F1-score for Normal vs. Abnormal classification and shows state-of-the-art performance in multi-class disease detection, including a 99% F1-score for Conduction Disorders and Hypertrophy, as well as a 98.9% F1-score for Arrhythmias. Additionally, the model provides risk level estimations to facilitate clinical decision-making. In conclusion, FoundationalECGNet represents a scalable, interpretable, and generalizable solution for automated ECG analysis, with the potential to improve diagnostic precision and patient outcomes in healthcare settings. Cardiovascular diseases (CVDs) are the leading cause of death worldwide, causing approximately 17.9 million deaths each year [45].
Effort-aware Fairness: Incorporating a Philosophy-informed, Human-centered Notion of Effort into Algorithmic Fairness Metrics
Nguyen, Tin Trung, Xu, Jiannan, Che, Zora, Nguyen-Le, Phuong-Anh, Dandamudi, Rushil, Braman, Donald, Huang, Furong, Daumé, Hal III, Jelveh, Zubin
Although popularized AI fairness metrics, e.g., demographic parity, have uncovered bias in AI-assisted decision-making outcomes, they do not consider how much effort one has spent to get to where one is today in the input feature space. However, the notion of effort is important in how Philosophy and humans understand fairness. We propose a philosophy-informed approach to conceptualize and evaluate Effort-aware Fairness (EaF), grounded in the concept of Force, which represents the temporal trajectory of predictive features coupled with inertia. Besides theoretical formulation, our empirical contributions include: (1) a pre-registered human subjects experiment, which shows that for both stages of the (individual) fairness evaluation process, people consider the temporal trajectory of a predictive feature more than its aggregate value; (2) pipelines to compute Effort-aware Individual/Group Fairness in the criminal justice and personal finance contexts. Our work may enable AI model auditors to uncover and potentially correct unfair decisions against individuals who have spent significant efforts to improve but are still stuck with systemic disadvantages outside their control.
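To make the contrast concrete, the sketch below computes a demographic parity gap, the outcome-only metric the abstract cites as an example. It is not the authors' Effort-aware Fairness metric, and the decisions and group labels are hypothetical. Note that the computation sees only final decisions and group membership, so it cannot reflect any individual's trajectory over time.

```python
def positive_rate(decisions):
    """Fraction of favorable (1) decisions in a list of 0/1 outcomes."""
    return sum(decisions) / len(decisions)

def demographic_parity_difference(decisions, groups):
    """Largest gap in favorable-decision rates between any two groups."""
    by_group = {}
    for decision, group in zip(decisions, groups):
        by_group.setdefault(group, []).append(decision)
    rates = [positive_rate(values) for values in by_group.values()]
    return max(rates) - min(rates)

# Hypothetical model decisions for two demographic groups.
decisions = [1, 0, 1, 1, 0, 1, 0, 0]
groups = ["a", "a", "a", "a", "b", "b", "b", "b"]

gap = demographic_parity_difference(decisions, groups)
```

A gap of 0 indicates equal favorable-decision rates across groups; larger values indicate greater disparity.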