AITopics

Fraud detection remains a critical task in high-stakes domains such as finance and e-commerce, where undetected fraudulent transactions can lead to significant economic losses. In this study, we systematically compare the performance of four supervised learning models - Logistic Regression, Random Forest, Light Gradient Boosting Machine (LightGBM), and a Gated Recurrent Unit (GRU) network - on a large-scale, highly imbalanced online transaction dataset. While ensemble methods such as Random Forest and LightGBM demonstrated superior performance in both overall and class-specific metrics, Logistic Regression offered a reliable and interpretable baseline. The GRU model showed strong recall for the minority fraud class, though at the cost of precision, highlighting a trade-off relevant for real-world deployment. Our evaluation emphasizes not only weighted averages but also per-class precision, recall, and F1-scores, providing a nuanced view of each model's effectiveness in detecting rare but consequential fraudulent activity. The findings underscore the importance of choosing models based on the specific risk tolerance and operational needs of fraud detection systems.
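The gap between weighted averages and per-class scores described above can be illustrated with a small, self-contained sketch. The labels and predictions below are invented toy data (95 legitimate, 5 fraudulent transactions), not results from the study, and the helper names `per_class_report` and `weighted_f1` are hypothetical.

```python
def per_class_report(y_true, y_pred):
    """Return {label: (precision, recall, f1, support)} for every label seen."""
    report = {}
    for label in sorted(set(y_true) | set(y_pred)):
        pairs = list(zip(y_true, y_pred))
        tp = sum(1 for t, p in pairs if t == label and p == label)
        fp = sum(1 for t, p in pairs if t != label and p == label)
        fn = sum(1 for t, p in pairs if t == label and p != label)
        precision = tp / (tp + fp) if tp + fp else 0.0
        recall = tp / (tp + fn) if tp + fn else 0.0
        denom = precision + recall
        f1 = 2 * precision * recall / denom if denom else 0.0
        report[label] = (precision, recall, f1, tp + fn)
    return report


def weighted_f1(report):
    """Support-weighted mean of the per-class F1 scores."""
    total = sum(support for _, _, _, support in report.values())
    return sum(f1 * support for _, _, f1, support in report.values()) / total


# Toy data (assumption): class 0 = legitimate, class 1 = fraud.
y_true = [0] * 95 + [1] * 5
# The classifier raises one false alarm and catches only 2 of the 5 frauds.
y_pred = [0] * 94 + [1] + [1, 1, 0, 0, 0]

report = per_class_report(y_true, y_pred)
for label, (precision, recall, f1, support) in report.items():
    print(f"class {label}: P={precision:.3f} R={recall:.3f} F1={f1:.3f} n={support}")
print(f"weighted F1 = {weighted_f1(report):.3f}")
```

On this toy split the weighted F1 stays above 0.95 even though the fraud class reaches an F1 of only 0.5, which is why per-class reporting matters for rare classes.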

artificial intelligence, logistic regression, machine learning, (17 more...)

2505.22521

Country: North America > United States (0.68)

Genre: Research Report > New Finding (1.00)

Industry:

Law Enforcement & Public Safety > Fraud (1.00)
Information Technology > Services > e-Commerce Services (0.34)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Performance Analysis > Accuracy (0.97)
Information Technology > Artificial Intelligence > Machine Learning > Decision Tree Learning (0.91)
Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (0.91)

AnoF-Diff: One-Step Diffusion-Based Anomaly Detection for Forceful Tool Use

Lin, Yating, Huang, Zixuan, Yang, Fan, Berenson, Dmitry

Multivariate time-series anomaly detection, which is critical for identifying unexpected events, has been explored in the field of machine learning for several decades. However, directly applying these methods to data from forceful tool use tasks is challenging because streaming sensor data in the real world tends to be inherently noisy, exhibits non-stationary behavior, and varies across different tasks and tools. To address these challenges, we propose a method, AnoF-Diff, based on the diffusion model to extract force-torque features from time-series data and use force-torque features to detect anomalies. We compare our method with other state-of-the-art methods in terms of F1-score and Area Under the Receiver Operating Characteristic curve (AUROC) on four forceful tool-use tasks, demonstrating that our method has better performance and is more robust to a noisy dataset. We also propose the method of parallel anomaly score evaluation based on one-step diffusion and demonstrate how our method can be used for online anomaly detection in several forceful tool use experiments.

I. INTRODUCTION

As the development of robot sensing and machine learning technologies accelerates, multivariate time series analysis is becoming more and more critical in the robotics field. Robotic systems usually rely on current time-step sensor data for decision-making and control, which makes it possible to miss the potential temporal patterns over multiple time steps. Additionally, some sensor data, such as force-torque signals, require multiple time steps to capture dynamic behaviors and provide meaningful information.
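As a rough illustration of the AUROC metric named in the abstract above, the sketch below computes it as the probability that a randomly chosen anomalous sample receives a higher anomaly score than a randomly chosen normal one. The scores and labels are invented for illustration; this is not the AnoF-Diff scoring procedure, and the function name `auroc` is hypothetical.

```python
def auroc(scores, labels):
    """AUROC via pairwise comparison: P(score of anomaly > score of normal).

    Ties count as half a win. Labels: 1 = anomaly, 0 = normal.
    """
    anomalous = [s for s, y in zip(scores, labels) if y == 1]
    normal = [s for s, y in zip(scores, labels) if y == 0]
    wins = 0.0
    for a in anomalous:
        for n in normal:
            if a > n:
                wins += 1.0
            elif a == n:
                wins += 0.5
    return wins / (len(anomalous) * len(normal))


# Toy anomaly scores (assumption): higher means "more anomalous".
labels = [0, 0, 1, 1]
clean_scores = [0.1, 0.2, 0.8, 0.9]   # anomalies clearly separated
noisy_scores = [0.1, 0.9, 0.2, 0.8]   # one normal sample scores very high

print(auroc(clean_scores, labels))
print(auroc(noisy_scores, labels))
```

The pairwise form is quadratic in the number of samples, which is fine for a sketch; a rank-based implementation would be preferable for long sensor streams.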

artificial intelligence, data mining, machine learning, (18 more...)

2509.15153

Country: North America > United States (0.28)

Genre: Research Report (0.86)

Technology:

Information Technology > Data Science > Data Mining > Anomaly Detection (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Performance Analysis > Accuracy (0.88)

Rafferty, Amy, Ramaesh, Rishi, Rajan, Ajitha

Limitations of Public Chest Radiography Datasets for Artificial Intelligence: Label Quality, Domain Shift, Bias and Evaluation Challenges

Artificial intelligence has shown significant promise in chest radiography, where deep learning models can approach radiologist-level diagnostic performance. Progress has been accelerated by large public datasets such as MIMIC-CXR, ChestX-ray14, PadChest, and CheXpert, which provide hundreds of thousands of labelled images with pathology annotations. However, these datasets also present important limitations. Automated label extraction from radiology reports introduces errors, particularly in handling uncertainty and negation, and radiologist review frequently disagrees with assigned labels. In addition, domain shift and population bias restrict model generalisability, while evaluation practices often overlook clinically meaningful measures. We conduct a systematic analysis of these challenges, focusing on label quality, dataset bias, and domain shift. Our cross-dataset domain shift evaluation across multiple model architectures revealed substantial external performance degradation, with pronounced reductions in AUPRC and F1 scores relative to internal testing. To assess dataset bias, we trained a source-classification model that distinguished datasets with near-perfect accuracy, and performed subgroup analyses showing reduced performance for minority age and sex groups. Finally, expert review by two board-certified radiologists identified significant disagreement with public dataset labels. Our findings highlight important clinical weaknesses of current benchmarks and emphasise the need for clinician-validated datasets and fairer evaluation frameworks.
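The drop in AUPRC between internal and external testing mentioned above can be made concrete with a minimal sketch. It uses the step-wise average-precision estimate of AUPRC and assumes no tied scores; all scores and labels are invented toy values, not figures from the study, and `average_precision` is a hypothetical helper name.

```python
def average_precision(scores, labels):
    """Step-wise AUPRC estimate: mean of precision at each positive's rank.

    Assumes binary labels (1 = finding present) and no tied scores.
    """
    ranked = sorted(zip(scores, labels), key=lambda pair: pair[0], reverse=True)
    n_positive = sum(labels)
    hits = 0
    total = 0.0
    for rank, (_, label) in enumerate(ranked, start=1):
        if label == 1:
            hits += 1
            total += hits / rank  # precision among the top `rank` predictions
    return total / n_positive


# Toy example (assumption): the same model scored on two test sets.
internal_scores = [0.9, 0.8, 0.3, 0.2]
internal_labels = [1, 1, 0, 0]          # positives ranked first
external_scores = [0.9, 0.8, 0.7, 0.1]
external_labels = [1, 0, 1, 0]          # a negative outranks a positive

ap_internal = average_precision(internal_scores, internal_labels)
ap_external = average_precision(external_scores, external_labels)
print(f"internal AP = {ap_internal:.3f}, external AP = {ap_external:.3f}")
```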

artificial intelligence, machine learning, natural language, (16 more...)

2509.15107

Country: North America > United States (0.28)

Genre:

Research Report > New Finding (0.66)
Research Report > Experimental Study (0.46)

Industry:

Health & Medicine > Nuclear Medicine (1.00)
Health & Medicine > Diagnostic Medicine > Imaging (1.00)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning (1.00)
Information Technology > Artificial Intelligence > Natural Language (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Performance Analysis > Accuracy (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (1.00)

Popova, Iva, Gardi, Hamza A. A.

Credit Card Fraud Detection

Iva Popova, Hamza A. A. Gardi. ETIT - KIT, Germany; IIIT at ETIT - KIT, Germany.

Abstract

Credit card fraud remains a significant challenge due to class imbalance and fraudsters mimicking legitimate behavior. This study evaluates five machine learning models - Logistic Regression, Random Forest, XGBoost, K-Nearest Neighbors (KNN), and Multi-Layer Perceptron (MLP) - on a real-world dataset using undersampling, SMOTE, and a hybrid approach. Our models are evaluated on the original imbalanced test set to better reflect real-world performance. Results show that the hybrid method achieves the best balance between recall and precision, especially improving MLP and KNN performance.

Introduction

Financial fraud is a significant issue that has been continuously increasing over the past few years due to the ever-growing volume of online transactions conducted with credit cards. Credit card fraud (CCF) refers to a type of fraud in which an individual other than the cardholder unlawfully conducts transactions using a card that is stolen, lost, or otherwise misused [1]. CCF has resulted in billions of dollars in losses for banks and other online payment platforms. According to the Federal Trade Commission (FTC), there were 449,076 reports of CCF in 2024, representing a 7.8% increase from the previous year [2]. Given this trend, new methods must be employed to capture patterns and dependencies in the data.
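The resampling strategies named in the abstract above can be sketched in a few lines. The code below shows random undersampling of the majority class, the core SMOTE step (interpolating between a minority point and one of its nearest minority neighbours), and one possible way to combine them. How the paper's hybrid approach is actually configured is not stated here, so `hybrid_resample`, its `target` parameter, and the toy points are assumptions for illustration only.

```python
import random


def undersample(majority, n_keep, seed=0):
    """Randomly keep n_keep majority-class points (without replacement)."""
    return random.Random(seed).sample(majority, n_keep)


def smote(minority, n_new, k=3, seed=0):
    """Create n_new synthetic minority points by neighbour interpolation."""
    rng = random.Random(seed)
    synthetic = []
    for _ in range(n_new):
        i = rng.randrange(len(minority))
        x = minority[i]
        others = [m for j, m in enumerate(minority) if j != i]
        # k nearest minority neighbours of x by squared Euclidean distance.
        nearest = sorted(
            others, key=lambda m: sum((a - b) ** 2 for a, b in zip(x, m))
        )[:k]
        neighbour = rng.choice(nearest)
        u = rng.random()  # interpolation factor in [0, 1)
        synthetic.append(tuple(a + u * (b - a) for a, b in zip(x, neighbour)))
    return synthetic


def hybrid_resample(majority, minority, target, k=3, seed=0):
    """Assumed hybrid: shrink the majority and grow the minority to `target`."""
    kept = undersample(majority, min(target, len(majority)), seed)
    extra = smote(minority, max(0, target - len(minority)), k, seed)
    return kept, list(minority) + extra


# Toy 2-D data (assumption): 50 legitimate points, 4 fraud points.
majority = [(float(i), 5.0) for i in range(50)]
minority = [(0.0, 0.0), (1.0, 0.0), (0.0, 1.0), (1.0, 1.0)]

kept_majority, grown_minority = hybrid_resample(majority, minority, target=10)
print(len(kept_majority), len(grown_minority))
```

Consistent with the evaluation protocol described in the abstract, resampling of this kind would be applied to the training split only, leaving the test set at its original class ratio.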

artificial intelligence, dataset, machine learning, (16 more...)

2509.15044

Country: Europe > Germany (0.44)

Genre: Research Report > New Finding (1.00)

Industry:

Law Enforcement & Public Safety > Fraud (1.00)
Information Technology (1.00)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Performance Analysis > Accuracy (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks (1.00)

Tessema, Amsalu, Bayih, Tizazu, Azezew, Kassahun, Kassie, Ayenew

Data-Driven Prediction of Maternal Nutritional Status in Ethiopia Using Ensemble Machine Learning Models

Malnutrition among pregnant women is a major public health challenge in Ethiopia, increasing the risk of adverse maternal and neonatal outcomes. Traditional statistical approaches often fail to capture the complex and multidimensional determinants of nutritional status. This study develops a predictive model using ensemble machine learning techniques, leveraging data from the Ethiopian Demographic and Health Survey (2005-2020), comprising 18,108 records with 30 socio-demographic and health attributes. Data preprocessing included handling missing values, normalization, and balancing with SMOTE, followed by feature selection to identify key predictors. Several supervised ensemble algorithms including XGBoost, Random Forest, CatBoost, and AdaBoost were applied to classify nutritional status. Among them, the Random Forest model achieved the best performance, classifying women into four categories (normal, moderate malnutrition, severe malnutrition, and overnutrition) with 97.87% accuracy, 97.88% precision, 97.87% recall, 97.87% F1-score, and 99.86% ROC AUC. These findings demonstrate the effectiveness of ensemble learning in capturing hidden patterns from complex datasets and provide timely insights for early detection of nutritional risks. The results offer practical implications for healthcare providers, policymakers, and researchers, supporting data-driven strategies to improve maternal nutrition and health outcomes in Ethiopia.

artificial intelligence, ethiopia, machine learning, (14 more...)

2509.14945

Country:

Africa > Ethiopia (0.94)
Asia (0.94)

Genre: Research Report > New Finding (1.00)

Industry:

Health & Medicine > Therapeutic Area > Pediatrics/Neonatology (1.00)
Health & Medicine > Therapeutic Area > Obstetrics/Gynecology (1.00)
Health & Medicine > Consumer Health (1.00)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Performance Analysis > Accuracy (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Ensemble Learning (1.00)

Nowosadko, Konrad, Ruggeri, Franco, Terra, Ahmad

Self-Explaining Reinforcement Learning for Mobile Network Resource Allocation

Reinforcement Learning (RL) methods that incorporate deep neural networks (DNN), though powerful, often lack transparency. Their black-box characteristic hinders interpretability and reduces trustworthiness, particularly in critical domains. To address this challenge in RL tasks, we propose a solution based on Self-Explaining Neural Networks (SENNs) along with explanation extraction methods to enhance interpretability while maintaining predictive accuracy. Our approach targets low-dimensionality problems to generate robust local and global explanations of the model's behaviour. We evaluate the proposed method on the resource allocation problem in mobile networks, demonstrating that SENNs can constitute interpretable solutions with competitive performance. This work highlights the potential of SENNs to improve transparency and trust in AI-driven decision-making for low-dimensional tasks.

Interest in Explainable Artificial Intelligence (XAI) has been rapidly growing, facilitated by the need for transparency. Although powerful, Deep Neural Network (DNN) models often operate as black boxes, making it difficult to interpret their decisions, leading to a lack of trust among stakeholders and consequently hindering their applicability.

explanation, machine learning, reinforcement learning, (16 more...)

2509.14925

Genre: Research Report (0.82)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Neural Networks (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Reinforcement Learning (0.87)
Information Technology > Artificial Intelligence > Machine Learning > Performance Analysis > Accuracy (0.34)

Kurek, Izabela, Trejter, Wojciech, Frkovic, Stipe, Erdelez, Andro

[Re] Improving Interpretation Faithfulness for Vision Transformers

This work aims to reproduce the results of Faithful Vision Transformers (FViTs) proposed by Hu et al. (2024) alongside interpretability methods for Vision Transformers from Chefer et al. (2021) and Xu et al. (2022). We investigate claims made by Hu et al. (2024), namely that the usage of Diffusion Denoised Smoothing (DDS) improves interpretability robustness to (1) attacks in a segmentation task and (2) perturbation and attacks in a classification task. We also extend the original study by investigating the authors' claims that adding DDS to any interpretability method can improve its robustness under attack. This is tested on baseline methods and the recently proposed Attribution Rollout method.

artificial intelligence, machine learning research, robustness, (16 more...)

2509.14846

Country: Europe > Switzerland (0.28)

Genre: Research Report > New Finding (0.93)

Industry: Information Technology (0.46)

Technology:

Information Technology > Artificial Intelligence > Vision (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks (0.93)
Information Technology > Artificial Intelligence > Machine Learning > Performance Analysis > Accuracy (0.46)

SynBench: A Benchmark for Differentially Private Text Generation

Sun, Yidan, Schlegel, Viktor, Nandakumar, Srinivasan, Zahid, Iqra, Wu, Yuping, Wu, Yulong, Li, Hao, Zhang, Jie, Del-Pinto, Warren, Nenadic, Goran, Lam, Siew Kei, Bharath, Anil Anthony

Data-driven decision support in high-stakes domains like healthcare and finance faces significant barriers to data sharing due to regulatory, institutional, and privacy concerns. While recent generative AI models, such as large language models, have shown impressive performance in open-domain tasks, their adoption in sensitive environments remains limited by unpredictable behaviors and insufficient privacy-preserving datasets for benchmarking. Existing anonymization methods are often inadequate, especially for unstructured text, as redaction and masking can still allow re-identification. Differential Privacy (DP) offers a principled alternative, enabling the generation of synthetic data with formal privacy assurances. In this work, we address these challenges through three key contributions. First, we introduce a comprehensive evaluation framework with standardized utility and fidelity metrics, encompassing nine curated datasets that capture domain-specific complexities such as technical jargon, long-context dependencies, and specialized document structures. Second, we conduct a large-scale empirical study benchmarking state-of-the-art DP text generation methods and LLMs of varying sizes and different fine-tuning strategies, revealing that high-quality domain-specific synthetic data generation under DP constraints remains an unsolved challenge, with performance degrading as domain complexity increases. Third, we develop a membership inference attack (MIA) methodology tailored for synthetic text, providing first empirical evidence that the use of public datasets - potentially present in pre-training corpora - can invalidate claimed privacy guarantees. Our findings underscore the urgent need for rigorous privacy auditing and highlight persistent gaps between open-domain and specialist evaluations, informing responsible deployment of generative AI in privacy-sensitive, high-stakes settings.

large language model, machine learning, natural language, (16 more...)

2509.14594

Country:

North America (0.46)
Europe > United Kingdom (0.28)

Genre: Research Report > New Finding (1.00)

Industry:

Information Technology > Security & Privacy (1.00)
Health & Medicine (1.00)

Technology:

Information Technology > Artificial Intelligence > Natural Language > Large Language Model (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Performance Analysis > Accuracy (0.93)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning > Generative AI (0.54)

ALIGNS: Unlocking nomological networks in psychological measurement through a large language model

Larsen, Kai R., Yan, Sen, Mueller, Roland M., Sang, Lan, Rönkkö, Mikko, Starzl, Ravi, Edmondson, Donald

Psychological measurement is critical to many disciplines. Despite advances in measurement, building nomological networks, theoretical maps of how concepts and measures relate to establish validity, remains a challenge 70 years after Cronbach and Meehl proposed them as fundamental to validation. This limitation has practical consequences: clinical trials may fail to detect treatment effects, and public policy may target the wrong outcomes. We introduce Analysis of Latent Indicators to Generate Nomological Structures (ALIGNS), a large language model-based system trained with validated questionnaire measures. ALIGNS provides three comprehensive nomological networks containing over 550,000 indicators across psychology, medicine, social policy, and other fields. This represents the first application of large language models to solve a foundational problem in measurement validation. We report classification accuracy tests used to develop the model, as well as three evaluations. In the first evaluation, the widely used NIH PROMIS anxiety and depression instruments are shown to converge into a single dimension of emotional distress. The second evaluation examines child temperament measures and identifies four potential dimensions not captured by current frameworks, and questions one existing dimension. The third evaluation, an applicability check, engages expert psychometricians who assess the system's importance, accessibility, and suitability. ALIGNS is freely available at nomologicalnetwork.org, complementing traditional validation methods with large-scale nomological analysis.

large language model, machine learning, natural language, (20 more...)

2509.09723

Country:

North America > United States (1.00)
Europe (0.93)

Genre: Research Report > Experimental Study (1.00)

Industry:

Health & Medicine > Therapeutic Area > Neurology (1.00)
Health & Medicine > Pharmaceuticals & Biotechnology (0.88)
Health & Medicine > Therapeutic Area > Psychiatry/Psychology > Mental Health (0.49)
Government > Regional Government > North America Government > United States Government (0.46)

Technology:

Information Technology > Artificial Intelligence > Natural Language > Large Language Model (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.47)
Information Technology > Artificial Intelligence > Machine Learning > Performance Analysis > Accuracy (0.34)

Wiedeman, Christopher, Sarmakeeva, Anastasiia, Sizikova, Elena, Filienko, Daniil, Lago, Miguel, Delfino, Jana G., Badano, Aldo

T-SYNTH: A Knowledge-Based Dataset of Synthetic Breast Images

Responsible for approximately two million new cases and over six hundred thousand deaths in 2022 alone (Sung et al., 2021), breast cancer remains a prominent global health concern, and is expected to account for nearly one-third of all newly diagnosed cancers among women in the United States (DeSantis et al., 2016). According to the most recent report from the International Agency for Research on Cancer (Bray et al., 2024), it is one of the most widespread cancers diagnosed worldwide, both in the number of cases and associated deaths. Consequently, medical imaging techniques are indispensable for screening, diagnosis, and further research into the disease. Historically, the most common imaging technique for breast cancer screening has been digital mammography (DM), in which a 2D x-ray projection of a compressed breast is taken. Digital breast tomosynthesis (DBT), a pseudo-3D imaging technique, has been increasingly adopted, demonstrating improved screening performance (Asbeutah et al., 2019; Sprague et al., 2023).

artificial intelligence, deep learning, machine learning, (17 more...)

2507.04038

Country: North America > United States (1.00)

Genre: Research Report (0.82)

Industry:

Health & Medicine > Diagnostic Medicine > Imaging (1.00)
Health & Medicine > Therapeutic Area > Oncology > Breast Cancer (0.92)

Technology:

Information Technology > Artificial Intelligence > Vision (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Performance Analysis > Accuracy (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (1.00)