
Collaborating Authors

Cavus, Mustafa


Predictive Multiplicity in Survival Models: A Method for Quantifying Model Uncertainty in Predictive Maintenance Applications

arXiv.org Machine Learning

In many applications, especially those involving prediction, models may yield near-optimal performance yet significantly disagree on individual-level outcomes. This phenomenon, known as predictive multiplicity, has been formally defined in binary, probabilistic, and multi-target classification, and undermines the reliability of predictive systems. However, its implications remain unexplored in the context of survival analysis, which involves estimating the time until a failure or similar event while properly handling censored data. We frame predictive multiplicity as a critical concern in survival-based models and introduce formal measures -- ambiguity, discrepancy, and obscurity -- to quantify it. This is particularly relevant for downstream tasks such as maintenance scheduling, where precise individual risk estimates are essential. Understanding and reporting predictive multiplicity helps build trust in models deployed in high-stakes environments. We apply our methodology to benchmark datasets from predictive maintenance, extending the notion of multiplicity to survival models. Our findings show that ambiguity steadily increases, reaching up to 40-45% of observations; discrepancy is lower but exhibits a similar trend; and obscurity remains mild and concentrated in a few models. These results demonstrate that multiple accurate survival models may yield conflicting estimates of failure risk and degradation progression for the same equipment. This highlights the need to explicitly measure and communicate predictive multiplicity to ensure reliable decision-making in process health management.
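The formal definitions of ambiguity, discrepancy, and obscurity are given in the paper; the following is only a minimal sketch, under assumed simplifications, of how decision-level ambiguity and discrepancy could be computed over a set of near-optimal survival models. The synthetic survival probabilities, the maintenance threshold, and the disagreement rules below are illustrative assumptions, not the paper's formulation.

```python
import numpy as np

rng = np.random.default_rng(0)

# Predicted survival probabilities at a fixed horizon t for n machines,
# from a reference model and k alternative near-optimal models (synthetic).
n, k = 200, 10
reference = rng.uniform(0.2, 0.9, size=n)
alternatives = np.clip(reference + rng.normal(0.0, 0.08, size=(k, n)), 0.0, 1.0)

threshold = 0.5  # assumed decision rule: schedule maintenance if S(t) < 0.5
ref_decision = reference < threshold
alt_decisions = alternatives < threshold
flips = alt_decisions != ref_decision

# Ambiguity: share of machines whose decision is flipped by at least one
# alternative model in the near-optimal set.
ambiguity = flips.any(axis=0).mean()

# Discrepancy: largest share of flipped decisions produced by any single model.
discrepancy = flips.mean(axis=1).max()

print(f"ambiguity:   {ambiguity:.2%}")
print(f"discrepancy: {discrepancy:.2%}")
```

In a real pipeline, the reference and alternative predictions would come from fitted survival models whose performance lies within a chosen tolerance of the best model.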


The Role of Hyperparameters in Predictive Multiplicity

arXiv.org Machine Learning

This paper investigates the critical role of hyperparameters in predictive multiplicity, where different machine learning models trained on the same dataset yield divergent predictions for identical inputs. These inconsistencies can seriously impact high-stakes decisions such as credit assessments, hiring, and medical diagnoses. Focusing on six widely used models for tabular data - Elastic Net, Decision Tree, k-Nearest Neighbor, Support Vector Machine, Random Forests, and Extreme Gradient Boosting - we explore how hyperparameter tuning influences predictive multiplicity, as expressed by the distribution of prediction discrepancies across benchmark datasets. Key hyperparameters such as lambda in Elastic Net, gamma in Support Vector Machines, and alpha in Extreme Gradient Boosting play a crucial role in shaping predictive multiplicity, often compromising the stability of predictions within specific algorithms. Our experiments on 21 benchmark datasets reveal that tuning these hyperparameters leads to notable performance improvements but also increases prediction discrepancies, with Extreme Gradient Boosting exhibiting the highest discrepancy and substantial prediction instability. This highlights the trade-off between performance optimization and prediction consistency, raising concerns about the risk of arbitrary predictions. These findings provide insight into how hyperparameter optimization leads to predictive multiplicity. While predictive multiplicity allows prioritizing domain-specific objectives such as fairness and reduces reliance on a single model, it also complicates decision-making, potentially leading to arbitrary or unjustified outcomes.
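As a hedged illustration of the kind of experiment described above (not the paper's pipeline, which covers six model families and 21 datasets), the sketch below varies a single SVM hyperparameter, keeps the fits whose accuracy is close to the best one, and reports how often the retained fits disagree on individual test predictions. The one-percentage-point tolerance is an arbitrary assumption.

```python
import numpy as np
from sklearn.datasets import make_classification
from sklearn.model_selection import train_test_split
from sklearn.svm import SVC

X, y = make_classification(n_samples=1000, n_features=20, random_state=42)
X_tr, X_te, y_tr, y_te = train_test_split(X, y, test_size=0.3, random_state=42)

preds, accs = [], []
for gamma in [0.001, 0.01, 0.1, 1.0]:
    model = SVC(gamma=gamma).fit(X_tr, y_tr)
    preds.append(model.predict(X_te))
    accs.append(model.score(X_te, y_te))

best = max(accs)
# Retain fits within one percentage point of the best accuracy (a Rashomon-style set).
kept = np.array([p for p, a in zip(preds, accs) if a >= best - 0.01])

# Discrepancy proxy: share of test points on which the retained fits disagree.
disagreement = np.mean(np.ptp(kept, axis=0) > 0)
print(f"fits kept: {len(kept)}, disagreement rate: {disagreement:.2%}")
```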


Rashomon perspective for measuring uncertainty in the survival predictive maintenance models

arXiv.org Artificial Intelligence

The prediction of the Remaining Useful Life (RUL) of aircraft engines is a critical area in high-reliability sectors such as aerospace and defense. Early failure predictions help ensure operational continuity, reduce maintenance costs, and prevent unexpected failures. Traditional regression models struggle with censored data, which can lead to biased predictions. Survival models, on the other hand, effectively handle censored data, improving predictive accuracy in maintenance processes. This paper introduces a novel approach based on the Rashomon perspective, which considers multiple models that achieve similar performance rather than relying on a single best model. This enables uncertainty quantification in survival probability predictions and enhances decision-making in predictive maintenance. The Rashomon survival curve is introduced to represent the range of survival probability estimates, providing insights into model agreement and uncertainty over time. The results on the CMAPSS dataset demonstrate that relying solely on a single model for RUL estimation may increase risk in some scenarios. The censoring levels significantly impact prediction uncertainty, with longer censoring times leading to greater variability in survival probabilities. These findings underscore the importance of incorporating model multiplicity in predictive maintenance frameworks to achieve more reliable and robust failure predictions. This paper contributes to uncertainty quantification in RUL prediction and highlights the Rashomon perspective as a powerful tool for predictive modeling.
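A Rashomon survival curve can be pictured as a band spanning the survival estimates of similarly performing models. The sketch below is purely illustrative: the "models" are synthetic exponential survival curves with slightly different hazard rates rather than fits to the CMAPSS data, and the band is simply the pointwise minimum and maximum across them.

```python
import numpy as np
import matplotlib.pyplot as plt

t = np.linspace(0, 300, 301)              # operating cycles
hazards = np.linspace(0.008, 0.012, 8)    # stand-ins for near-equivalent models
curves = np.exp(-np.outer(hazards, t))    # S(t) = exp(-lambda * t)

lower, upper = curves.min(axis=0), curves.max(axis=0)

plt.fill_between(t, lower, upper, alpha=0.3, label="Rashomon-style band")
plt.plot(t, curves.mean(axis=0), label="mean survival estimate")
plt.xlabel("cycles")
plt.ylabel("survival probability")
plt.legend()
plt.tight_layout()
plt.show()
```

A wide band at a given cycle indicates that equally plausible models disagree on the failure risk at that point in time.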


datadriftR: An R Package for Concept Drift Detection in Predictive Models

arXiv.org Machine Learning

Predictive models often face performance degradation due to evolving data distributions, a phenomenon known as data drift. Among its forms, concept drift, where the relationship between explanatory variables and the response variable changes, is particularly challenging to detect and adapt to. Traditional drift detection methods often rely on metrics such as accuracy or variable distributions, which may fail to capture subtle but significant conceptual changes. This paper introduces datadriftR, an R package designed to detect concept drift, and proposes a novel method called Profile Drift Detection (PDD) that enables both drift detection and an enhanced understanding of the cause behind the drift by leveraging an explainable AI tool, Partial Dependence Profiles (PDPs). The PDD method, central to the package, quantifies changes in PDPs through novel metrics, ensuring sensitivity to shifts in the data stream without excessive computational costs. This approach aligns with MLOps practices, emphasizing model monitoring and adaptive retraining in dynamic environments. The experiments across synthetic and real-world datasets demonstrate that PDD outperforms existing methods by maintaining high accuracy while effectively balancing sensitivity and stability. The results highlight its capability to adaptively retrain models in dynamic environments, making it a robust tool for real-time applications. The paper concludes by discussing the advantages, limitations, and future extensions of the package for broader use cases.
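The PDD metrics themselves are defined in the R package; the Python sketch below only conveys the underlying intuition under assumed simplifications: fit a model on a reference window and on a newer window, compute a partial dependence profile for one feature in each, and flag drift when the profiles diverge beyond an arbitrary threshold. The threshold and distance measure are illustrative assumptions, not the package's metrics.

```python
import numpy as np
from sklearn.ensemble import RandomForestRegressor

def pdp(model, X, feature, grid):
    """Average prediction when `feature` is forced to each grid value."""
    profile = []
    for value in grid:
        X_mod = X.copy()
        X_mod[:, feature] = value
        profile.append(model.predict(X_mod).mean())
    return np.array(profile)

rng = np.random.default_rng(1)
X_ref = rng.normal(size=(500, 3))
y_ref = 2 * X_ref[:, 0] + rng.normal(scale=0.1, size=500)

# Simulated concept drift: the same feature now has the opposite effect.
X_new = rng.normal(size=(500, 3))
y_new = -2 * X_new[:, 0] + rng.normal(scale=0.1, size=500)

grid = np.linspace(-2, 2, 20)
model_ref = RandomForestRegressor(random_state=0).fit(X_ref, y_ref)
model_new = RandomForestRegressor(random_state=0).fit(X_new, y_new)

profile_ref = pdp(model_ref, X_ref, feature=0, grid=grid)
profile_new = pdp(model_new, X_new, feature=0, grid=grid)

# Flag drift when the profiles diverge beyond an arbitrary threshold.
distance = np.mean(np.abs(profile_ref - profile_new))
print("drift detected" if distance > 0.5 else "no drift",
      f"(PDP distance = {distance:.2f})")
```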


Investigating the Impact of Balancing, Filtering, and Complexity on Predictive Multiplicity: A Data-Centric Perspective

arXiv.org Machine Learning

The Rashomon effect presents a significant challenge in model selection. It occurs when multiple models achieve similar performance on a dataset but produce different predictions, resulting in predictive multiplicity. This is especially problematic in high-stakes environments, where arbitrary model outcomes can have serious consequences. Traditional model selection methods prioritize accuracy and fail to address this issue. Factors such as class imbalance and irrelevant variables further complicate the situation, making it harder for models to provide trustworthy predictions. Data-centric AI approaches can mitigate these problems by prioritizing data optimization, particularly through preprocessing techniques. However, recent studies suggest preprocessing methods may inadvertently inflate predictive multiplicity. This paper investigates how data preprocessing techniques like balancing and filtering methods impact predictive multiplicity and model stability, considering the complexity of the data. We conduct experiments on 21 real-world datasets, applying various balancing and filtering techniques, and assess the level of predictive multiplicity these methods introduce by leveraging the Rashomon effect. Additionally, we examine how filtering techniques reduce redundancy and enhance model generalization. The findings provide insights into the relationship between balancing methods, data complexity, and predictive multiplicity, demonstrating how data-centric AI strategies can improve model performance.
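A rough sketch of the kind of before/after comparison described above, under assumed simplifications: random seeds stand in for a Rashomon set, SMOTE (from the imbalanced-learn package) stands in for the balancing step, and the share of test points on which the retrained models disagree serves as a multiplicity proxy. None of this reproduces the paper's experimental design.

```python
import numpy as np
from sklearn.datasets import make_classification
from sklearn.ensemble import RandomForestClassifier
from sklearn.model_selection import train_test_split
from imblearn.over_sampling import SMOTE

X, y = make_classification(n_samples=2000, weights=[0.9, 0.1], random_state=0)
X_tr, X_te, y_tr, y_te = train_test_split(X, y, stratify=y, random_state=0)

def disagreement(X_train, y_train, n_models=10):
    """Share of test points on which differently seeded forests disagree."""
    preds = np.array([
        RandomForestClassifier(n_estimators=100, random_state=seed)
        .fit(X_train, y_train)
        .predict(X_te)
        for seed in range(n_models)
    ])
    return np.mean(np.ptp(preds, axis=0) > 0)

X_bal, y_bal = SMOTE(random_state=0).fit_resample(X_tr, y_tr)
print(f"disagreement on original data: {disagreement(X_tr, y_tr):.2%}")
print(f"disagreement after SMOTE:      {disagreement(X_bal, y_bal):.2%}")
```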


Explainable bank failure prediction models: Counterfactual explanations to reduce the failure risk

arXiv.org Artificial Intelligence

The accuracy and understandability of bank failure prediction models are crucial. While interpretable models like logistic regression are favored for their explainability, complex models such as random forest, support vector machines, and deep learning offer higher predictive performance but lower explainability. These models, known as black boxes, make it difficult to derive actionable insights. To address this challenge, using counterfactual explanations is suggested. These explanations demonstrate how changes in input variables can alter the model output and suggest ways to mitigate bank failure risk. The key challenge lies in selecting the most effective method for generating useful counterfactuals, which should demonstrate validity, proximity, sparsity, and plausibility. The paper evaluates several counterfactual generation methods: WhatIf, Multi Objective, and Nearest Instance Counterfactual Explanation, and also explores resampling methods like undersampling, oversampling, SMOTE, and the cost-sensitive approach to address data imbalance in bank failure prediction in the US. The results indicate that the Nearest Instance Counterfactual Explanation method yields higher-quality counterfactual explanations, particularly when combined with the cost-sensitive approach. Overall, the Multi Objective Counterfactual and Nearest Instance Counterfactual Explanation methods outperform the others on the validity, proximity, and sparsity metrics, with the cost-sensitive approach providing the most desirable counterfactual explanations. These findings highlight the variability in the performance of counterfactual generation methods across different balancing strategies and machine learning models, offering valuable strategies to enhance the utility of black box bank failure prediction models.
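To make the nearest-instance idea concrete, here is a hedged toy sketch: for a bank the model predicts to fail, find the most similar instance the model predicts as non-failing and report the features that differ. The feature names and data are hypothetical, and the paper itself relies on dedicated counterfactual generation methods rather than this plain nearest-neighbour search.

```python
import numpy as np
from sklearn.ensemble import RandomForestClassifier

rng = np.random.default_rng(0)
features = ["capital_ratio", "npl_ratio", "roa", "liquidity_ratio"]  # hypothetical
X = rng.normal(size=(500, 4))
y = (X[:, 0] - X[:, 1] + 0.5 * X[:, 2] < -0.5).astype(int)  # 1 = failure (synthetic rule)

model = RandomForestClassifier(random_state=0).fit(X, y)
preds = model.predict(X)

query = X[preds == 1][0]      # a bank the model predicts to fail
candidates = X[preds == 0]    # instances the model predicts as non-failing
nearest = candidates[np.argmin(np.linalg.norm(candidates - query, axis=1))]

# Report the feature changes implied by the nearest non-failing instance.
for name, before, after in zip(features, query, nearest):
    if abs(before - after) > 0.1:
        print(f"{name}: {before:.2f} -> {after:.2f}")
```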


An Experimental Study on the Rashomon Effect of Balancing Methods in Imbalanced Classification

arXiv.org Artificial Intelligence

Predictive models may generate biased predictions when classifying imbalanced datasets. This happens when the model favors the majority class, leading to low performance in accurately predicting the minority class. To address this issue, balancing or resampling methods are critical pre-processing steps in the modeling process. However, the functionality of these methods has been debated and questioned in recent years. In particular, many candidate models may exhibit very similar predictive performance in model selection, which is called the Rashomon effect. Selecting one of them without considering predictive multiplicity, that is, the case in which equally accurate models yield conflicting predictions for the same sample, may mean forgoing what another model has to offer. In this study, in addition to the existing debates, the impact of balancing methods on predictive multiplicity is examined through the Rashomon effect. This matters because blindly selecting a model from a set of approximately equally accurate models is risky and may lead to serious problems in model selection, validation, and explanation. To tackle this matter, we conducted experiments on real datasets to observe the impact of balancing methods on predictive multiplicity through the Rashomon effect. Our findings showed that balancing methods inflate predictive multiplicity and yield varying results. To monitor the trade-off between performance and predictive multiplicity and conduct the modeling process responsibly, we propose using the extended performance-gain plot for the Rashomon effect.
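The extended performance-gain plot is defined in the paper; the sketch below only illustrates the general shape of such a monitoring plot, placing each balancing method by its performance gain and the predictive multiplicity it introduces. All numbers are made-up placeholders, not results, and the axes are assumptions about how such a plot could be laid out.

```python
import matplotlib.pyplot as plt

# Made-up placeholder values, not results from the paper.
methods = ["none", "undersampling", "oversampling", "SMOTE"]
performance_gain = [0.00, 0.03, 0.04, 0.05]   # hypothetical accuracy gains
multiplicity = [0.05, 0.09, 0.12, 0.15]       # hypothetical disagreement rates

fig, ax = plt.subplots()
ax.scatter(performance_gain, multiplicity)
for method, gain, mult in zip(methods, performance_gain, multiplicity):
    ax.annotate(method, (gain, mult), textcoords="offset points", xytext=(5, 5))
ax.set_xlabel("performance gain over the unbalanced baseline")
ax.set_ylabel("predictive multiplicity (disagreement rate)")
plt.tight_layout()
plt.show()
```

Reading the plot, a method that sits far to the right but only slightly higher than the baseline offers a gain without a large multiplicity cost.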


Glocal Explanations of Expected Goal Models in Soccer

arXiv.org Machine Learning

In soccer, it is not uncommon for one team to dominate a match, creating many chances to score but failing to do so, while the opposing team manages to convert one of their few chances into a goal and win the match. Thus, the use of traditional end-of-match statistics is often argued against, because the number of shots, ball possession percentage, and shots inside the opponent's penalty area do not always accurately reflect the outcome of the match. The rapid pace of technological advancements in data collection, storage, and analysis has had a revolutionary impact on soccer analytics over the last decade. Thanks to these advancements, soccer data is collected in two main forms: event data consists of ball-related events and where on the field they occurred, such as shots, passes, tackles, and dribbles, while tracking data consists of the positions of players and the ball throughout play on the pitch. The technological revolution has made it possible to propose a large number of key performance indicators to measure different aspects of the game, such as pass evaluation, quantification of controlled space, shot evaluation, and goal-scoring opportunities using possession values.


The Effect of Balancing Methods on Model Behavior in Imbalanced Classification Problems

arXiv.org Artificial Intelligence

Imbalanced data poses a significant challenge in classification as model performance is affected by insufficient learning from minority classes. Balancing methods are often used to address this problem. However, such techniques can lead to problems such as overfitting or loss of information. This study addresses a more challenging aspect of balancing methods: their impact on model behavior. To capture these changes, Explainable Artificial Intelligence tools are used to compare models trained on datasets before and after balancing. In addition to the variable importance method, this study uses the partial dependence profile and accumulated local effects techniques. Real and simulated datasets are tested, and an open-source Python package edgaro is developed to facilitate this analysis. The results obtained show significant changes in model behavior due to balancing methods, which can bias models toward a balanced distribution. These findings confirm that balancing analysis should go beyond model performance comparisons to achieve higher reliability of machine learning models. Therefore, we propose a new method, the performance gain plot, to support an informed data balancing strategy and an optimal selection of the balancing method by analyzing the measure of change in model behavior versus performance gain.
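A minimal sketch of this kind of behavioral comparison, without the edgaro API (whose interface is not shown here): fit the same classifier on the original and on a SMOTE-balanced version of a synthetic dataset and overlay the partial dependence of one feature using scikit-learn's plotting utilities. The dataset, the choice of SMOTE, and the single inspected feature are illustrative assumptions.

```python
import matplotlib.pyplot as plt
from sklearn.datasets import make_classification
from sklearn.ensemble import RandomForestClassifier
from sklearn.inspection import PartialDependenceDisplay
from imblearn.over_sampling import SMOTE

X, y = make_classification(n_samples=2000, weights=[0.9, 0.1], random_state=0)
X_bal, y_bal = SMOTE(random_state=0).fit_resample(X, y)

model_raw = RandomForestClassifier(random_state=0).fit(X, y)
model_bal = RandomForestClassifier(random_state=0).fit(X_bal, y_bal)

# Overlay the partial dependence of feature 0 for both fits.
disp = PartialDependenceDisplay.from_estimator(
    model_raw, X, features=[0], line_kw={"label": "trained on original data"})
PartialDependenceDisplay.from_estimator(
    model_bal, X, features=[0], ax=disp.axes_,
    line_kw={"label": "trained after SMOTE", "color": "red"})
disp.axes_[0, 0].legend()
plt.tight_layout()
plt.show()
```

A visible shift between the two curves signals that balancing has changed not only the performance but the learned relationship itself.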


Explainable expected goal models for performance analysis in football analytics

arXiv.org Artificial Intelligence

In modern football, the expected goal (xG) provides a more representative measure of team and player performance than the score, and it suits the low-scoring nature of the game. The score of a match involves randomness and often may not represent the performance of the teams and players, so alternative statistics such as shots on target, ball possession, and dribbles have become popular in recent years. To measure the probability of a shot being a goal, an expected goal model is trained on several features derived from event and tracking football data. The selection of these features, the size and date range of the data, and the choice of model are parameters that may affect the model's performance. Using black-box machine learning models to increase predictive performance decreases interpretability, causing a loss of the information that can be gathered from the model. This paper proposes an accurate expected goal model trained on 315,430 shots from seven seasons (2014-15 to 2020-21) of the top-five European football leagues. Moreover, this model is explained using an explainable artificial intelligence tool to obtain an explainable expected goal model for evaluating team or player performance. To the best of our knowledge, this is the first paper to demonstrate a practical application of an explainable artificial intelligence tool, aggregated profiles, to explain a group of observations on an accurate expected goal model for monitoring team and player performance. Moreover, these methods can be generalized to other sports.
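As a hedged, self-contained toy example of the expected goal idea (not the paper's model, data, or aggregated-profiles explanations): fit a classifier on synthetic shot features, treat the predicted goal probability as xG, and compare each hypothetical player's total xG with the goals actually scored. The feature names, players, and generating process are invented for illustration.

```python
import numpy as np
import pandas as pd
from sklearn.ensemble import GradientBoostingClassifier

rng = np.random.default_rng(7)
n = 5000
shots = pd.DataFrame({
    "distance": rng.uniform(5, 35, n),           # metres from goal (synthetic)
    "angle": rng.uniform(0.1, 1.4, n),           # shooting angle in radians (synthetic)
    "player": rng.choice(["A", "B", "C"], n),    # hypothetical players
})
# Synthetic generating process: goals become less likely with distance,
# more likely with a wider shooting angle.
p_goal = 1 / (1 + np.exp(0.15 * shots["distance"] - 2 * shots["angle"]))
shots["goal"] = rng.binomial(1, p_goal.to_numpy())

model = GradientBoostingClassifier().fit(shots[["distance", "angle"]], shots["goal"])
shots["xg"] = model.predict_proba(shots[["distance", "angle"]])[:, 1]

# Player-level performance: goals scored above or below expectation.
summary = shots.groupby("player").agg(goals=("goal", "sum"), xg=("xg", "sum"))
summary["over_performance"] = summary["goals"] - summary["xg"]
print(summary.round(1))
```

Players whose over_performance is consistently positive convert chances at a higher rate than the model expects; explanation tools such as aggregated profiles can then be applied to the fitted model itself.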