AITopics | Groll, Andreas

Groll, Andreas

Time-to-event prediction for grouped variables using Exclusive Lasso

Ravi, Dayasri, Groll, Andreas

arXiv.org Machine Learning, Apr-2-2025

The integration of high-dimensional genomic data and clinical data into time-to-event prediction models has gained significant attention due to the growing availability of these datasets. Traditionally, a Cox regression model is employed, concatenating various covariate types linearly. Given that much of the data may be redundant or irrelevant, feature selection through penalization is often desirable. A notable characteristic of these datasets is their organization into blocks of distinct data types, such as methylation and clinical predictors, which requires selecting a subset of covariates from each group due to high intra-group correlations. For this reason, we propose utilizing Exclusive Lasso regularization in place of standard Lasso penalization. We apply our methodology to a real-life cancer dataset, demonstrating enhanced survival prediction performance compared to the conventional Cox regression model.
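The difference between the two penalties named in this abstract can be illustrated numerically. The sketch below uses the commonly cited form of the Exclusive Lasso penalty, the sum over groups of the squared within-group L1 norm; the abstract does not state the exact scaling the authors use, so the factor `lam`, the function names, and the toy coefficients are assumptions for illustration only.

```python
from collections import defaultdict


def exclusive_lasso_penalty(beta, groups, lam=1.0):
    """Return lam * sum over groups of (sum of |beta_j| within the group) ** 2.

    Assumed textbook form; the paper's exact scaling is not given in the abstract.
    """
    group_l1 = defaultdict(float)
    for coef, group in zip(beta, groups):
        group_l1[group] += abs(coef)
    return lam * sum(total ** 2 for total in group_l1.values())


def lasso_penalty(beta, lam=1.0):
    """Return the standard Lasso penalty lam * sum of |beta_j|."""
    return lam * sum(abs(coef) for coef in beta)


# Hypothetical block structure: two clinical and two methylation covariates.
groups = ["clinical", "clinical", "methylation", "methylation"]
beta_spread = [1.0, 0.0, 1.0, 0.0]  # one active coefficient in each group
beta_packed = [1.0, 1.0, 0.0, 0.0]  # both active coefficients in one group

print(lasso_penalty(beta_spread), lasso_penalty(beta_packed))
print(exclusive_lasso_penalty(beta_spread, groups),
      exclusive_lasso_penalty(beta_packed, groups))
```

The standard Lasso cannot tell the two coefficient vectors apart, while the Exclusive Lasso charges more when the active coefficients sit in the same group. This is the mechanism that encourages selecting a few covariates from every block.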

artificial intelligence, lasso, machine learning, (15 more...)

arXiv.org Machine Learning

2504.0152

Genre: Research Report > Experimental Study (1.00)

Industry:

Health & Medicine > Therapeutic Area > Oncology (1.00)
Health & Medicine > Pharmaceuticals & Biotechnology (1.00)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (0.55)
Information Technology > Artificial Intelligence > Machine Learning > Performance Analysis (0.46)

A Machine Learning-based Anomaly Detection Framework in Life Insurance Contracts

Groll, Andreas, Khanna, Akshat, Zeldin, Leonid

arXiv.org Artificial Intelligence, Nov-26-2024

Life insurance, like other forms of insurance, relies heavily on large volumes of data. The business model is based on an exchange where companies receive payments in return for the promise to provide coverage in case of an accident. Thus, trust in the integrity of the data stored in databases is crucial. One method to ensure data reliability is the automatic detection of anomalies. While this approach is highly useful, it is also challenging due to the scarcity of labeled data that distinguish between normal and anomalous contracts or interactions. This manuscript discusses several classical and modern unsupervised anomaly detection methods and compares their performance across two different datasets. In order to facilitate the adoption of these methods by companies, this work also explores ways to automate the process, making it accessible even to non-data scientists.

anomaly, data mining, machine learning, (20 more...)

arXiv.org Artificial Intelligence

2411.17495

Genre: Research Report (0.64)

Industry:

Banking & Finance > Insurance (1.00)
Health & Medicine > Therapeutic Area > Endocrinology (0.46)

Technology:

Information Technology > Data Science > Data Mining > Anomaly Detection (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning > Clustering (0.68)

Modeling and Prediction of the UEFA EURO 2024 via Combined Statistical Learning Approaches

Groll, Andreas, Hvattum, Lars M., Ley, Christophe, Sternemann, Jonas, Schauberger, Gunther, Zeileis, Achim

arXiv.org Artificial Intelligence, Oct-1-2024

In this work, three fundamentally different machine learning models are combined to create a new, joint model for forecasting the UEFA EURO 2024. To this end, a generalized linear model, a random forest model, and an extreme gradient boosting model are used to predict the number of goals a team scores in a match. The three models are trained on the match results of the UEFA EUROs 2004-2020, with additional covariates characterizing the teams for each tournament as well as three enhanced variables derived from different ranking methods for football teams. The first enhanced variable is based on historic match data from national teams, the second is based on the bookmakers' tournament winning odds of all participating teams, and the third is based on historic match data of individual players both for club and international matches, resulting in player ratings. Then, based on current covariate information of the participating teams, the final trained model is used to predict the UEFA EURO 2024. For this purpose, the tournament is simulated 100,000 times, based on the estimated expected number of goals for all possible matches, from which probabilities across the different tournament stages are derived. Our combined model identifies France as the clear favourite with a winning probability of 19.2%, followed by England (16.7%) and host Germany (13.7%).
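The simulation step described in this abstract can be sketched for a single match. The example assumes that each team's goals are drawn independently from a Poisson distribution whose mean is the predicted expected number of goals; the abstract does not name the sampling distribution, so this choice, the function names, and the example means are all assumptions rather than the authors' procedure.

```python
import math
import random


def sample_poisson(mean, rng):
    """Draw one Poisson variate with Knuth's multiplication method (suits small means)."""
    threshold = math.exp(-mean)
    count, product = 0, rng.random()
    while product > threshold:
        count += 1
        product *= rng.random()
    return count


def outcome_probabilities(mean_a, mean_b, n_sims=100_000, seed=0):
    """Estimate win/draw/loss frequencies for team A by repeated simulation."""
    rng = random.Random(seed)
    wins_a = draws = wins_b = 0
    for _ in range(n_sims):
        goals_a = sample_poisson(mean_a, rng)
        goals_b = sample_poisson(mean_b, rng)
        if goals_a > goals_b:
            wins_a += 1
        elif goals_a < goals_b:
            wins_b += 1
        else:
            draws += 1
    return {
        "win_a": wins_a / n_sims,
        "draw": draws / n_sims,
        "win_b": wins_b / n_sims,
    }


# Hypothetical expected goals for one match; real values would come from the fitted model.
print(outcome_probabilities(mean_a=1.8, mean_b=0.9, n_sims=10_000))
```

Repeating this for every fixture along the tournament bracket, and counting how often each team reaches each stage, would yield stage-wise probabilities of the kind the abstract reports.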

artificial intelligence, machine learning, prediction, (17 more...)

arXiv.org Artificial Intelligence

2410.09068

Country:

Europe > Germany (0.49)
Europe > United Kingdom > England (0.25)

Genre: Research Report (0.64)

Industry: Leisure & Entertainment > Sports > Soccer (1.00)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Ensemble Learning (0.89)
Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning > Regression (0.46)

Machine Learning for Multi-Output Regression: When should a holistic multivariate approach be preferred over separate univariate ones?

Schmid, Lena, Gerharz, Alexander, Groll, Andreas, Pauly, Markus

arXiv.org Machine Learning, Jan-14-2022

The hope of such multivariate analyses is that the consideration of possible dependencies between the outcomes may lead to procedures with better power (in case of inference) or accuracy (in case of prediction) compared to separate univariate analyses. While the need for the development and use of valid and distributionally robust or nonparametric multivariate methods has been recognized and addressed in inferential statistics (Dobler et al., 2020; Friedrich et al., 2019; Konietschke et al., 2015; Smaga, 2017; Vallejo and Ato, 2012; Zimmermann et al., 2020), there are no exhaustive studies that exploit the potential of multivariate regression methods for prediction. Focusing on tree-based ensemble methods such as the Random Forest, this manuscript aims to close this gap. In particular, we want to answer our research-motivating question: When should a holistic multivariate regression approach be preferred over separate univariate predictions? Corresponding author: Lena Schmid (lena.schmid@tu-dortmund.de)

artificial intelligence, machine learning, random forest, (15 more...)

arXiv.org Machine Learning

2201.0534

Country:

Europe > Germany > North Rhine-Westphalia (0.14)
Europe > Austria > Vienna (0.14)

Genre: Research Report > New Finding (0.68)

Technology: Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning > Regression (1.00)

Deducing neighborhoods of classes from a fitted model

Gerharz, Alexander, Groll, Andreas, Schauberger, Gunther

arXiv.org Machine Learning, Sep-17-2020

In today's world, the demand for very complex models for huge data sets is rising steadily. The problem is that as the complexity of these models increases, they become much harder to interpret. The growing field of interpretable machine learning tries to make up for the lack of interpretability in these complex (or even black-box) models by using specific techniques that help to understand them better. In this article, a new kind of interpretable machine learning method is presented, which can help to understand the partitioning of the feature space into predicted classes in a classification model using quantile shifts. To illustrate in which situations this quantile shift method (QSM) could become beneficial, it is applied to a theoretical medical example and a real data example. Basically, real data points (or specific points of interest) are used and the changes of the prediction after slightly raising or decreasing specific features are observed. By comparing the predictions before and after the manipulations, under certain conditions the observed changes in the predictions can be interpreted as neighborhoods of the classes with regard to the manipulated features. Chordgraphs are used to visualize the observed changes.
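The idea of shifting a feature and watching the predicted class can be sketched as follows. This is one possible reading of the abstract, not the authors' algorithm: a feature value is moved up or down by a fixed step on the empirical quantile scale of that feature, and the prediction is recomputed. The function names, the rank-based shift, and the toy threshold classifier are all hypothetical.

```python
from bisect import bisect_right


def shift_by_quantile(value, sorted_sample, shift):
    """Move `value` by `shift` (e.g. 0.2 = 20 percentage points) on the empirical
    quantile scale of `sorted_sample` and return the sample value found there.

    Assumes `value` lies within the range of the sample.
    """
    n = len(sorted_sample)
    rank = bisect_right(sorted_sample, value)      # number of sample values <= value
    new_rank = rank + round(shift * n)             # shift expressed in rank steps
    new_rank = min(n, max(1, new_rank))            # stay inside the observed range
    return sorted_sample[new_rank - 1]


def quantile_shift_predictions(predict, point, feature, sample_values, shift=0.1):
    """Return the predicted class for the point and for its lowered and raised copies."""
    ordered = sorted(sample_values)
    results = {"original": predict(point)}
    for label, signed_shift in (("lowered", -shift), ("raised", shift)):
        moved = dict(point)
        moved[feature] = shift_by_quantile(point[feature], ordered, signed_shift)
        results[label] = predict(moved)
    return results


def toy_classifier(p):
    """Hypothetical classifier that thresholds a single feature."""
    return "high" if p["marker"] > 6 else "low"


observed = list(range(1, 11))  # hypothetical observed values of the feature
print(quantile_shift_predictions(toy_classifier, {"marker": 5}, "marker", observed, shift=0.2))
```

In this toy case, raising the feature flips the prediction from "low" to "high" while lowering it does not. Under the conditions the abstract alludes to, a flip of this kind would be read as "high" being a neighboring class of "low" with respect to that feature.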

artificial intelligence, health & medicine, manipulation, (18 more...)

arXiv.org Machine Learning

2009.05516

Country: Europe > Austria > Vienna (0.14)

Genre: Research Report (0.50)

Industry: Health & Medicine (1.00)

Technology: Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (0.46)

Random boosting and random^2 forests -- A random tree depth injection approach

Krabel, Tobias Markus, Tran, Thi Ngoc Tien, Groll, Andreas, Horn, Daniel, Jentsch, Carsten

arXiv.org Machine Learning, Sep-13-2020

The induction of additional randomness in parallel and sequential ensemble methods has proven to be worthwhile in many aspects. In this manuscript, we propose and examine a novel random tree depth injection approach suitable for sequential and parallel tree-based approaches including Boosting and Random Forests. The resulting methods are called Random Boost and Random^2 Forest. Both approaches serve as valuable extensions to the existing literature on the gradient boosting framework and random forests. A Monte Carlo simulation, in which tree-shaped data sets with different numbers of final partitions are built, suggests that there are several scenarios where Random Boost and Random^2 Forest can improve the prediction performance of conventional hierarchical boosting and random forest approaches. The new algorithms appear to be especially successful in cases where there are merely a few high-order interactions in the generated data. In addition, our simulations suggest that our random tree depth injection approach can improve computation time by up to 40%, while at the same time the performance losses in terms of prediction accuracy turn out to be minor or even negligible in most cases.

algorithm, artificial intelligence, decision tree learning, (19 more...)

arXiv.org Machine Learning

2009.06078

Country:

Europe > Germany (0.28)
Europe > Austria > Vienna (0.14)

Genre: Research Report (1.00)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Ensemble Learning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Decision Tree Learning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning > Regression (0.68)
Information Technology > Artificial Intelligence > Machine Learning > Performance Analysis > Accuracy (0.46)

Hybrid Machine Learning Forecasts for the FIFA Women's World Cup 2019

Groll, Andreas, Ley, Christophe, Schauberger, Gunther, Van Eetvelde, Hans, Zeileis, Achim

arXiv.org Machine Learning, Jun-3-2019

In this work, we combine two different ranking methods together with several other predictors in a joint random forest approach for the scores of soccer matches. The first ranking method is based on the bookmaker consensus; the second estimates ability parameters that best reflect the current strength of the teams. The proposed combined approach is then applied to the data from the two previous FIFA Women's World Cups 2011 and 2015. Finally, based on the resulting estimates, the FIFA Women's World Cup 2019 is simulated repeatedly and winning probabilities are obtained for all teams. The model clearly favors the defending champion USA ahead of the host France.

decision tree learning, probability, soccer, (18 more...)

arXiv.org Machine Learning

1906.01131

Country:

Asia (0.94)
South America (0.94)
North America > United States (0.67)
(2 more...)

Genre: Research Report (0.64)

Industry: Leisure & Entertainment > Sports > Soccer (1.00)

Technology: Information Technology > Artificial Intelligence > Machine Learning > Decision Tree Learning (0.73)
