match outcome
Intransitive Player Dominance and Market Inefficiency in Tennis Forecasting: A Graph Neural Network Approach
Clegg, Lawrence, Cartlidge, John
Considerable effort has also been devoted to developing highly accurate models for forecasting match outcomes (Wunderlich and Memmert, 2021). Tennis is a sport well-suited to predictive modelling, with dense tournament schedules generating extensive historical data. The official ranking systems of the Association of Tennis Professionals (ATP) and Women's Tennis Association (WTA) have been shown to exhibit some predictive power for match outcomes (Clarke and Dyte, 2000; Klaassen and Magnus, 2003), but there are notable limitations: for example, ranking points accumulate over a 52-week period, without decay, which can mask recent changes in player form; while match-specific factors, such as surface type, tournament progression difficulty, and margin of victory in individual matches, are overlooked. Some well-known methods have been applied to tennis and modified to accommodate these factors, such as a Bradley-Terry model with surface-specific adjustments (McHale and Morton, 2011) or Elo rating systems that incorporate margin of victory (Kovalchik, 2020; Angelini et al., 2022). Bookmakers are considered the most accurate predictors of match outcomes (Kovalchik, 2016), with sophisticated models that adjust odds based on betting patterns and proprietary methods. Yet, despite the multi-billion dollar betting industry, one limitation that persists is the poor consideration of intransitivity (van Ours, 2025). Intransitivity is analogous to rock-paper-scissors. In tennis, it occurs where player A tends to defeat B, B defeats C, yet C defeats A, violating the assumption of transitive dominance.
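The rock-paper-scissors pattern described above can be illustrated with a small check over head-to-head records. This is only a sketch: the `h2h` dictionary, the rule that a player "tends to defeat" an opponent when holding more wins than losses, and the function names are assumptions made for illustration, not the method used in the paper.

```python
from itertools import permutations


def dominates(h2h, a, b):
    """Return True if a has more head-to-head wins over b than b has over a."""
    return h2h.get((a, b), 0) > h2h.get((b, a), 0)


def intransitive_triples(h2h):
    """List every cycle A > B > C > A found in the head-to-head records.

    h2h maps (winner, loser) to a win count. Each cycle is reported once,
    rotated so that it starts with the alphabetically smallest player.
    """
    players = sorted({p for pair in h2h for p in pair})
    cycles = set()
    for a, b, c in permutations(players, 3):
        if dominates(h2h, a, b) and dominates(h2h, b, c) and dominates(h2h, c, a):
            triple = (a, b, c)
            start = triple.index(min(triple))
            cycles.add(triple[start:] + triple[:start])
    return sorted(cycles)


# Hypothetical records: A leads B 3-1, B leads C 4-2, C leads A 5-1.
h2h = {
    ("A", "B"): 3, ("B", "A"): 1,
    ("B", "C"): 4, ("C", "B"): 2,
    ("C", "A"): 5, ("A", "C"): 1,
}
print(intransitive_triples(h2h))  # [('A', 'B', 'C')]
```

A single scalar rating per player cannot represent such a cycle, because any ordering of three ratings implies a transitive ranking.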
Player-Team Heterogeneous Interaction Graph Transformer for Soccer Outcome Prediction
Wang, Lintao, Xu, Shiwen, Horton, Michael, Gudmundsson, Joachim, Wang, Zhiyong
Predicting soccer match outcomes is a challenging task due to the inherently unpredictable nature of the game and the numerous dynamic factors influencing results. While it conventionally relies on meticulous feature engineering, deep learning techniques have recently shown great promise in learning effective player and team representations directly for soccer outcome prediction. However, existing methods often overlook the heterogeneous nature of interactions among players and teams, which is crucial for accurately modeling match dynamics. To address this gap, we propose HIGFormer (Heterogeneous Interaction Graph Transformer), a novel graph-augmented transformer-based deep learning model for soccer outcome prediction. HIGFormer introduces a multi-level interaction framework that captures both fine-grained player dynamics and high-level team interactions. Specifically, it comprises (1) a Player Interaction Network, which encodes player performance through heterogeneous interaction graphs, combining local graph convolutions with a global graph-augmented transformer; (2) a Team Interaction Network, which constructs interaction graphs from a team-to-team perspective to model historical match relationships; and (3) a Match Comparison Transformer, which jointly analyzes both team and player-level information to predict match outcomes. Extensive experiments on the WyScout Open Access Dataset, a large-scale real-world soccer dataset, demonstrate that HIGFormer significantly outperforms existing methods in prediction accuracy. Furthermore, we provide valuable insights into leveraging our model for player performance evaluation, offering a new perspective on talent scouting and team strategy analysis.
HydraNet: Momentum-Driven State Space Duality for Multi-Granularity Tennis Tournaments Analysis
Li, Ruijie, Zhao, Xiang, Ning, Qiao, Guo, Shikai
In tennis tournaments, momentum, a critical yet elusive phenomenon, reflects the dynamic shifts in athletes' performance that can decisively influence match outcomes. Despite its significance, the effective modeling and multi-granularity analysis of momentum across points, games, sets, and matches in tennis tournaments remain underexplored. In this study, we define a novel Momentum Score (MS) metric to quantify a player's momentum level in multi-granularity tennis tournaments, and design HydraNet, a momentum-driven state-space duality-based framework, to model MS by integrating thirty-two heterogeneous dimensions of athletes' performance in serve, return, psychology and fatigue. HydraNet integrates a Hydra module, which builds upon a state-space duality (SSD) framework, capturing explicit momentum with a sliding-window mechanism and implicit momentum through cross-game state propagation. It also introduces a novel Versus Learning method to better model the adversarial nature of momentum between the two athletes at a macro level, along with a Collaborative-Adversarial Attention Mechanism (CAAM) for capturing and integrating intra-player and inter-player dynamic momentum at a micro level. Additionally, we construct a million-level tennis cross-tournament dataset spanning the 2012-2023 Wimbledon and 2013-2023 US Open tournaments, and validate the multi-granularity modeling capability of HydraNet for the MS metric on this dataset. Extensive experimental evaluations demonstrate that the MS metric constructed by the HydraNet framework provides actionable insights into how momentum impacts outcomes at different granularities, establishing a new foundation for momentum modeling and sports analysis. To the best of our knowledge, this is the first work to explore and effectively model momentum across multiple granularities in professional tennis tournaments.
Enhancing Predictive Accuracy in Tennis: Integrating Fuzzy Logic and CV-GRNN for Dynamic Match Outcome and Player Momentum Analysis
Li, Kechen, Liu, Jiaming, Wu, Zhenyu, Ji, Tianbo
The predictive analysis of match outcomes and player momentum in professional tennis has long been a subject of scholarly debate. In this paper, we introduce a novel approach to game prediction by combining a multi-level fuzzy evaluation model with a CV-GRNN model. We first identify critical statistical indicators via Principal Component Analysis and then develop a two-tier fuzzy model based on the Wimbledon data. In addition, Pearson correlation coefficients indicate that the momentum indicators, such as Player Win Streak and Score Difference, are strongly correlated with one another, revealing insightful trends among players transitioning between losing and winning streaks. Subsequently, we refine the CV-GRNN model by incorporating 15 statistically significant indicators, resulting in an increase in accuracy to 86.64% and a decrease in MSE by 49.21%. This strengthens the methodological framework for predicting tennis match outcomes, emphasizing its practical utility and potential for adaptation in various athletic contexts.
March Madness Tournament Predictions Model: A Mathematical Modeling Approach
McIver, Christian, Avalos, Karla, Nayak, Nikhil
This paper proposes a model to predict the outcome of the March Madness tournament based on historical NCAA basketball data since 2013. The framework of this project is a simplification of the FiveThirtyEight NCAA March Madness prediction model, in which the only four predictors of interest are Adjusted Offensive Efficiency (ADJOE), Adjusted Defensive Efficiency (ADJDE), Power Rating, and Two-Point Shooting Percentage Allowed. A logistic regression on these metrics was used to generate the probability of a particular team winning each game. A tournament simulation was then developed and compared to real-world March Madness brackets to determine the accuracy of the model. Model accuracy was calculated using a naive approach and a Spearman rank correlation coefficient.
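The pipeline described above, a logistic regression that yields per-game win probabilities followed by a bracket simulation, might be sketched as follows. Everything specific here is an assumption for illustration: the coefficient values in `COEFFS` are placeholders rather than fitted values from the paper, the model is applied to feature differences between the two teams, and the field size is assumed to be a power of two.

```python
import math
import random

# Hypothetical coefficients (placeholders, not fitted values from the paper).
COEFFS = {"adjoe": 0.10, "adjde": -0.10, "power": 3.0, "two_pt_allowed": -0.05}


def win_probability(team, opponent, coeffs=COEFFS):
    """Logistic model on feature differences: P(team beats opponent)."""
    z = sum(w * (team[k] - opponent[k]) for k, w in coeffs.items())
    return 1.0 / (1.0 + math.exp(-z))


def simulate_bracket(teams, rng):
    """Play one single-elimination bracket and return the champion's name.

    teams is a list of {"name": ..., "stats": {...}} in bracket order;
    its length is assumed to be a power of two.
    """
    field = list(teams)
    while len(field) > 1:
        next_round = []
        for i in range(0, len(field), 2):
            a, b = field[i], field[i + 1]
            p = win_probability(a["stats"], b["stats"])
            next_round.append(a if rng.random() < p else b)
        field = next_round
    return field[0]["name"]


# Hypothetical four-team field with made-up statistics.
teams = [
    {"name": "T1", "stats": {"adjoe": 120.0, "adjde": 92.0, "power": 0.95, "two_pt_allowed": 45.0}},
    {"name": "T2", "stats": {"adjoe": 110.0, "adjde": 98.0, "power": 0.80, "two_pt_allowed": 49.0}},
    {"name": "T3", "stats": {"adjoe": 115.0, "adjde": 95.0, "power": 0.90, "two_pt_allowed": 47.0}},
    {"name": "T4", "stats": {"adjoe": 105.0, "adjde": 101.0, "power": 0.70, "two_pt_allowed": 51.0}},
]
champion = simulate_bracket(teams, random.Random(0))
```

Repeating `simulate_bracket` many times with different seeds gives an estimate of each team's chance of winning the tournament.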
Voice Communication Analysis in Esports
Vinot, Aymeric, Perez, Nicolas
In most team-based esports, voice communication plays a prominent role in team efficiency and synergy. In fact, it has been observed that not only a team's skill but also the effectiveness of its voice communication comes into play in achieving good performance in official matches. With the recent emergence of LLM (Large Language Model) tools for NLP (Natural Language Processing) [18], we decided to apply them to better understand how to improve the effectiveness of voice communication. In this paper, the study is conducted through the prism of League of Legends esports. However, the main concepts and ideas can easily be applied to any other team-based esport.
Skill Issues: An Analysis of CS:GO Skill Rating Systems
Bober-Irizar, Mikel, Dua, Naunidh, McGuinness, Max
The meteoric rise of online games has created a need for accurate skill rating systems for tracking improvement and fair matchmaking. Although many skill rating systems are deployed, with various theoretical foundations, less work has been done on analysing the real-world performance of these algorithms. In this paper, we perform an empirical analysis of Elo, Glicko2 and TrueSkill through the lens of surrogate modelling, where skill ratings influence future matchmaking with a configurable acquisition function. We look both at overall performance and data efficiency, and perform a sensitivity analysis based on a large dataset of Counter-Strike: Global Offensive matches.
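For readers unfamiliar with the simplest of the three systems named above, the standard Elo update can be sketched as follows. The K-factor of 32 and the 400-point scale are conventional defaults assumed here, not settings reported in the paper; Glicko2 and TrueSkill additionally track rating uncertainty and are not shown.

```python
def expected_score(rating_a, rating_b, scale=400.0):
    """Elo expected score of A against B (between 0 and 1)."""
    return 1.0 / (1.0 + 10.0 ** ((rating_b - rating_a) / scale))


def elo_update(rating_a, rating_b, score_a, k=32.0):
    """Return updated (rating_a, rating_b) after one match.

    score_a is 1.0 if A won, 0.0 if A lost, and 0.5 for a draw.
    """
    delta = k * (score_a - expected_score(rating_a, rating_b))
    return rating_a + delta, rating_b - delta


# Two equally rated teams: the winner gains k / 2 points, the loser drops k / 2.
new_a, new_b = elo_update(1500.0, 1500.0, score_a=1.0)
print(new_a, new_b)  # 1516.0 1484.0
```

The update is zero-sum, so the total rating in the pool is conserved, and an upset against a much stronger opponent moves ratings further than an expected win.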
Predicting soccer matches with complex networks and machine learning
Baratela, Eduardo Alves, Xavier, Felipe Jordão, Peron, Thomas, Villas-Boas, Paulino Ribeiro, Rodrigues, Francisco Aparecido
Soccer attracts the attention of many researchers and professionals in the sports industry. As a result, the incorporation of science into the sport is constantly growing, with increasing investments in the performance analysis and sports prediction industries. This study aims to (i) highlight the use of complex networks as an alternative tool for predicting soccer match outcomes, and (ii) show how the combination of structural analysis of passing networks with match statistical data can provide deeper insights into the game patterns and strategies used by teams. To do so, complex network metrics and match statistics were used to build machine learning models that predict the wins and losses of soccer teams in different leagues. The results showed that models based on passing networks were as effective as "traditional" models, which use general match statistics. Another finding was that combining both approaches yielded more accurate models than using either separately, demonstrating that the fusion of such approaches can offer a deeper understanding of game patterns, allowing the comprehension of tactics employed by teams, relationships between players, their positions, and interactions during matches. It is worth mentioning that both network metrics and match statistics were important and impactful for the mixed model. Furthermore, the use of networks with a finer temporal granularity (such as creating a network for each half of the match) performed better than a single network for the entire game.
The Evolution of Football Betting: A Machine Learning Approach to Match Outcome Forecasting and Bookmaker Odds Estimation
This paper explores the significant history of professional football and the betting industry, tracing its evolution from clandestine beginnings to a lucrative multi-million-pound enterprise. Initiated by the legalization of gambling in 1960 and complemented by advancements in football data gathering pioneered by Thorold Charles Reep, the symbiotic relationship between these sectors has propelled rapid growth and innovation. Over the past six decades, both industries have undergone radical transformations, with data collection methods evolving from rudimentary notetaking to sophisticated technologies such as high-definition cameras and Artificial Intelligence (AI)-driven analytics. Against this background, the primary aim of this study is to utilize Machine Learning (ML) algorithms to forecast Premier League football match outcomes. By analyzing historical data and investigating the significance of various features, the study seeks to identify the most effective predictive models and discern key factors influencing match results. Additionally, the study aims to use these forecasts to inform the establishment of bookmaker odds, providing insights into the impact of different variables on match outcomes. By highlighting the potential for informed decision-making in sports forecasting and betting, this study opens up new avenues for research and practical applications in the domain of sports analytics.
Machine Learning for Soccer Match Result Prediction
Bunker, Rory, Yeung, Calvin, Fujii, Keisuke
Machine learning has become a common approach to predicting the outcomes of soccer matches, and the body of literature in this domain has grown substantially in the past decade and a half. This chapter discusses available datasets, the types of models and features, and ways of evaluating model performance in this application domain. The aim of this chapter is to give a broad overview of the current state and potential future developments in machine learning for soccer match results prediction, as a resource for those interested in conducting future studies in the area. Our main findings are that while gradient-boosted tree models such as CatBoost, applied to soccer-specific ratings such as pi-ratings, are currently the best-performing models on datasets containing only goals as the match features, there needs to be a more thorough comparison of the performance of deep learning models and Random Forest on a range of datasets with different types of features. Furthermore, new rating systems using both player- and team-level information and incorporating additional information from, e.g., spatiotemporal tracking and event data, could be investigated further. Finally, the interpretability of match result prediction models needs to be enhanced for them to be more useful for team management.