The EU AI Act and the Wager on Trustworthy AI

Communications of the ACM

Artificial intelligence (AI) systems are increasingly supplementing or taking over tasks previously performed by humans. On the one hand, this involves low-risk tasks, such as recommending books or movies, or recommending purchases based on previous buying behavior. On the other hand, it also includes crucial decision making by highly autonomous systems. Many current systems are opaque in the sense that their internal principles of operation are unknown, leading to severe safety and regulation problems. Once trained, deep-learning systems perform well, but they are subject to surprising vulnerabilities when confronted with adversarial images [9]. The decisions may be explicated after the fact, but these systems carry the risk of wrong decisions affecting the well-being of people.


Sample size planning for conditional counterfactual mean estimation with a K-armed randomized experiment

arXiv.org Machine Learning

We cover how to determine a sufficiently large sample size for a $K$-armed randomized experiment in order to estimate conditional counterfactual expectations in data-driven subgroups. The subgroups can be output by any feature space partitioning algorithm, for example by binning users with similar predictive scores or by a learned policy tree. After carefully specifying the inference target, a minimum confidence level, and a maximum margin of error, the key is to turn the original goal into a simultaneous inference problem where the recommended sample size to offset an increased possibility of estimation error is directly related to the number of inferences to be conducted. Given a fixed sample size budget, our result allows us to invert the question to one about the feasible number of treatment arms or partition complexity (e.g. number of decision tree leaves). Using policy trees to learn subgroups, we evaluate our nominal guarantees on a large publicly available randomized experiment test data set.
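To illustrate how the number of simultaneous inferences can drive the required sample size, here is a minimal sketch. It is not the paper's formula: it assumes a Bonferroni correction, a normal approximation, and a known outcome standard deviation `sigma` shared by every arm and subgroup. The function name and parameters are hypothetical.

```python
from math import ceil
from statistics import NormalDist


def per_cell_sample_size(num_arms, num_subgroups, sigma, margin, alpha=0.05):
    """Units needed in each (arm, subgroup) cell under the stated assumptions.

    Assumption: one mean is estimated per cell, so the number of
    simultaneous inferences is num_arms * num_subgroups, and the error
    budget alpha is split evenly across them (Bonferroni).
    """
    if num_arms < 1 or num_subgroups < 1:
        raise ValueError("need at least one arm and one subgroup")
    if sigma <= 0 or margin <= 0 or not 0 < alpha < 1:
        raise ValueError("sigma and margin must be positive, alpha in (0, 1)")

    num_inferences = num_arms * num_subgroups
    # Two-sided critical value at the Bonferroni-adjusted level.
    z = NormalDist().inv_cdf(1 - alpha / (2 * num_inferences))
    # Smallest n with z * sigma / sqrt(n) <= margin.
    return ceil((z * sigma / margin) ** 2)
```

Under these assumptions, a fixed total budget can be divided by the per-cell requirement times the number of cells to check whether a given number of arms or tree leaves is feasible.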


What Makes Forest-Based Heterogeneous Treatment Effect Estimators Work?

arXiv.org Machine Learning

Estimation of heterogeneous treatment effects (HTE) is of prime importance in many disciplines, ranging from personalized medicine to economics. Random forests have been shown to be a flexible and powerful approach to HTE estimation in both randomized trials and observational studies. In particular, "causal forests", introduced by Athey, Tibshirani and Wager (2019), along with the R implementation in package grf, were rapidly adopted. A related approach, called "model-based forests", that is geared towards randomized trials and simultaneously captures effects of both prognostic and predictive variables, was introduced by Seibold, Zeileis and Hothorn (2018) along with a modular implementation in the R package model4you. Here, we present a unifying view that goes beyond the theoretical motivations and investigates which computational elements make causal forests so successful and how these can be blended with the strengths of model-based forests. To do so, we show that both methods can be understood in terms of the same parameters and model assumptions for an additive model under L2 loss. This theoretical insight allows us to implement several flavors of "model-based causal forests" and dissect their different elements in silico. The original causal forests and model-based forests are compared with the new blended versions in a benchmark study exploring both randomized trials and observational settings. In the randomized setting, both approaches performed similarly. If confounding was present in the data generating process, we found local centering of the treatment indicator with the corresponding propensities to be the main driver for good performance. Local centering of the outcome was less important, and might be replaced or enhanced by simultaneous split selection with respect to both prognostic and predictive effects.


Personalized Assignment to One of Many Treatment Arms via Regularized and Clustered Joint Assignment Forests

arXiv.org Machine Learning

We consider learning personalized assignments to one of many treatment arms from a randomized controlled trial. Standard methods that estimate heterogeneous treatment effects separately for each arm may perform poorly in this case due to excess variance. We instead propose methods that pool information across treatment arms: First, we consider a regularized forest-based assignment algorithm based on greedy recursive partitioning that shrinks effect estimates across arms. Second, we augment our algorithm by a clustering scheme that combines treatment arms with consistently similar outcomes. In a simulation study, we compare the performance of these approaches to predicting arm-wise outcomes separately, and document gains of directly optimizing the treatment assignment with regularization and clustering. In a theoretical model, we illustrate how a high number of treatment arms makes finding the best arm hard, while we can achieve sizable utility gains from personalization by regularized optimization.


Accelerating Generalized Random Forests with Fixed-Point Trees

arXiv.org Artificial Intelligence

Generalized random forests (arXiv:1610.01271) build upon the well-established success of conventional forests (Breiman, 2001) to offer a flexible and powerful non-parametric method for estimating local solutions of heterogeneous estimating equations. Estimators are constructed by leveraging random forests as an adaptive kernel weighting algorithm and implemented through a gradient-based tree-growing procedure. By expressing this gradient-based approximation as being induced from a single Newton-Raphson root-finding iteration, and drawing upon the connection between estimating equations and fixed-point problems (arXiv:2110.11074), we propose a new tree-growing rule for generalized random forests induced from a fixed-point iteration type of approximation, enabling gradient-free optimization, and yielding substantial time savings for tasks involving even modest dimensionality of the target quantity (e.g. multiple/multi-level treatment effects). We develop an asymptotic theory for estimators obtained from forests whose trees are grown through the fixed-point splitting rule, and provide numerical simulations demonstrating that the estimators obtained from such forests are comparable to those obtained from the more costly gradient-based rule.


Exploring Potential Longevity Applications of Rapamycin With ChatGPT

#artificialintelligence

In 2020 I joined the private beta test of OpenAI's Generative Pre-trained Transformer 3 (GPT-3), a predecessor of the model behind ChatGPT. When ChatGPT was released in November 2022, I started experimenting with it. Large language models like ChatGPT are expected to enable a new wave of research, creativity and productivity, because they can help generate solutions for complex problems. For over two years I've been exploring the strengths and limits of this technology and assessing how this tool could be useful to me. I'm also interested in how this new technology is being utilized by scientists to make meaningful contributions to academic work.


How Artificial Intelligence Can Help the Online Sports Betting Industry Succeed

#artificialintelligence

Artificial intelligence (AI) is beginning to have a significant impact on the global online sports betting industry. With AI, sportsbooks are able to provide more precise odds and forecasts, which can help bettors place more profitable wagers. AI can also assist in identifying potential problem gamblers and preventing them from accruing excessive debt. By using AI to track betting trends and monitor betting patterns, sportsbooks can identify warning indicators and provide support to individuals who require it. Overall, AI is making online sports betting fairer and more enjoyable for everyone.


How artificial intelligence drives new experiences in esports betting

#artificialintelligence

Artificial intelligence is no longer science fiction – it is being used everywhere you look, from e-commerce to architecture. Gambling involves a lot of luck but also preparation. Bookmakers now benefit from real-time statistics about esports players, teams and events that inform betting odds and provide context to bettors. AI can process enormous amounts of data very quickly and make predictions accordingly. PandaScore's AI platform, for example, collects 300 data points in League of Legends in half a second.


Five Leading Sports Analytics Software Programs

#artificialintelligence

Sports analytics is one of the biggest booming sectors in the sports world. Though sports have captured the attention of the public and investors alike, analytics is a behind-the-scenes industry that combines the latest in machine-learning algorithms and data crunching. Some programs, like those that rely on AI, are designed to make predictions by studying huge amounts of historical data. Others, such as analytics software, are designed to draw immediate conclusions from live data points. Teams rely on sports analytics to make leaner decisions related to recruitment, training regimens, and more.


Accumulator Bet Selection Through Stochastic Diffusion Search

arXiv.org Artificial Intelligence

The global sports betting market is worth an estimated $700 billion annually (Flepp et al., 2017), and association football (also known as soccer or simply football), being the world's most popular spectator sport, constitutes around 70% of this ever-growing market (Constantinou et al., 2012). The last decade has thus seen the emergence of numerous online and offline bookmakers, offering bettors the possibility to place wagers on the results of football matches in more than a hundred different leagues worldwide. The sports betting industry offers a unique and very popular betting product known as an accumulator bet. In contrast with a single bet, which consists of betting on a single event for a payout equal to the stake (i.e. the sum wagered) multiplied by the odds set by the bookmaker for that event, an accumulator bet combines more than one (and generally fewer than seven) events into a single wager that pays out only when all individual events are correctly predicted. The payout for a correct accumulator bet is the stake multiplied by the product of the odds of all its constituent wagers. However, if one of these wagers is incorrect, the entire accumulator bet is lost. Thus, this product offers both significantly higher potential payouts and higher risks than single bets, and the large pool of online bookmakers, leagues, and matches that bettors can access nowadays has increased both the complexity of selecting a set of matches to place an accumulator bet on, and the number of opportunities to identify winning combinations. With the rise of sports analytics, a wide variety of statistical models for predicting the outcomes of football matches have been proposed, a good review of which can be found in Langseth (2013).
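The payout rule described above can be sketched in a few lines. This is an illustration only, assuming decimal odds (so the payout includes the returned stake); the function names are hypothetical and not taken from the paper.

```python
from math import prod


def single_payout(stake, odds, won):
    """Payout of a single bet: stake times odds if the event is predicted correctly."""
    return stake * odds if won else 0.0


def accumulator_payout(stake, odds, outcomes):
    """Payout of an accumulator bet.

    odds: decimal odds of each constituent wager (assumed convention).
    outcomes: True where the corresponding wager was predicted correctly.
    The bet pays out only if every wager is correct.
    """
    if len(odds) != len(outcomes):
        raise ValueError("odds and outcomes must have the same length")
    if len(odds) < 2:
        raise ValueError("an accumulator combines more than one event")
    if not all(outcomes):
        return 0.0  # one wrong prediction loses the whole bet
    return stake * prod(odds)
```

For example, a stake of 10 on three events at odds 2.0, 1.5 and 3.0 pays 10 × 2.0 × 1.5 × 3.0 = 90 when all three predictions are correct, and nothing if any one of them fails.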