

Wall Street Is Already Betting on Prediction Markets

WIRED

As the legal war over how to regulate prediction markets rages on, financial institutions are embracing the industry anyway. When Troy Dixon first suggested incorporating prediction markets into the electronic trading platform where he works, he was met with incredulity. "People told us we were crazy," Dixon, Tradeweb's cohead of global markets, tells WIRED. But after the company announced it was partnering with Kalshi in February, Dixon says, the mood changed dramatically. "We've been inundated with calls," he says.


The War Over Prediction Markets Is Just Getting Started

WIRED

Prediction markets like Kalshi and Polymarket are booming, and so is a fight among regulators, lawmakers, and advocates over their legality. Former New Jersey governor Chris Christie, who currently serves as an advisor to the American Gaming Association, has criticized prediction markets. The political fight in the US over the future of prediction markets like Polymarket and Kalshi has escalated into a full-blown war, and battle lines aren't being neatly drawn along party lines. Instead, conservative Mormons have aligned themselves with Las Vegas bigwigs and MAGA royalty is siding with liberal Democrat lobbyists. One side argues that the platforms are breaking the law by operating as shadow casinos.


Senators Urge Top Regulator to Stay Out of Prediction Market Lawsuits

WIRED

As prediction market platforms like Polymarket and Kalshi battle regulators in court, Senate Democrats are urging the CFTC to avoid weighing in, escalating a broader fight over the burgeoning industry. Senator Adam Schiff, a Democrat from California, is leading the group of lawmakers urging the CFTC to stay out of state prediction market lawsuits. A group of 23 Democratic US senators sent a letter Friday to the top federal regulator overseeing prediction markets, urging the agency to avoid weighing in on pending court cases over the legality of offerings on the platforms tied to "sports, war, and other prohibited events." Prediction markets, which sell contracts tied to the outcome of real-world developments, have exploded in popularity over the past year, attracting an increasingly mainstream fanbase eager to wager on everything from geopolitical conflicts to fashion choices to the Super Bowl. As they expanded, the platforms have become a magnet for ethical and legal controversies.


Going All-In on LLM Accuracy: Fake Prediction Markets, Real Confidence Signals

Todasco, Michael

arXiv.org Artificial Intelligence

Large language models are increasingly used to evaluate other models, yet these judgments typically lack any representation of confidence. This pilot study tests whether framing an evaluation task as a betting game (a fictional prediction market with its own LLM currency) improves forecasting accuracy and surfaces calibrated confidence signals. We generated 100 math and logic questions with verifiable answers. Six Baseline models (three current-generation, three prior-generation) answered all items. Three Predictor models then forecasted, for each question-baseline pair, if the baseline would answer correctly. Each predictor completed matched runs in two conditions: Control (simple correct/incorrect predictions) and Incentive (predictions plus wagers of 1-100,000 LLMCoin under even odds, starting from a 1,000,000 LLMCoin bankroll). Across 5,400 predictions per condition, Incentive runs showed modestly higher accuracy (81.5% vs. 79.1%, p = .089, d = 0.86) and significantly faster learning across rounds (12.0 vs. 2.9 percentage-point improvement from Round 1 to Round 4, p = .011). Most notably, stake size tracked confidence. "Whale" bets of 40,000+ coins were correct ~99% of the time, while small bets (<1,000 coins) showed only ~74% accuracy. The key finding is not that fictional money makes models smarter; accuracy gains were modest and did not reach statistical significance (p = .089) in this pilot. Rather, the betting mechanic created a legible confidence signal absent from binary yes/no outputs. This suggests that simple financial framing may help transform LLMs into risk-aware forecasters, making their internal beliefs visible and usable. The protocol offers a foundation for future work for meta-evaluation systems and what may become LLM-to-LLM prediction markets.
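The stake-bucket analysis described above can be sketched as follows. This is an illustrative reconstruction, not the paper's code: the bucket thresholds follow the abstract's 1,000- and 40,000-coin cut-offs, and the sample bets are hypothetical.

```python
# Sketch of reading wager size as a confidence signal (hypothetical data;
# thresholds taken from the abstract: <1,000 = small, >=40,000 = "whale").

def bucket(stake: int) -> str:
    """Map a wager size in LLMCoin to a confidence bucket."""
    if stake >= 40_000:
        return "whale"
    if stake < 1_000:
        return "small"
    return "mid"

def accuracy_by_bucket(bets):
    """bets: iterable of (stake, was_correct) pairs -> accuracy per bucket."""
    totals, hits = {}, {}
    for stake, correct in bets:
        b = bucket(stake)
        totals[b] = totals.get(b, 0) + 1
        hits[b] = hits.get(b, 0) + int(correct)
    return {b: hits[b] / totals[b] for b in totals}

bets = [(50_000, True), (45_000, True), (500, False), (800, True), (5_000, True)]
print(accuracy_by_bucket(bets))
```

Grouping predictions this way is what makes the confidence signal "legible": a binary yes/no output carries no such gradient, while stake size does.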


Semantic Trading: Agentic AI for Clustering and Relationship Discovery in Prediction Markets

Capponi, Agostino, Gliozzo, Alfio, Zhu, Brian

arXiv.org Artificial Intelligence

Prediction markets allow users to trade on outcomes of real-world events, but are prone to fragmentation, with overlapping questions, implicit equivalences, and hidden contradictions across markets. We present an agentic AI pipeline that autonomously (i) clusters markets into coherent topical groups using natural-language understanding over contract text and metadata, and (ii) identifies within-cluster market pairs whose resolved outcomes exhibit strong dependence, including "same-outcome" (correlated) and "different-outcome" (anti-correlated) relationships. Using a historical dataset of resolved markets on Polymarket, we evaluate the accuracy of the agent's relational predictions. We then synthesize the discovered relationships into a simple trading strategy to quantify how they translate into actionable trades. Results show that agent-identified relationships reach around 60-70% accuracy, and the trading strategies they induce yield an average return of 20% over week-long horizons, highlighting the ability of agentic AI and large language models to uncover latent semantic structure within prediction markets.
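A "same-outcome" relationship translates into a trade roughly as follows. This is a minimal sketch under assumed mechanics, not the paper's strategy: if two markets should resolve identically but their YES contracts are priced differently, buy YES in the cheaper market and NO in the richer one; the threshold is an illustrative parameter.

```python
# Illustrative pair trade for an agent-identified "same-outcome" market pair.
# price_a, price_b: current YES prices (probabilities in [0, 1]).

def pair_trade(price_a: float, price_b: float, threshold: float = 0.05):
    """Return a trade signal for a same-outcome pair, or None if prices agree."""
    gap = price_a - price_b
    if abs(gap) < threshold:
        return None  # prices roughly consistent; no edge
    # Buy YES where it is cheap, buy NO where YES is expensive.
    cheap, rich = ("B", "A") if gap > 0 else ("A", "B")
    return {"buy_yes": cheap, "buy_no": rich, "edge": abs(gap)}

print(pair_trade(0.62, 0.48))  # YES in B, NO in A
```

For a "different-outcome" (anti-correlated) pair, the same logic applies after replacing one market's YES price with 1 minus that price.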


Outcome-based Reinforcement Learning to Predict the Future

Turtel, Benjamin, Franklin, Danny, Skotheim, Kris, Hewitt, Luke, Schoenegger, Philipp

arXiv.org Artificial Intelligence

Reinforcement Learning with Verifiable Rewards (RLVR) has been an effective approach for improving Large Language Models' reasoning in domains such as coding and mathematics. Here, we apply RLVR methods to forecasting future real-world events, a challenging task for RL due to the very noisy (and delayed) outcomes involved. Using a novel dataset of recent questions from a prediction market, along with accompanying relevant news headlines, we show that a compact (14B) reasoning model can be trained to match or surpass the predictive accuracy of frontier models like o1, while greatly improving probabilistic calibration. The model's performance is also practically meaningful: in a Polymarket trading simulation, we estimate that its bets would have yielded a return on investment of over 10% across all questions in the test set. We detail and compare the approaches used in training our model, including augmenting our training data with synthetic prediction questions, guardrails for learning stability, and median prediction sampling at inference time.
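Median prediction sampling, the inference-time technique mentioned above, can be sketched in a few lines. This is a generic illustration, not the paper's implementation: the list of samples stands in for repeated probability forecasts from the same model on the same question, and the median is used because it is robust to occasional outlier samples.

```python
# Median prediction sampling: aggregate several probability forecasts for
# one question with the median rather than a single draw or the mean.

import statistics

def median_forecast(samples):
    """samples: repeated probability forecasts in [0, 1] for one question."""
    return statistics.median(samples)

# One outlier (0.91) barely moves the aggregate.
print(median_forecast([0.62, 0.58, 0.91, 0.60, 0.57]))  # → 0.6
```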



Bounded-Loss Private Prediction Markets

Rafael Frongillo, Bo Waggoner

Neural Information Processing Systems

Prior work has investigated variations of prediction markets that preserve participants' (differential) privacy, which formed the basis of useful mechanisms for purchasing data for machine learning objectives. Such markets required potentially unlimited financial subsidy, however, making them impractical.
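For background on the "bounded loss" in the title: classic cost-function market makers such as the Logarithmic Market Scoring Rule (LMSR) already cap the sponsor's worst-case subsidy at b·ln(n) for n outcomes; the paper's contribution is retaining a bound of this kind while preserving differential privacy. A sketch of the standard (non-private) LMSR bound, with an illustrative liquidity parameter b:

```python
# Standard LMSR cost function and its worst-case subsidy bound, b * ln(n).
# This illustrates the non-private baseline, not the paper's private mechanism.

import math

def lmsr_cost(quantities, b=100.0):
    """LMSR cost function C(q) = b * log(sum_i exp(q_i / b))."""
    return b * math.log(sum(math.exp(q / b) for q in quantities))

def worst_case_loss(n_outcomes, b=100.0):
    """Maximum subsidy the market maker can lose: b * ln(n)."""
    return b * math.log(n_outcomes)

print(round(worst_case_loss(2), 2))  # → 69.31
```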



AIA Forecaster: Technical Report

Alur, Rohan, Stadie, Bradly C., Kang, Daniel, Chen, Ryan, McManus, Matt, Rickert, Michael, Lee, Tyler, Federici, Michael, Zhu, Richard, Fogerty, Dennis, Williamson, Hayley, Lozinski, Nina, Linsky, Aaron, Sekhon, Jasjeet S.

arXiv.org Artificial Intelligence

This technical report describes the AIA Forecaster, a Large Language Model (LLM)-based system for judgmental forecasting using unstructured data. The AIA Forecaster approach combines three core elements: agentic search over high-quality news sources, a supervisor agent that reconciles disparate forecasts for the same event, and a set of statistical calibration techniques to counter behavioral biases in large language models. On the ForecastBench benchmark (Karger et al., 2024), the AIA Forecaster achieves performance equal to human superforecasters, surpassing prior LLM baselines. In addition to reporting on ForecastBench, we also introduce a more challenging forecasting benchmark sourced from liquid prediction markets. While the AIA Forecaster underperforms market consensus on this benchmark, an ensemble combining AIA Forecaster with market consensus outperforms consensus alone, demonstrating that our forecaster provides additive information. Our work establishes a new state of the art in AI forecasting and provides practical, transferable recommendations for future research. To the best of our knowledge, this is the first work that verifiably achieves expert-level forecasting at scale.
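The ensembling result above can be illustrated with the simplest possible combiner. This is a hedged sketch: the report does not specify its ensemble scheme, so a plain weighted average of the model's probability and the market consensus stands in, with an illustrative weight.

```python
# Illustrative ensemble of a model forecast with market consensus.
# The equal weighting is an assumption, not the report's actual scheme.

def ensemble(model_p: float, market_p: float, w: float = 0.5) -> float:
    """Blend a model probability with the market consensus probability."""
    return w * model_p + (1 - w) * market_p

print(round(ensemble(0.70, 0.60), 2))  # → 0.65
```

The finding that such a blend beats consensus alone is the interesting part: it means the forecaster carries information the market has not already priced in.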