Hedging



Hedging as Reward Augmentation in Probabilistic Graphical Models

Neural Information Processing Systems

We argue that hedging is an activity that human and machine agents should engage in more broadly, even when the agent's value is not necessarily in monetary units. In this paper, we propose a decision-theoretic view of hedging based on augmenting a probabilistic graphical model -- specifically a Bayesian network or an influence diagram -- with a reward. Hedging is therefore posed as a particular kind of graph manipulation, and can be viewed as analogous to control/intervention and information-gathering analyses. Effective hedging occurs when a risk-averse agent finds an opportunity to balance uncertain rewards in their current situation. We illustrate the concepts with examples and counter-examples, and conduct experiments to demonstrate the properties and applicability of the proposed computational tools, which enable agents to proactively identify potential hedging opportunities in real-world situations.
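
As a toy illustration of the paper's premise (not its actual tooling), the sketch below shows why a risk-averse agent benefits from augmenting its situation with a reward that is anticorrelated with an existing uncertain reward: the expected value is unchanged, but the expected utility under a concave (here, exponential/CARA) utility rises. The joint distribution and payoffs are invented for the example.

```python
# Minimal sketch, assuming a hypothetical two-variable model: W is the
# agent's existing uncertainty, H the hedge's; H tends to pay off exactly
# when W does not (anticorrelation), which is what makes the hedge effective.
import math

p_joint = {  # P(W=w, H=h); illustrative numbers only
    (1, 1): 0.05, (1, 0): 0.45,
    (0, 1): 0.45, (0, 0): 0.05,
}

def reward(w, h, hedged):
    base = 10.0 if w else 0.0              # original uncertain reward
    hedge = (10.0 if h else 0.0) - 5.0     # hedge pays 10, costs 5 (zero mean)
    return base + (hedge if hedged else 0.0)

def expected_utility(hedged, risk_aversion=0.1):
    # Concave CARA utility encodes risk aversion: u(r) = 1 - exp(-a * r).
    return sum(p * (1.0 - math.exp(-risk_aversion * reward(w, h, hedged)))
               for (w, h), p in p_joint.items())

# Same expected reward (5.0) either way, but higher expected utility hedged.
print("EU without hedge:", round(expected_utility(False), 4))  # ~0.3161
print("EU with hedge:   ", round(expected_utility(True), 4))   # ~0.3605
```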


Learning to Hedge Swaptions

Ahmadi, Zaniar, Godin, Frédéric

arXiv.org Artificial Intelligence

This paper investigates the deep hedging framework, based on reinforcement learning (RL), for the dynamic hedging of swaptions, contrasting its performance with traditional sensitivity-based rho-hedging. We design agents under three distinct objective functions (mean squared error, downside risk, and Conditional Value-at-Risk) to capture alternative risk preferences and evaluate how these objectives shape hedging styles. Relying on a three-factor arbitrage-free dynamic Nelson-Siegel model for our simulation experiments, we find that near-optimal hedging effectiveness is achieved when using two swaps as hedging instruments. Deep hedging strategies dynamically adapt the hedging portfolio's exposure to risk factors across market states. In our experiments, their outperformance over rho-hedging strategies persists even in the presence of some model misspecification. These results highlight RL's potential to deliver more efficient and resilient swaption hedging strategies.
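
The three objectives named above determine what the trained agent penalizes. A minimal sketch of plausible formulations follows, applied to terminal hedging errors; the paper's exact conventions (sign of the error, smoothing of CVaR for gradient-based training) are assumptions here.

```python
# Sketch of the three risk objectives, over terminal hedging errors
# e = liability payoff - hedging portfolio value (positive e = shortfall).
import numpy as np

def mse(errors):
    return np.mean(errors ** 2)

def downside_risk(errors):
    # Penalize only shortfalls; gains are not rewarded.
    return np.mean(np.maximum(errors, 0.0) ** 2)

def cvar(errors, alpha=0.95):
    # Average of the worst (1 - alpha) fraction of outcomes.
    var = np.quantile(errors, alpha)
    return errors[errors >= var].mean()

rng = np.random.default_rng(0)
e = rng.normal(0.0, 1.0, size=100_000)  # stand-in hedging errors
print(f"MSE={mse(e):.3f}  downside={downside_risk(e):.3f}  CVaR95={cvar(e):.3f}")
```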


A Topological Approach to Parameterizing Deep Hedging Networks

Das, Alok, Lee, Kiseop

arXiv.org Artificial Intelligence

The classical hedging problem entails replicating the payoff of a contingent claim under a given stochastic model. While a complete hedging strategy exists in a complete market such as Black-Scholes, markets are in general incomplete, as in jump-diffusion and stochastic volatility models. Several hedging approaches exist for incomplete markets, but it is often very difficult to obtain a closed-form solution or even to compute one numerically. Even in a complete market like Black-Scholes, the strategy has drawbacks in both execution and the theory it rests on: traditional asset pricing and hedging methods assume frictionless markets, perfect liquidity, and normally distributed returns, among many other conditions.
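
For reference, the complete-market benchmark mentioned above is explicit: in the Black-Scholes model the replicating strategy holds Delta = N(d1) units of the underlying. A minimal sketch of this standard formula (not code from the paper):

```python
# Black-Scholes delta of a European call: Delta = N(d1).
from math import log, sqrt, erf

def norm_cdf(x):
    return 0.5 * (1.0 + erf(x / sqrt(2.0)))

def bs_call_delta(S, K, T, r, sigma):
    """S: spot, K: strike, T: time to maturity (years),
    r: risk-free rate, sigma: volatility."""
    d1 = (log(S / K) + (r + 0.5 * sigma**2) * T) / (sigma * sqrt(T))
    return norm_cdf(d1)

# Hedging an at-the-money call: hold Delta units of the underlying.
print(bs_call_delta(S=100, K=100, T=0.5, r=0.02, sigma=0.2))  # ~0.556
```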


Application of Deep Reinforcement Learning to At-the-Money S&P 500 Options Hedging

Bracha, Zofia, Sakowski, Paweł, Michańków, Jakub

arXiv.org Artificial Intelligence

This paper explores the application of deep reinforcement learning to hedging at-the-money options on the S&P 500 index. We develop an agent based on the Twin Delayed Deep Deterministic Policy Gradient (TD3) algorithm, trained to simulate hedging decisions without making explicit model assumptions on price dynamics. The agent was trained on historical intraday prices of S&P 500 call options across the years 2004-2024, using a single time series of six predictor variables: option price, underlying asset price, moneyness, time to maturity, realized volatility, and current hedge position. A walk-forward procedure was applied for training, which yielded nearly 17 years of out-of-sample evaluation. The performance of the deep reinforcement learning (DRL) agent is benchmarked against the Black-Scholes delta-hedging strategy over the same period. We assess both approaches using metrics such as annualized return, volatility, information ratio, and Sharpe ratio. To test the models' adaptability, we performed simulations across varying market conditions and added constraints such as transaction costs and risk-awareness penalties. Our results show that the DRL agent can outperform traditional hedging methods, particularly in volatile or high-cost environments, highlighting its robustness and flexibility in practical trading contexts. While the agent consistently outperforms delta-hedging, its performance deteriorates as the risk-awareness parameter increases. We also observed that the longer the time interval used for volatility estimation, the more stable the results.
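
The evaluation metrics listed above are standard; a minimal sketch of one plausible implementation over a daily return series follows. The day-count convention (252), the risk-free rate, and the use of the delta-hedge series as the information-ratio benchmark are assumptions, not details from the paper.

```python
# Sketch of annualized return, volatility, Sharpe ratio, and information
# ratio from daily P&L series of the agent and a benchmark strategy.
import numpy as np

def metrics(returns, benchmark, periods_per_year=252, rf=0.0):
    ann_ret = np.mean(returns) * periods_per_year
    ann_vol = np.std(returns, ddof=1) * np.sqrt(periods_per_year)
    active = returns - benchmark  # agent vs. delta-hedge benchmark
    info_ratio = (np.mean(active) * periods_per_year) / (
        np.std(active, ddof=1) * np.sqrt(periods_per_year))
    return {"ann_return": ann_ret, "ann_vol": ann_vol,
            "sharpe": (ann_ret - rf) / ann_vol,
            "information_ratio": info_ratio}

rng = np.random.default_rng(1)
agent = rng.normal(4e-4, 0.01, 4250)  # ~17 years of daily returns (stand-in)
delta = rng.normal(2e-4, 0.01, 4250)
print(metrics(agent, delta))
```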



Deep Reinforcement Learning Algorithms for Option Hedging

Neagu, Andrei, Godin, Frédéric, Kosseim, Leila

arXiv.org Artificial Intelligence

Dynamic hedging is a financial strategy that consists in periodically transacting one or more financial assets to offset the risk associated with a correlated liability. Deep Reinforcement Learning (DRL) algorithms have been used to find optimal solutions to dynamic hedging problems by framing them as sequential decision-making problems. However, most previous work assesses the performance of only one or two DRL algorithms, making an objective comparison across algorithms difficult. In this paper, we compare the performance of eight DRL algorithms in the context of dynamic hedging: Monte Carlo Policy Gradient (MCPG) and Proximal Policy Optimization (PPO), along with four variants of Deep Q-Learning (DQL) and two variants of Deep Deterministic Policy Gradient (DDPG). Two of these variants represent a novel application to the task of dynamic hedging. In our experiments, we use the Black-Scholes delta hedge as a baseline and simulate the dataset using a GJR-GARCH(1,1) model. Results show that MCPG, followed by PPO, obtains the best performance in terms of the root semi-quadratic penalty. Moreover, MCPG is the only algorithm to outperform the Black-Scholes delta hedge baseline within the allotted computational budget, possibly due to the sparsity of rewards in our environment.
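
A sketch of the root semi-quadratic penalty used as the performance metric, under the assumption (common in this line of work) that only hedging shortfalls are penalized, i.e. RSQP = sqrt(E[max(e, 0)^2]) over terminal hedging errors e; the paper's exact convention may differ.

```python
# Root semi-quadratic penalty: quadratic in shortfalls, zero for gains.
import numpy as np

def root_semi_quadratic_penalty(hedging_errors):
    shortfalls = np.maximum(hedging_errors, 0.0)  # gains are not rewarded
    return np.sqrt(np.mean(shortfalls ** 2))

rng = np.random.default_rng(7)
e = rng.standard_t(df=5, size=50_000) * 0.3  # stand-in fat-tailed errors
print(root_semi_quadratic_penalty(e))
```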


Review for NeurIPS paper: Hedging in games: Faster convergence of external and swap regrets

Neural Information Processing Systems

Summary and Contributions: Standard low-regret algorithms guarantee O(sqrt(T)) regret after T rounds (which is tight). One common application of low-regret algorithms is to play an n-action game (in settings like this, it is known that if all players are running low-regret algorithms, then their empirical strategies will converge to specific types of equilibria for the game). It was shown in a series of works that in this more structured setting, it is possible to design algorithms with better regret guarantees; in particular, Syrgkanis et al. show that an algorithm known as "Optimistic Hedge" (a generalization of the standard "Hedge" / multiplicative-weights algorithm) achieves regret bounds on the order of O(T^{1/4}) when both players in a two-player game play it. This paper examines the Optimistic Hedge algorithm in further detail, significantly improving the bounds shown by Syrgkanis et al. Specifically, this paper: 1. Shows that if both players in a two-player game run Optimistic Hedge, each player's regret is at most O(T^{1/6}). These improvements rely on the fact that Optimistic Hedge converges quickly if the loss vectors and strategies it outputs are relatively stable.
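
For concreteness, Optimistic Hedge is standard Hedge with a one-step optimistic prediction that the next loss vector equals the last one; a common form of the update is x_{t+1} proportional to x_t * exp(-eta * (2 * loss_t - loss_{t-1})). A minimal sketch (illustrative, not the reviewed paper's code):

```python
# Optimistic Hedge: multiplicative weights with the last loss vector
# used as an optimistic prediction of the next one.
import numpy as np

def optimistic_hedge(losses, eta=0.1):
    """losses: (T, n) array of per-round loss vectors; yields strategies."""
    T, n = losses.shape
    prev = np.zeros(n)
    log_w = np.zeros(n)
    for t in range(T):
        x = np.exp(log_w - log_w.max())        # stabilized softmax weights
        yield x / x.sum()                      # strategy played in round t
        log_w -= eta * (2.0 * losses[t] - prev)  # optimistic update
        prev = losses[t]

rng = np.random.default_rng(3)
L = rng.random((1000, 4))
strategies = list(optimistic_hedge(L))
regret = sum(l @ x for l, x in zip(L, strategies)) - L.sum(axis=0).min()
print(f"external regret after 1000 rounds: {regret:.2f}")
```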


Fast Deep Hedging with Second-Order Optimization

Mueller, Konrad, Akkari, Amira, Gonon, Lukas, Wood, Ben

arXiv.org Artificial Intelligence

Hedging exotic options in the presence of market frictions is an important risk management task. Deep hedging can solve such hedging problems by training neural network policies in realistic simulated markets. Training these neural networks can be delicate and suffer from slow convergence, particularly for options with long maturities and complex sensitivities to market parameters. To address this, we propose a second-order optimization scheme for deep hedging. We leverage pathwise differentiability to construct a curvature matrix, which we approximate as block-diagonal and Kronecker-factored to efficiently precondition gradients. We evaluate our method on a challenging and practically important problem: hedging a cliquet option on a stock with stochastic volatility by trading in the spot and vanilla options. We find that our second-order scheme can optimize the policy in a quarter of the number of steps that standard adaptive moment-based optimization requires.
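
A minimal sketch of the Kronecker-factored preconditioning idea for a single dense layer: the layer's curvature block is approximated as a Kronecker product A ⊗ S, so the preconditioned gradient is A^{-1} G S^{-1} (with damping). How the paper estimates the factors from pathwise derivatives is not shown; the covariance-style factors below are stand-ins.

```python
# Kronecker-factored preconditioning of one layer's gradient: solve two
# small linear systems instead of inverting the full d_in*d_out curvature block.
import numpy as np

def kfac_precondition(G, A, S, damping=1e-3):
    """G: (d_in, d_out) weight gradient; A: (d_in, d_in) input-side factor;
    S: (d_out, d_out) output-side factor. Returns A^{-1} G S^{-1}."""
    A_d = A + damping * np.eye(A.shape[0])
    S_d = S + damping * np.eye(S.shape[0])
    return np.linalg.solve(A_d, np.linalg.solve(S_d.T, G.T).T)

rng = np.random.default_rng(5)
X = rng.normal(size=(256, 8))            # layer inputs over a batch
Dy = rng.normal(size=(256, 4))           # backpropagated output gradients
G = X.T @ Dy / 256                       # weight gradient
A = X.T @ X / 256                        # input covariance factor
S = Dy.T @ Dy / 256                      # output-gradient covariance factor
print(kfac_precondition(G, A, S).shape)  # (8, 4)
```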