A Risk-Aware Reinforcement Learning Reward for Financial Trading

Srivastava, Uditansh, Aryan, Shivam, Singh, Shaurya

arXiv.org Artificial Intelligence 

We propose a novel composite reward function for a reinforcement learning (RL) trading agent that explicitly balances return and risk by combining four differentiable components: annualized return, downside risk, differential return, and the Treynor ratio. Unlike traditional single-metric objectives (e.g., Sharpe ratio or cumulative return), which can encourage reward hacking or over-optimization of one aspect of trading, our formulation is inherently modular and weighted w
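
To make the idea of a weighted, modular reward concrete, the following is a minimal sketch in plain Python of how the four components named in the abstract could be combined into a single scalar. It is not the authors' formulation. The function names, the equal default weights, the sign convention (downside risk subtracted, the other three terms added), the 252-period annualization, and the reading of "differential return" as the annualized mean excess return over a benchmark are all assumptions made for illustration. The sketch also uses ordinary floats, so it does not demonstrate the differentiability the abstract mentions.

```python
import math
from statistics import mean


def annualized_return(returns, periods_per_year=252):
    # Geometric growth over the window, rescaled to one year.
    growth = math.prod(1.0 + r for r in returns)
    return growth ** (periods_per_year / len(returns)) - 1.0


def downside_risk(returns, target=0.0, periods_per_year=252):
    # Annualized root-mean-square of shortfalls below the target return.
    shortfalls = [min(r - target, 0.0) ** 2 for r in returns]
    return math.sqrt(mean(shortfalls) * periods_per_year)


def beta(returns, benchmark):
    # Population covariance with the benchmark over benchmark variance.
    # Raises ZeroDivisionError if the benchmark is constant.
    mr, mb = mean(returns), mean(benchmark)
    cov = mean((r - mr) * (b - mb) for r, b in zip(returns, benchmark))
    var = mean((b - mb) ** 2 for b in benchmark)
    return cov / var


def differential_return(returns, benchmark, periods_per_year=252):
    # ASSUMPTION: mean per-period excess return over the benchmark, annualized.
    return mean(r - b for r, b in zip(returns, benchmark)) * periods_per_year


def treynor_ratio(returns, benchmark, risk_free=0.0, periods_per_year=252):
    # Annualized excess return per unit of systematic (beta) risk.
    excess = annualized_return(returns, periods_per_year) - risk_free
    return excess / beta(returns, benchmark)


def composite_reward(returns, benchmark, weights=(0.25, 0.25, 0.25, 0.25),
                     risk_free=0.0, periods_per_year=252):
    # ASSUMPTION: hypothetical weights and signs; downside risk is a penalty.
    w_ret, w_down, w_diff, w_trey = weights
    return (
        w_ret * annualized_return(returns, periods_per_year)
        - w_down * downside_risk(returns, 0.0, periods_per_year)
        + w_diff * differential_return(returns, benchmark, periods_per_year)
        + w_trey * treynor_ratio(returns, benchmark, risk_free, periods_per_year)
    )
```

Because each component is a separate function, setting a weight to zero removes that term, and a call such as `composite_reward(agent_returns, index_returns, weights=(0.4, 0.3, 0.2, 0.1))` (with hypothetical input series) shifts the balance toward return or risk without changing any other code.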