We address a portfolio selection problem that combines active (outperformance) and passive (tracking) objectives. We assume a general semimartingale market model where the assets' growth rate processes are driven by a latent factor. Using techniques from convex analysis, we obtain a closed-form solution for the optimal portfolio and provide a theorem establishing its uniqueness. The motivation for incorporating latent factors is to achieve improved growth rate estimation, an otherwise notoriously difficult task. To this end, we focus on a model where growth rates are driven by an unobservable Markov chain. The solution in this case requires a filtering step to obtain posterior probabilities for the state of the Markov chain from asset price information, which are subsequently used to find the optimal allocation. We show that the optimal strategy is the posterior average of the optimal strategies the investor would have held in each state assuming the Markov chain remains in that state. Finally, we implement a number of historical backtests to demonstrate the performance of the optimal portfolio.
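The filtering and posterior-averaging steps described in this abstract can be illustrated with a minimal sketch. It is a discrete-time simplification, not the paper's continuous-time filter: it assumes Gaussian returns whose mean depends on the hidden state, a known transition matrix, and per-state optimal weights supplied by the caller. The names `filter_step`, `posterior_average`, `state_means`, and `state_strategies` are hypothetical.

```python
import numpy as np

def filter_step(prior, transition, returns, state_means, cov):
    """One discrete-time Bayes update of the hidden-state probabilities.

    Assumption (not from the paper): returns are Gaussian with a
    state-dependent mean `state_means[k]` and a common covariance `cov`.
    """
    # Predict: propagate yesterday's posterior through the Markov chain.
    predicted = prior @ transition
    # Update: Gaussian log-likelihood of the observed returns in each state.
    cov_inv = np.linalg.inv(cov)
    resid = returns - state_means                      # shape (K, d)
    log_lik = -0.5 * np.einsum("kd,de,ke->k", resid, cov_inv, resid)
    log_post = np.log(np.maximum(predicted, 1e-300)) + log_lik
    log_post -= log_post.max()                         # numerical stability
    post = np.exp(log_post)
    return post / post.sum()

def posterior_average(posterior, state_strategies):
    """Blend the per-state optimal weights using the posterior probabilities."""
    return posterior @ state_strategies                # shape (d,)

# Two hidden states, one asset return observed, two-asset strategies.
prior = np.array([0.5, 0.5])
transition = np.eye(2)                                 # chain stays put (toy case)
state_means = np.array([[0.0], [1.0]])
cov = np.array([[1.0]])
post = filter_step(prior, transition, np.array([1.0]), state_means, cov)
weights = posterior_average(post, np.array([[1.0, 0.0], [0.0, 1.0]]))
```

In this toy case the observed return matches the second state's mean, so the posterior and the blended weights tilt toward the second state's strategy.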
Optimal trade execution is an important problem faced by essentially all traders. Much research into optimal execution imposes stringent model assumptions and applies continuous-time stochastic control to solve the resulting problems. Here, we instead take a model-free approach and develop a variation of Deep Q-Learning to estimate the optimal actions of a trader. The model is a fully connected neural network trained using Experience Replay and Double DQN, with input features given by the current state of the limit order book, other trading signals, and available execution actions, while the output is the Q-value function estimating the future rewards under an arbitrary action. We apply our model to nine different stocks and find that it outperforms the standard benchmark approach on most stocks using the measures of (i) mean and median out-performance, (ii) probability of out-performance, and (iii) gain-loss ratios.
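For readers unfamiliar with Double DQN, the following is a minimal sketch of how its training targets are formed: the online network selects the next action and the target network evaluates it. It assumes the Q-values are already available as arrays with one column per action, which differs from the architecture described above (where the action is an input feature). The function name `double_dqn_targets` and all numbers are hypothetical.

```python
import numpy as np

def double_dqn_targets(rewards, dones, q_online_next, q_target_next, gamma=0.99):
    """Compute Double DQN regression targets for a batch of transitions.

    q_online_next, q_target_next: arrays of shape (batch, n_actions) holding
    Q-values of the next state under the online and target networks.
    """
    # Action selection uses the online network ...
    best_actions = np.argmax(q_online_next, axis=1)
    # ... while action evaluation uses the target network.
    next_values = q_target_next[np.arange(len(rewards)), best_actions]
    # Terminal transitions (dones == 1) do not bootstrap.
    return rewards + gamma * (1.0 - dones) * next_values

# Toy batch of two transitions sampled from a replay buffer.
targets = double_dqn_targets(
    rewards=np.array([1.0, 2.0]),
    dones=np.array([0.0, 1.0]),
    q_online_next=np.array([[0.1, 0.9], [0.5, 0.2]]),
    q_target_next=np.array([[3.0, 4.0], [5.0, 6.0]]),
    gamma=0.5,
)
```

Separating selection from evaluation is what distinguishes Double DQN from the plain DQN target, which takes the maximum over the target network's own Q-values.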
We present a simple model of a non-equilibrium self-organizing market where asset prices are partially driven by investment decisions of a bounded-rational agent. The agent acts in a stochastic market environment driven by various exogenous "alpha" signals, the agent's own actions (via market impact), and noise. Unlike traditional agent-based models, our agent aggregates all traders in the market, rather than being a representative agent. Therefore, it can be identified with a bounded-rational component of the market itself, providing a particular implementation of an Invisible Hand market mechanism. In such a setting, market dynamics are modeled as a fictitious self-play of such a bounded-rational market-agent in its adversarial stochastic environment. As rewards obtained by such a self-playing market agent are not observed from market data, we formulate and solve a simple model of such market dynamics based on a neuroscience-inspired Bounded Rational Information Theoretic Inverse Reinforcement Learning (BRIT-IRL). This results in effective asset price dynamics with a non-linear mean reversion, which in our model is generated dynamically rather than being postulated. We argue that our model can be used in a similar way to the Black-Litterman model. In particular, it represents, in a simple modeling framework, market views of common predictive signals, market impacts and implied optimal dynamic portfolio allocations, and can be used to assess values of private signals. Moreover, it allows one to quantify a "market-implied" optimal investment strategy, along with a measure of market rationality. Our approach is numerically light, and can be implemented using standard off-the-shelf software such as TensorFlow.
We develop a methodology for detecting asset bubbles using a neural network. We rely on the theory of local martingales in continuous time and use a deep network to estimate the diffusion coefficient of the price process more accurately than the current estimator, obtaining an improved detection of bubbles. We show the outperformance of our algorithm over the existing statistical method in a laboratory created with simulated data. We then apply the network classification to real data and build a zero net exposure trading strategy that exploits the risky arbitrage emanating from the presence of bubbles in the US equity market from 2006 to 2008. The profitability of the strategy provides an estimation of the economic magnitude of bubbles as well as support for the theoretical assumptions relied on.
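To make the link between the diffusion coefficient and bubble detection concrete, here is a minimal sketch. It relies on a standard result that the abstract does not state explicitly: for a time-homogeneous diffusion dS = sigma(S) dW under a risk-neutral measure, the price is a strict local martingale (a bubble) exactly when the integral of x / sigma(x)^2 converges at infinity. The sketch swaps the paper's neural network for a simple parametric fit sigma(x) = c * x**alpha, for which the integral converges if and only if alpha > 1. The function names and the toy data are hypothetical.

```python
import numpy as np

def fit_power_law_vol(prices, local_vols):
    """Fit sigma(x) = c * x**alpha by least squares in log-log space."""
    alpha, log_c = np.polyfit(np.log(prices), np.log(local_vols), 1)
    return float(alpha), float(np.exp(log_c))

def is_bubble_power_law(alpha):
    """For sigma(x) = c * x**alpha, x / sigma(x)**2 behaves like x**(1 - 2*alpha),
    whose integral converges at infinity exactly when alpha > 1."""
    return alpha > 1.0

# Toy data: volatility estimates that grow faster than linearly in the price.
prices = np.linspace(1.0, 10.0, 50)
local_vols = 0.3 * prices ** 1.5
alpha, c = fit_power_law_vol(prices, local_vols)
flag = is_bubble_power_law(alpha)
```

A parametric fit like this is only an illustration of the classification rule; the quality of the decision depends entirely on how well the diffusion coefficient is estimated at high price levels, which is the problem the neural network estimator targets.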