Bias-Variance Trade-off and Overlearning in Dynamic Decision Problems
Reppen, A. Max; Soner, H. Mete
Recent advances in the training of neural networks make high-dimensional numerical studies feasible for decision problems in uncertain environments. Although reinforcement learning has been widely used in optimal control for several decades [6], only recently have Han and E [18] and Han et al. [20] combined it with Monte Carlo type regression for the off-line construction of optimal feedback actions. In these problems, the randomness and the state are observable, and a training set based on historical or simulated data is readily available. One then approximates the objective function of the problem by its empirical average over the training data, which yields a loss function to be minimized over the network parameters. The trained network is the minimizer, or a near-minimizer, of this loss, and it approximates the optimal feedback action.
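The construction described above can be illustrated with a minimal sketch. Everything specific here is an assumption made for illustration and is not taken from the paper: a one-period problem with state `x` and randomness `z`, a hypothetical cost `(x + a + z)^2`, and a two-parameter linear feedback action standing in for the neural network. The empirical average of the cost over a simulated training set serves as the loss, which is minimized by plain gradient descent.

```python
import random

def simulate(n, seed):
    """Draw n hypothetical samples of (state x, randomness z)."""
    rng = random.Random(seed)
    xs = [rng.uniform(-1.0, 1.0) for _ in range(n)]
    zs = [rng.gauss(0.0, 1.0) for _ in range(n)]
    return xs, zs

def action(theta, x):
    """Linear feedback action a(x) = theta[0] + theta[1] * x.
    Assumption: a stand-in for the neural network in the text."""
    return theta[0] + theta[1] * x

def empirical_loss(theta, xs, zs):
    """Empirical average of the assumed cost (x + a(x) + z)^2."""
    total = sum((x + action(theta, x) + z) ** 2 for x, z in zip(xs, zs))
    return total / len(xs)

def train(xs, zs, steps=500, lr=0.1):
    """Minimize the empirical loss over theta by gradient descent."""
    theta = [0.0, 0.0]
    n = len(xs)
    for _ in range(steps):
        g0 = g1 = 0.0
        for x, z in zip(xs, zs):
            r = 2.0 * (x + action(theta, x) + z)  # derivative of the squared cost
            g0 += r
            g1 += r * x
        theta = [theta[0] - lr * g0 / n, theta[1] - lr * g1 / n]
    return theta

# Training set of simulated data and the resulting trained feedback action.
xs, zs = simulate(1000, seed=0)
theta_hat = train(xs, zs)
in_sample = empirical_loss(theta_hat, xs, zs)

# Fresh data, not used in training, for an out-of-sample evaluation.
xs_new, zs_new = simulate(1000, seed=1)
out_of_sample = empirical_loss(theta_hat, xs_new, zs_new)

print("trained parameters:", theta_hat)
print("in-sample loss:", in_sample)
print("out-of-sample loss:", out_of_sample)
```

In this toy setting the true optimal action is `a(x) = -x`, since `z` has mean zero and is independent of `x`. The trained parameters should land near `[0, -1]`. Because `theta_hat` minimizes the empirical loss, its in-sample loss cannot exceed that of the true optimum on the same data, which is why evaluating on fresh samples is the more informative check.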
Nov-18-2020
- Country:
  - North America > United States > New Jersey > Mercer County > Princeton (0.04)
  - North America > United States > Massachusetts > Suffolk County > Boston (0.04)
  - Europe > United Kingdom > England > Cambridgeshire > Cambridge (0.04)
  - Europe > Germany > Bavaria > Upper Bavaria > Munich (0.04)
- Genre:
  - Research Report > New Finding (1.00)
- Industry:
  - Banking & Finance (0.68)
- Technology: