Well File:
- Well Planning ( results)
- Shallow Hazard Analysis ( results)
- Well Plat ( results)
- Wellbore Schematic ( results)
- Directional Survey ( results)
- Fluid Sample ( results)
- Log ( results)
- Density ( results)
- Gamma Ray ( results)
- Mud ( results)
- Resistivity ( results)
- Report ( results)
- Daily Report ( results)
- End of Well Report ( results)
- Well Completion Report ( results)
- Rock Sample ( results)
Nan Ding
Cold-Start Reinforcement Learning with Softmax Policy Gradient
Nan Ding, Radu Soricut
Policy-gradient approaches to reinforcement learning have two common and undesirable overhead procedures, namely warm-start training and sample variance reduction. In this paper, we describe a reinforcement learning method based on a softmax value function that requires neither of these procedures. Our method combines the advantages of policy-gradient methods with the efficiency and simplicity of maximum-likelihood approaches. We apply this new cold-start reinforcement learning method in training sequence generation models for structured output prediction problems.
Stochastic Gradient MCMC with Stale Gradients
Changyou Chen, Nan Ding, Chunyuan Li, Yizhe Zhang, Lawrence Carin
Stochastic gradient MCMC (SG-MCMC) has played an important role in largescale Bayesian learning, with well-developed theoretical convergence properties. In such applications of SG-MCMC, it is becoming increasingly popular to employ distributed systems, where stochastic gradients are computed based on some outdated parameters, yielding what are termed stale gradients. While stale gradients could be directly used in SG-MCMC, their impact on convergence properties has not been well studied. In this paper we develop theory to show that while the bias and MSE of an SG-MCMC algorithm depend on the staleness of stochastic gradients, its estimation variance (relative to the expected estimate, based on a prescribed number of samples) is independent of it. In a simple Bayesian distributed system with SG-MCMC, where stale gradients are computed asynchronously by a set of workers, our theory indicates a linear speedup on the decrease of estimation variance w.r.t. the number of workers.