Jason D. Lee
Gradient Descent Can Take Exponential Time to Escape Saddle Points
Simon S. Du, Chi Jin, Jason D. Lee, Michael I. Jordan, Aarti Singh, Barnabas Poczos
Although gradient descent (GD) almost always escapes saddle points asymptotically [Lee et al., 2016], this paper shows that even with fairly natural random initialization schemes and non-pathological functions, GD can be significantly slowed down by saddle points, taking exponential time to escape. On the other hand, gradient descent with perturbations [Ge et al., 2015, Jin et al., 2017] is not slowed down by saddle points: it can find an approximate local minimizer in polynomial time. This result implies that GD is inherently slower than perturbed GD, and justifies the importance of adding perturbations for efficient non-convex optimization. While our focus is theoretical, we also present experiments that illustrate our theoretical findings.
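The effect of perturbations can be seen in a toy sketch. This is not the paper's construction or its experiments: the function f(x, y) = 0.5*x^2 + 0.25*y^4 - 0.5*y^2, the step size, the noise radius, and the gradient threshold are all assumptions chosen for illustration, and the starting point is placed exactly on the saddle's stable manifold (a measure-zero case) to make the stall visible. The perturbation rule is a simplified stand-in for perturbed GD, not the exact algorithm of Jin et al.

```python
import random

def grad(x, y):
    # Gradient of the toy function f(x, y) = 0.5*x**2 + 0.25*y**4 - 0.5*y**2,
    # which has a saddle at (0, 0) and minima at (0, 1) and (0, -1).
    return x, y**3 - y

def gradient_descent(x, y, step=0.1, iters=500, noise=0.0, seed=0):
    """Plain GD when noise == 0; otherwise a simplified perturbed GD."""
    rng = random.Random(seed)
    for _ in range(iters):
        gx, gy = grad(x, y)
        # Assumed rule: when the gradient is nearly zero, jump to a random
        # nearby point before taking the next gradient step.
        if noise > 0.0 and (gx * gx + gy * gy) ** 0.5 < 1e-3:
            x += rng.uniform(-noise, noise)
            y += rng.uniform(-noise, noise)
            gx, gy = grad(x, y)
        x -= step * gx
        y -= step * gy
    return x, y

# Start at (1, 0): the y-coordinate is exactly 0, so plain GD slides along
# the x-axis into the saddle and never leaves it.
plain = gradient_descent(1.0, 0.0)
perturbed = gradient_descent(1.0, 0.0, noise=1e-2)
print("plain GD:    ", plain)
print("perturbed GD:", perturbed)
```

In this sketch plain GD ends at the saddle, while the perturbed variant ends close to one of the two minima; which one depends on the random seed.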
Adding One Neuron Can Eliminate All Bad Local Minima
Shiyu Liang, Ruoyu Sun, Jason D. Lee, R. Srikant
One of the main difficulties in analyzing neural networks is the non-convexity of the loss function which may have many bad local minima. In this paper, we study the landscape of neural networks for binary classification tasks. Under mild assumptions, we prove that after adding one special neuron with a skip connection to the output, or one special neuron per layer, every local minimum is a global minimum.
Implicit Bias of Gradient Descent on Linear Convolutional Networks
Suriya Gunasekar, Jason D. Lee, Daniel Soudry, Nati Srebro
Large-scale neural networks used in practice are highly over-parameterized, with far more trainable model parameters than training examples. Consequently, optimization objectives for learning such high-capacity models have many global minima that fit the training data perfectly. However, minimizing the training loss with a specific optimization algorithm takes us not to just any global minimum, but to a special one, e.g., a global minimum that minimizes some regularizer R(β). In over-parameterized models, especially deep neural networks, much, if not most, of the inductive bias of the learned model comes from this implicit regularization by the optimization algorithm. Understanding the implicit bias, e.g., by characterizing R(β), is thus essential for understanding how and what the model learns.
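A minimal sketch of implicit bias in the simplest setting, which is not the linear convolutional setting the paper studies: gradient descent on an underdetermined least-squares problem. It relies on the standard fact that GD started at zero converges to the interpolating solution of minimum Euclidean norm, so here R(β) is the l2 norm. The data a, b, the step size, and the helper name gd_least_squares are assumptions made for illustration.

```python
def gd_least_squares(a, b, beta0, step=0.1, iters=200):
    """Gradient descent on 0.5 * (a . beta - b)**2 for one equation in several unknowns."""
    beta = list(beta0)
    for _ in range(iters):
        residual = sum(ai * bi for ai, bi in zip(a, beta)) - b
        # The gradient is residual * a, so every update moves along a only.
        beta = [bi - step * residual * ai for ai, bi in zip(a, beta)]
    return beta

# One equation, two unknowns: every beta with beta[0] + 2*beta[1] == 5
# is a global minimum with zero training loss.
a, b = [1.0, 2.0], 5.0

from_zero = gd_least_squares(a, b, [0.0, 0.0])   # approaches (1, 2), the minimum-norm solution
from_other = gd_least_squares(a, b, [3.0, 0.0])  # approaches (3.4, 0.8), also zero loss

def norm(v):
    return sum(vi * vi for vi in v) ** 0.5

print(from_zero, norm(from_zero))
print(from_other, norm(from_other))
```

Both runs fit the data exactly, but they reach different global minima; only the zero-initialized run reaches the one with the smallest norm. The objective alone does not pick it out; the algorithm and its initialization do.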