Gradients support in PyTorch

#artificialintelligence

In this article by Maxim Lapan, the author of Deep Reinforcement Learning Hands-On, we are going to discuss gradients in PyTorch. Gradient support in tensors is one of the major changes in PyTorch 0.4.0. In previous versions, graph tracking and gradient accumulation were done in a separate, very thin class, Variable, which wrapped the tensor and automatically saved the history of computations so that it could be backpropagated. Now gradients are a built-in tensor property, which makes the API much cleaner. Gradient support was originally implemented in the Caffe toolkit and then became the de-facto standard in DL libraries.
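As a minimal sketch of the post-0.4.0 API (illustrative values, not code from the article itself), gradients can now be requested directly on a tensor, with no Variable wrapper:

    import torch

    # In PyTorch >= 0.4.0 any tensor can track gradients directly.
    v1 = torch.tensor([1.0, 1.0], requires_grad=True)
    v2 = torch.tensor([2.0, 2.0])   # leaf tensor without gradient tracking

    # Operations on v1 are recorded in the computation graph.
    v_sum = v1 + v2
    v_res = (v_sum * 2).sum()

    v_res.backward()   # backpropagate through the recorded graph
    print(v1.grad)     # tensor([2., 2.]) -- d(v_res)/d(v1)
    print(v2.grad)     # None -- v2 did not require gradients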



Generative Adversarial Networks Explained – Towards Data Science

#artificialintelligence

Deep learning has changed the way we work, compute and has made our lives a lot easier. As Andrej Karpathy mentioned it is indeed the software 2.0, as we have taught machines to figure things out themselves. There are many existing deep learning techniques which can be ascribed to its prolific success. But no major impact has been created by deep generative models, which is due to their inability to approximate intractable probabilistic computations. Ian Goodfellow was able to find a solution that could sidestep these difficulties faced by generative models and created a new ingenious model called Generative Adversarial Networks.
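For reference, the adversarial game Goodfellow proposed can be summarized by the standard minimax objective from the original GAN paper (not quoted from this article): a discriminator D learns to tell real samples from generated ones, while a generator G learns to fool it.

    \min_G \max_D V(D, G) =
        \mathbb{E}_{x \sim p_{\text{data}}(x)}\left[\log D(x)\right]
      + \mathbb{E}_{z \sim p_z(z)}\left[\log\left(1 - D(G(z))\right)\right]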


Machine Learning Best Algorithms: Gradient Boosting Machines (GBM)

#artificialintelligence

We'll have a main talk (30 mins) and 3 excellent lightning talks about the machine learning algorithm that usually achieves the best accuracy on structured/tabular data (e.g. in industry/business applications or in Kaggle competitions).

Abstract: With all the hype about deep learning and "AI", it is not well publicized that for the structured/tabular data widely encountered in business applications it is actually another machine learning algorithm, the gradient boosting machine (GBM), that most often achieves the highest accuracy in supervised learning tasks. In this talk we'll review some of the main GBM implementations available as R and Python packages (xgboost, h2o, lightgbm etc.), discuss some of their main features and characteristics, and see how tuning GBMs and creating ensembles of the best models can achieve the best prediction accuracy for many business problems.

Bio: Szilard studied Physics in the 90s and obtained a PhD by using statistical methods to analyze the risk of financial portfolios. He worked in finance, then more than a decade ago became the Chief Scientist of a tech company in Santa Monica, doing everything data (analysis, modeling, data visualization, machine learning, data infrastructure etc.). He is the founder/organizer of several meetups in the Los Angeles area (R, data science etc.) and of the data science community website datascience.la.
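As a hedged sketch of the kind of workflow the talk covers (the dataset and hyperparameter values below are illustrative placeholders, not the speaker's), here is a minimal GBM fit using xgboost's scikit-learn wrapper:

    from sklearn.datasets import load_breast_cancer
    from sklearn.model_selection import train_test_split
    from xgboost import XGBClassifier

    # Illustrative tabular dataset; any structured data works the same way.
    X, y = load_breast_cancer(return_X_y=True)
    X_train, X_test, y_train, y_test = train_test_split(X, y, random_state=42)

    # Typical knobs to tune: number of trees, depth, learning rate, subsampling.
    model = XGBClassifier(
        n_estimators=300,
        max_depth=4,
        learning_rate=0.1,
        subsample=0.8,
    )
    model.fit(X_train, y_train)
    print("test accuracy:", model.score(X_test, y_test))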


Deep Learning Best Practices – Weight Initialization

#artificialintelligence

There are lots of small best practices, ranging from simple tricks like weight initialization and regularization to slightly more complex techniques like cyclic learning rates, that can make training and debugging neural nets easier and more efficient. This inspired me to write this series of blogs, in which I will cover as many nuances as I can to make implementing deep learning simpler for you. This blog assumes that you have a basic idea of how neural networks are trained; an understanding of weights, biases, hidden layers, activations and activation functions will make the content clearer. I would recommend this course if you wish to build a basic foundation in deep learning.
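As a hedged illustration of one such practice (not the blog's own code; the layer sizes are arbitrary), here is how He/Xavier initialization might be applied to a small PyTorch network:

    import torch.nn as nn

    model = nn.Sequential(
        nn.Linear(784, 256),
        nn.ReLU(),
        nn.Linear(256, 10),
    )

    def init_weights(module):
        # He (Kaiming) initialization suits ReLU activations;
        # Xavier (Glorot) is a common default for tanh/sigmoid.
        if isinstance(module, nn.Linear):
            nn.init.kaiming_normal_(module.weight, nonlinearity="relu")
            nn.init.zeros_(module.bias)

    model.apply(init_weights)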


How to Do Distributed Deep Learning for Object Detection Using Horovod on Azure

#artificialintelligence

This post is co-authored by Mary Wahl, Data Scientist, Xiaoyong Zhu, Program Manager, Siyu Yang, Software Development Engineer, and Wee Hyong Tok, Principal Data Scientist Manager, at Microsoft. Object detection powers some of the most widely adopted computer vision applications, from people counting in crowd control to pedestrian detection used by self-driving cars. Training an object detection model can take weeks on a single GPU, a prohibitively long time for experimenting with hyperparameters and model architectures. This blog post shows how you can train an object detection model by distributing deep learning training across multiple GPUs, whether they sit on a single machine or on several machines.
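As a hedged sketch of the Horovod pattern the post relies on (the toy model below is a placeholder, not the authors' object detection network):

    import torch
    import horovod.torch as hvd

    hvd.init()                                # one process per GPU
    torch.cuda.set_device(hvd.local_rank())   # pin each process to its GPU

    # Toy stand-in model; the post trains a much larger detection network.
    model = torch.nn.Linear(10, 2).cuda()
    optimizer = torch.optim.SGD(model.parameters(), lr=0.01 * hvd.size())

    # Average gradients across all workers on each optimization step.
    optimizer = hvd.DistributedOptimizer(
        optimizer, named_parameters=model.named_parameters()
    )
    # Start every worker from identical weights and optimizer state.
    hvd.broadcast_parameters(model.state_dict(), root_rank=0)
    hvd.broadcast_optimizer_state(optimizer, root_rank=0)

Launched with, for example, horovodrun -np 4 python train.py, one such process runs per GPU, whether the GPUs are local or spread across machines.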


A Comprehensive Guide to Ensemble Learning (with Python codes) - Analytics Vidhya

#artificialintelligence

When you want to purchase a new car, will you walk up to the first car shop and buy one based on the advice of the dealer? You would more likely browse a few web portals where people have posted their reviews and compare different car models, checking their features and prices. You will also probably ask your friends and colleagues for their opinion. In short, you wouldn't reach a conclusion directly, but would instead make a decision by considering the opinions of other people as well. Ensemble models in machine learning operate on a similar idea: they combine the decisions from multiple models to improve the overall performance.
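As a minimal sketch of this idea (not the guide's own code), a voting ensemble in scikit-learn combines several base models' "opinions" by majority vote:

    from sklearn.datasets import load_iris
    from sklearn.ensemble import RandomForestClassifier, VotingClassifier
    from sklearn.linear_model import LogisticRegression
    from sklearn.model_selection import train_test_split
    from sklearn.tree import DecisionTreeClassifier

    X, y = load_iris(return_X_y=True)
    X_train, X_test, y_train, y_test = train_test_split(X, y, random_state=0)

    # Each base model casts a "vote"; the ensemble takes the majority.
    ensemble = VotingClassifier(
        estimators=[
            ("lr", LogisticRegression(max_iter=1000)),
            ("tree", DecisionTreeClassifier()),
            ("rf", RandomForestClassifier(n_estimators=100)),
        ],
        voting="hard",
    )
    ensemble.fit(X_train, y_train)
    print("test accuracy:", ensemble.score(X_test, y_test))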


Deep Learning Book: Chapter 8 – Optimization For Training Deep Models, Part I

#artificialintelligence

Although empirical risk minimization (ERM) might look relatively similar to ordinary optimization, there are two main problems. Firstly, ERM is prone to overfitting, since a high-capacity model (one able to learn extremely complex functions) can simply memorize the dataset. Secondly, ERM might not be feasible: most optimization algorithms are now based on Gradient Descent (GD), which requires a derivative calculation and hence may not work with certain loss functions, such as the 0–1 loss, which is not differentiable. This is why we minimize a surrogate loss function (SLF) instead. Using an SLF might even turn out to be beneficial, as you can keep obtaining a better test error by pushing the classes even further apart to get a more reliable classifier.
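To make the surrogate-loss idea concrete (a standard textbook example, not taken from the chapter summary itself): for a label y in {-1, +1} and a model score f(x), the 0–1 loss and the log-loss surrogate can be written as

    L_{0\text{-}1}(y, f(x)) = \mathbf{1}\left[y\, f(x) \le 0\right],
    \qquad
    L_{\text{log}}(y, f(x)) = \log\left(1 + e^{-y f(x)}\right)

The log loss is differentiable everywhere, so GD applies, and it keeps decreasing as y f(x) grows even after the training 0–1 loss reaches zero, which is exactly what lets the classifier keep pushing the classes further apart.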


Policy Gradients in a Nutshell – Towards Data Science

#artificialintelligence

Reinforcement Learning (RL) refers to both a learning problem and a sub-field of machine learning, one that has lately been in the news for great reasons. RL-based systems have now beaten world champions at Go, helped operate datacenters more efficiently, and mastered a wide variety of Atari games, and the research community is seeing many more promising results. With that motivation, let us now take a look at the Reinforcement Learning problem. Reinforcement Learning is the most general description of the learning problem, where the aim is to maximize a long-term objective.
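For concreteness (standard RL notation, not quoted from the article), the long-term objective is usually the expected discounted return collected by a policy \pi_\theta, and policy-gradient methods such as REINFORCE ascend its gradient directly:

    J(\theta) = \mathbb{E}_{\tau \sim \pi_\theta}\left[\sum_{t \ge 0} \gamma^t r_t\right],
    \qquad
    \nabla_\theta J(\theta) = \mathbb{E}_{\tau \sim \pi_\theta}\left[\sum_{t \ge 0}
        \nabla_\theta \log \pi_\theta(a_t \mid s_t)\, R_t\right]

where \gamma is the discount factor and R_t is the return collected from step t onward.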


Data-driven model for the identification of the rock type at a drilling bit

arXiv.org Machine Learning

In order to bridge the gap of more than 15 m between the drilling bit and high-fidelity rock type sensors during directional drilling, we present a novel approach for identifying the rock type at the drilling bit. The approach is based on applying machine learning techniques to Measurements While Drilling (MWD) data. We demonstrate the capabilities of the developed approach for distinguishing between the rock types corresponding to (1) a target oil-bearing interval of a reservoir and (2) a non-productive shale layer, and compare it to more traditional physics-driven approaches. The dataset includes MWD data and lithology mapping along multiple wellbores, obtained by processing Logging While Drilling (LWD) measurements from a massive drilling effort on one of the major newly developed oilfields in the north of Western Siberia. We compare various machine-learning algorithms, examine extra features coming from physical modeling of drilling mechanics, and show that the classification error can be reduced from 13.5% to 9%.
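A hedged sketch of the kind of pipeline the abstract describes (the feature names, file name, and model choice below are illustrative placeholders, not the authors' exact setup):

    import pandas as pd
    from sklearn.ensemble import GradientBoostingClassifier
    from sklearn.model_selection import cross_val_score

    # Hypothetical MWD feature table: one row per depth interval of a wellbore.
    mwd = pd.read_csv("mwd_features.csv")   # placeholder file name
    features = ["weight_on_bit", "rotary_speed", "torque", "rate_of_penetration"]
    X = mwd[features]
    y = mwd["is_reservoir"]   # 1 = oil-bearing interval, 0 = shale layer

    # Binary rock-type classifier; the paper compares several such algorithms.
    clf = GradientBoostingClassifier()
    scores = cross_val_score(clf, X, y, cv=5)
    print("mean CV accuracy:", scores.mean())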