Regularization techniques for Neural Networks

#artificialintelligence

In our last post, we learned about feedforward neural networks and how to design them. In this post, we will learn how to tackle one of the most central problems in machine learning: how to make our algorithm fit not only the training set but also the test set well. When an algorithm performs well on the training set but poorly on the test set, it is said to have overfit the training data. After all, our main goal is to perform well on never-before-seen data, i.e. to reduce overfitting. To tackle this problem, we have to make our model generalize beyond the training data, which is done using the various regularization techniques we will learn about in this post.
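
To give a flavor of the techniques the post covers, here is a minimal sketch of two common regularizers, L2 weight decay and dropout, in Keras. The layer sizes, regularization factor, and dropout rate are illustrative assumptions, not values from the article.

```python
# Minimal sketch: L2 weight decay and dropout in a small Keras model.
# Layer sizes, the L2 factor, and the dropout rate are illustrative assumptions.
import tensorflow as tf
from tensorflow.keras import layers, regularizers

model = tf.keras.Sequential([
    layers.Dense(64, activation="relu",
                 kernel_regularizer=regularizers.l2(1e-4),  # penalize large weights
                 input_shape=(20,)),
    layers.Dropout(0.5),  # randomly drop half the activations during training
    layers.Dense(1, activation="sigmoid"),
])
model.compile(optimizer="adam", loss="binary_crossentropy", metrics=["accuracy"])
```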


Implementing Gradient Descent using NumPy matrix multiplication

#artificialintelligence

In the previous section, I taught you about the log-loss function. There are many other error functions used for neural networks. Let me teach you another one, called the mean squared error. As the name says, this one is the mean of the squares of the differences between the predictions and the labels. In the following section I'll go over it in detail, then we'll implement backpropagation with it on the same student admissions dataset. And as a bonus, we'll be implementing this in a very efficient way using matrix multiplication with NumPy! We want to find the weights for our neural network.
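
To make the idea concrete, here is a minimal sketch of gradient descent for a single-layer network trained with mean squared error, written with NumPy matrix multiplication. The tiny synthetic data and learning rate are assumptions for illustration, not the student admissions dataset from the post.

```python
import numpy as np

# Minimal sketch: gradient descent with mean squared error for a
# single-layer network, using NumPy matrix multiplication.
# The synthetic data and learning rate are illustrative assumptions.
rng = np.random.default_rng(0)
X = rng.normal(size=(100, 3))          # 100 samples, 3 features
y = rng.normal(size=(100, 1))          # targets
W = np.zeros((3, 1))                   # weights to learn
lr = 0.1

for _ in range(1000):
    preds = X @ W                      # predictions for all samples at once
    error = preds - y
    grad = 2 * X.T @ error / len(X)    # gradient of the MSE w.r.t. W
    W -= lr * grad                     # gradient-descent update

mse = np.mean((X @ W - y) ** 2)        # final mean squared error
print(mse)
```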


Comparing Different Classification Machine Learning Models for an imbalanced dataset

#artificialintelligence

A data set is called imbalanced if it contains many more samples from one class than from the rest of the classes. Data sets are imbalanced when at least one class is represented by only a small number of training examples (called the minority class) while the other classes make up the majority. In this scenario, classifiers can have good accuracy on the majority class but very poor accuracy on the minority class(es), because the larger majority class dominates training. A common example of such a dataset is credit card fraud detection, where data points labeled fraud (1) are usually far fewer than those labeled non-fraud (0). There are many reasons why a dataset might be imbalanced: the category one is targeting might be very rare in the population, or the data might simply be difficult to collect. Let's solve the problem of an imbalanced dataset by working on one such dataset.
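
As one simple starting point for such a comparison, here is a minimal sketch that trains a classifier with class weighting in scikit-learn and scores it with per-class metrics rather than plain accuracy. The synthetic data and the choice of logistic regression are illustrative assumptions, not the article's models.

```python
# Minimal sketch: class weighting and minority-aware metrics on an
# imbalanced binary problem with scikit-learn. The synthetic data and
# the choice of logistic regression are illustrative assumptions.
from sklearn.datasets import make_classification
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import train_test_split
from sklearn.metrics import classification_report

# 95% of samples in class 0, 5% in class 1 (the minority class)
X, y = make_classification(n_samples=5000, weights=[0.95, 0.05], random_state=0)
X_tr, X_te, y_tr, y_te = train_test_split(X, y, stratify=y, random_state=0)

# class_weight="balanced" reweights errors inversely to class frequency
clf = LogisticRegression(class_weight="balanced", max_iter=1000)
clf.fit(X_tr, y_tr)

# Report per-class precision/recall instead of a single accuracy number
print(classification_report(y_te, clf.predict(X_te)))
```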


Using RNNs for Machine Translation

#artificialintelligence

Stories are the essence of human culture. They contain information about our past and theories of the future. They allow us to delve into the inner workings and subtleties of the human mind, discovering aspects of it that are impossible to analyze traditionally. While stories have always been written by our species in the past, with the growth of research in deep learning we now see computer programs able to write stories and use language as humans do, but how? When you were reading the last two paragraphs, you based your understanding of every word on the previous sentences and words.
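
That carried-over context is exactly what a recurrent network models: each step keeps a hidden state that summarizes what came before. A minimal NumPy sketch of one vanilla recurrent step follows; the dimensions and random weights are illustrative assumptions.

```python
import numpy as np

# Minimal sketch of a vanilla RNN step: the hidden state h carries context
# from previous words into the current one. Dimensions and random weights
# are illustrative assumptions.
rng = np.random.default_rng(0)
hidden_size, embed_size = 16, 8
W_xh = rng.normal(scale=0.1, size=(hidden_size, embed_size))
W_hh = rng.normal(scale=0.1, size=(hidden_size, hidden_size))

def rnn_step(x_t, h_prev):
    """Combine the current word vector with the previous hidden state."""
    return np.tanh(W_xh @ x_t + W_hh @ h_prev)

h = np.zeros(hidden_size)
for x_t in rng.normal(size=(5, embed_size)):  # a 5-word "sentence"
    h = rnn_step(x_t, h)                      # context accumulates in h
```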


It's Only Natural: An Excessively Deep Dive Into Natural Gradient Optimization

#artificialintelligence

I'm going to tell a story: one you've almost certainly heard before, but with a different emphasis than you're used to. To a first (order) approximation, all modern deep learning models are trained using gradient descent. At each step of gradient descent, your parameter values begin at some starting point, and you move them in the direction of greatest loss reduction. You do this by taking the derivative of your loss with respect to your whole vector of parameters, otherwise called the Jacobian (for a scalar loss, this is simply the gradient). However, this is just the first derivative of your loss, and it doesn't tell you anything about curvature, that is, how quickly your first derivative is changing.
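
For reference, these are the standard update rules the article is contrasting, written in the usual textbook notation (with $\eta$ a step size, $L$ the loss, and $F$ the Fisher information matrix); this is a summary of the standard formulation, not a quote from the article.

$$\theta_{t+1} = \theta_t - \eta\,\nabla_\theta L(\theta_t) \qquad \text{(plain gradient descent)}$$

$$\theta_{t+1} = \theta_t - \eta\,F(\theta_t)^{-1}\nabla_\theta L(\theta_t), \qquad F(\theta) = \mathbb{E}\!\left[\nabla_\theta \log p(x;\theta)\,\nabla_\theta \log p(x;\theta)^{\top}\right] \qquad \text{(natural gradient descent)}$$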


Learning from Small Data

#artificialintelligence

What traditional CNN architectures need is a large number of parameters and a huge dataset; if we supply only a small amount of data to a model with a large number of parameters, the model will fail to learn generalizable patterns and will easily overfit. So we can't simply take most of the state-of-the-art architectures, which have millions of parameters and are trained on large datasets such as ImageNet, and train them on a small sample of data. If we force such models to train on small data, we may have to tackle challenges such as vanishing and exploding gradients, and with a deeper architecture the model may also learn highly irrelevant or unnecessary features (noise and unstructured patterns).
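
One common workaround, not necessarily the one this article settles on, is to reuse a network pretrained on a large dataset and train only a small head on top. A minimal sketch with torchvision follows; the number of classes and the optimizer settings are illustrative assumptions.

```python
# Minimal sketch of one common small-data workaround: freeze a network
# pretrained on ImageNet and train only a small classification head.
# This is a generic illustration, not necessarily the article's approach;
# the number of classes and optimizer settings are assumptions.
import torch
import torch.nn as nn
from torchvision import models

model = models.resnet18(pretrained=True)   # features learned on ImageNet
for param in model.parameters():
    param.requires_grad = False            # freeze the pretrained backbone

num_classes = 5                            # assumed label count of the small dataset
model.fc = nn.Linear(model.fc.in_features, num_classes)  # new trainable head

optimizer = torch.optim.Adam(model.fc.parameters(), lr=1e-3)
```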


Deep Learning based Edge Detection in OpenCV - CV-Tricks.com

#artificialintelligence

In this post, we will learn how to use deep learning based edge detection in OpenCV, which is more accurate than the widely popular Canny edge detector. Edge detection is useful in many use cases such as visual saliency detection, object detection, tracking and motion analysis, structure from motion, 3D reconstruction, autonomous driving, image-to-text analysis and many more. Edge detection is a very old problem in computer vision which involves detecting the edges in an image to determine object boundaries and thus separate the objects of interest. One of the most popular techniques for edge detection has been Canny edge detection, which has been the go-to method for most computer vision researchers and practitioners. Let's have a quick look at Canny edge detection.
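
As a refresher before the deep-learning approach, the classical Canny detector is a couple of calls in OpenCV. In this minimal sketch, the input file name and the two hysteresis thresholds are illustrative assumptions.

```python
# Minimal sketch: classical Canny edge detection in OpenCV.
# The input file name and the hysteresis thresholds are illustrative assumptions.
import cv2

img = cv2.imread("input.jpg", cv2.IMREAD_GRAYSCALE)  # Canny works on a single channel
blurred = cv2.GaussianBlur(img, (5, 5), 0)            # smooth to suppress noise
edges = cv2.Canny(blurred, 100, 200)                  # low/high hysteresis thresholds
cv2.imwrite("edges.jpg", edges)
```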


Introducing TensorFlow Privacy: Learning with Differential Privacy for Training Data

#artificialintelligence

Today, we're excited to announce TensorFlow Privacy (GitHub), an open source library that makes it easier not only for developers to train machine-learning models with privacy, but also for researchers to advance the state of the art in machine learning with strong privacy guarantees. Modern machine learning is increasingly applied to create amazing new technologies and user experiences, many of which involve training machines to learn responsibly from sensitive data, such as personal photos or email. Ideally, the parameters of trained machine-learning models should encode general patterns rather than facts about specific training examples. To ensure this, and to give strong privacy guarantees when the training data is sensitive, it is possible to use techniques based on the theory of differential privacy. In particular, when training on users' data, those techniques offer strong mathematical guarantees that models do not learn or remember the details about any specific user.
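
The core mechanism behind such guarantees is differentially private SGD: clip each example's gradient and add calibrated Gaussian noise before the update. Below is a minimal NumPy sketch of that idea; the clipping norm, noise multiplier, and toy gradient function are illustrative assumptions, not the TensorFlow Privacy API.

```python
import numpy as np

# Minimal sketch of the DP-SGD idea behind libraries like TensorFlow Privacy:
# clip each per-example gradient, then add Gaussian noise to the averaged update.
# The clipping norm, noise multiplier, and toy gradient are illustrative assumptions.
rng = np.random.default_rng(0)
l2_norm_clip = 1.0
noise_multiplier = 1.1
lr = 0.1

def per_example_grad(w, x, y):
    # Toy gradient for a linear model with squared error on one example.
    return 2 * (w @ x - y) * x

w = np.zeros(3)
X = rng.normal(size=(32, 3))
y = rng.normal(size=32)

grads = np.array([per_example_grad(w, xi, yi) for xi, yi in zip(X, y)])
norms = np.linalg.norm(grads, axis=1, keepdims=True)
clipped = grads / np.maximum(1.0, norms / l2_norm_clip)   # bound each example's influence
noise = rng.normal(scale=noise_multiplier * l2_norm_clip, size=w.shape)
w -= lr * (clipped.sum(axis=0) + noise) / len(X)          # noisy averaged update
```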


Advances in Generative Adversarial Networks – BeyondMinds – Medium

#artificialintelligence

Generative Adversarial Networks are a powerful class of neural networks with remarkable applications. They essentially consist of a system of two neural networks -- the Generator and the Discriminator -- dueling each other. Given a set of target samples, the Generator tries to produce samples that can fool the Discriminator into believing they are real. The Discriminator tries to distinguish real (target) samples from fake (generated) samples. Using this iterative training approach, we eventually end up with a Generator that is really good at generating samples similar to the target samples. GANs have a plethora of applications, as they can learn to mimic data distributions of almost any kind.
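
The adversarial setup boils down to two optimizers pulling on two losses. Here is a compact PyTorch sketch of a single training iteration; the network sizes, stand-in data, and hyperparameters are illustrative assumptions.

```python
# Compact sketch of one GAN training iteration in PyTorch. Network sizes,
# data, and hyperparameters are illustrative assumptions.
import torch
import torch.nn as nn

G = nn.Sequential(nn.Linear(16, 32), nn.ReLU(), nn.Linear(32, 2))   # noise -> fake sample
D = nn.Sequential(nn.Linear(2, 32), nn.ReLU(), nn.Linear(32, 1))    # sample -> realness logit
opt_g = torch.optim.Adam(G.parameters(), lr=2e-4)
opt_d = torch.optim.Adam(D.parameters(), lr=2e-4)
bce = nn.BCEWithLogitsLoss()

real = torch.randn(64, 2) + 3.0          # stand-in "target" distribution
noise = torch.randn(64, 16)

# Discriminator step: label real samples 1 and generated samples 0.
fake = G(noise).detach()
d_loss = bce(D(real), torch.ones(64, 1)) + bce(D(fake), torch.zeros(64, 1))
opt_d.zero_grad()
d_loss.backward()
opt_d.step()

# Generator step: try to make the Discriminator output 1 on fakes.
g_loss = bce(D(G(noise)), torch.ones(64, 1))
opt_g.zero_grad()
g_loss.backward()
opt_g.step()
```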


Understanding Neural Networks: What, How and Why? – Towards Data Science

#artificialintelligence

Neural networks are among the most powerful and widely used algorithms in the subfield of machine learning called deep learning. At first glance, neural networks may seem like a black box: an input layer feeds the data into the "hidden layers", and after a magic trick we can see the information provided by the output layer. However, understanding what the hidden layers are doing is the key step to neural network implementation and optimization. In our path to understanding neural networks, we are going to answer three questions: What, How and Why? The neural networks that we are going to consider are, strictly speaking, called artificial neural networks, and as the name suggests, are based on what science knows about the human brain's structure and function.
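
Mechanically, a hidden layer is just a weighted sum followed by a nonlinearity. Here is a minimal NumPy forward pass through one hidden layer; the layer sizes and random weights are illustrative assumptions.

```python
import numpy as np

# Minimal sketch of a forward pass: input -> hidden layer -> output.
# Layer sizes and random weights are illustrative assumptions.
rng = np.random.default_rng(0)
x = rng.normal(size=4)                           # input layer: 4 features
W1, b1 = rng.normal(size=(5, 4)), np.zeros(5)    # hidden layer: 5 units
W2, b2 = rng.normal(size=(1, 5)), np.zeros(1)    # output layer: 1 unit

hidden = np.maximum(0, W1 @ x + b1)              # weighted sum + ReLU nonlinearity
output = 1 / (1 + np.exp(-(W2 @ hidden + b2)))   # sigmoid output
print(output)
```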