Policy Optimizations: TRPO/PPO

#artificialintelligence

In this post, I will discuss policy optimization methods from the papers Trust Region Policy Optimization (Schulman et al. 2015) and Proximal Policy Optimization Algorithms (Schulman et al. 2017). I will briefly go over the Trust Region Policy Optimization method and two variants of Proximal Policy Optimization: an adaptive KL (Kullback-Leibler) penalty on the surrogate objective, and a clipped surrogate objective. In a traditional policy gradient method, we sample trajectories of states, actions, and rewards, then update the policy using those samples. While this approach solves basic control problems, the algorithm tends to be unstable and inconsistent in solving an environment. One problem is that as we update the policy, the distribution of inputs and outputs of the approximated policy changes, resulting in instability.
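The clipped surrogate objective mentioned above can be sketched in a few lines. This is a minimal NumPy illustration of the clipping idea from the PPO paper, not a full training loop; the function name and batch values are made up for the example:

```python
import numpy as np

def ppo_clip_objective(ratio, advantage, epsilon=0.2):
    """Clipped surrogate objective from PPO (Schulman et al. 2017).

    ratio: pi_new(a|s) / pi_old(a|s) for a batch of sampled actions
    advantage: estimated advantages for those actions
    epsilon: clip range (0.2 is the paper's default)
    """
    unclipped = ratio * advantage
    # Clipping the ratio to [1 - eps, 1 + eps] removes the incentive
    # to move the new policy far from the old one
    clipped = np.clip(ratio, 1.0 - epsilon, 1.0 + epsilon) * advantage
    # Elementwise minimum gives a pessimistic bound; average over the batch
    return np.minimum(unclipped, clipped).mean()

# With a positive advantage, a ratio of 1.5 is clipped down to 1.2
print(ppo_clip_objective(np.array([1.5, 0.9]), np.array([1.0, 1.0])))
```

In a real implementation this objective would be maximized with respect to the policy parameters via automatic differentiation; the clipping is what keeps each update inside a small trust region around the old policy.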


Demystified: Wasserstein GANs (WGAN)

#artificialintelligence

In this article we will read about Wasserstein GANs. Specifically, we will focus on the following: i) What is the Wasserstein distance? ii) Why use it? The Wasserstein distance (also known as the Earth Mover's distance) is a distance metric between two probability distributions on a given metric space. Intuitively, it can be seen as the minimum work needed to transform one distribution into another, where work is defined as the product of the mass of the distribution that has to be moved and the distance it must be moved. The Wasserstein distance is then the cost of the optimal transport plan.
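The "mass times distance" intuition is easiest to see in one dimension, where the optimal transport plan for two equal-size samples simply matches sorted points. A small sketch of this special case (the function name is my own; for general use, `scipy.stats.wasserstein_distance` computes this):

```python
import numpy as np

def wasserstein_1d(u_samples, v_samples):
    """W1 distance between two equal-size 1-D empirical distributions.

    For sorted equal-size samples, the optimal transport plan matches
    the i-th smallest point of one sample to the i-th smallest of the
    other, so W1 reduces to the mean absolute difference of sorted values.
    """
    u = np.sort(np.asarray(u_samples, dtype=float))
    v = np.sort(np.asarray(v_samples, dtype=float))
    return np.abs(u - v).mean()

# Each unit of mass moves a distance of 1, so the cost is 1.0
print(wasserstein_1d([0, 1, 2], [1, 2, 3]))
```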


Artificially Intelligent

#artificialintelligence

My kids are smart; well, at least they think they are. Sure, they are learning things and being exposed to more, at a younger age, than any previous generation. With information swirling all around them and technology providing a conduit that never leaves an inquiring mind bored, their database-driven brains are constantly stockpiling facts and stats. Notice that I chose my words very carefully above: my kids are smart, not yet intelligent. Reciting facts and miscellaneous statistics is useful, but facts are just the building blocks of true intelligence.


Going deeper into Deep Learning

#artificialintelligence

I'd love to thank my friends who gave me permission to use their handsome faces in the name of artificial intelligence science! We can definitely tell that this fine gentleman has brown eyes. On the other hand, the model is pretty certain that this individual has blue eyes, with a probability greater than 90%. We have a correct prediction but a not-very-confident probability of 69% (and that's no coincidence). Finally, we try it on me… not so handsome, and not so great prediction confidence.


Beginners Guide to Boltzmann Machine

#artificialintelligence

Deep learning implements structured machine learning algorithms by making use of artificial neural networks. These algorithms help the machine learn on its own and develop the ability to establish new parameters that help it make and execute decisions. Deep learning is considered a subset of machine learning and uses multi-layered artificial neural networks to carry out its processes, which enables it to deliver high accuracy in tasks such as speech recognition, object detection, and language translation, among other modern use cases. One of the most intriguing implementations in the domain of artificial intelligence for creating deep learning models has been the Boltzmann Machine. In this article, we will try to understand what exactly a Boltzmann Machine is, how it can be implemented, and its uses.
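At its core, a Boltzmann Machine assigns an energy to each configuration of its units, with lower-energy configurations being more probable. A minimal sketch of the energy function for a restricted Boltzmann machine (the weight values and network sizes here are made-up toy numbers):

```python
import numpy as np

def rbm_energy(v, h, W, b, c):
    """Energy of a restricted Boltzmann machine configuration:

        E(v, h) = -b.v - c.h - v.W.h

    v: visible units, h: hidden units, W: visible-to-hidden weights,
    b, c: visible and hidden biases. Lower energy means the (v, h)
    pair is more probable under the model.
    """
    return -(b @ v) - (c @ h) - (v @ W @ h)

# Hypothetical tiny RBM: 3 visible units, 2 hidden units
rng = np.random.default_rng(0)
W = rng.normal(size=(3, 2))
b = np.zeros(3)
c = np.zeros(2)
v = np.array([1.0, 0.0, 1.0])
h = np.array([0.0, 1.0])
print(rbm_energy(v, h, W, b, c))
```

Training adjusts W, b, and c so that configurations resembling the training data end up with low energy; the all-zero configuration always has zero energy when the biases are zero.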


Complete Step-by-Step Gradient Descent Algorithm from Scratch

#artificialintelligence

If you've been studying machine learning long enough, you've probably heard terms such as SGD or Adam. They are two of many optimization algorithms. Optimization algorithms are the heart of machine learning: they are responsible for the intricate work that lets models learn from data. It turns out that optimization has been around for a long time, even outside the machine learning realm. Investors, for example, seek to create portfolios that avoid excessive risk while achieving a high rate of return.
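The core of gradient descent fits in a few lines: repeatedly step in the direction opposite the gradient until the iterate settles near a minimum. A minimal sketch (the function name, learning rate, and example objective are my own choices for illustration):

```python
import numpy as np

def gradient_descent(grad, x0, lr=0.1, steps=100):
    """Minimal gradient descent: repeatedly step against the gradient.

    grad: function returning the gradient at a point
    x0: starting point, lr: learning rate, steps: number of iterations
    """
    x = np.asarray(x0, dtype=float)
    for _ in range(steps):
        # Move opposite the gradient, scaled by the learning rate
        x = x - lr * grad(x)
    return x

# Minimize f(x) = (x - 3)^2, whose gradient is 2 * (x - 3)
x_min = gradient_descent(lambda x: 2 * (x - 3), x0=0.0)
print(x_min)  # converges toward the minimizer at x = 3
```

SGD and Adam build on this same loop: SGD estimates the gradient from mini-batches of data, and Adam additionally adapts the step size per parameter using running moment estimates.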


Dissecting My Confusion: SVM

#artificialintelligence

I don't know about you, but I personally find that in science, terminology involving colloquialisms doesn't really make things clearer; rather, it causes confusion. Think of examples from biology, such as knock-out, knock-down, knock-in, and junk DNA. You can find many similar examples if you're patient enough. This happens even more abundantly in Computer Science and related fields, and what I would like to focus on in this article is the concept of SVM: to play with it, examine the confusion it has caused me, and hopefully use that as a means of clarifying the concept. First of all, SVM is an absolutely confusing term. It is really hard for me to believe that the word "machine" was used to describe this algorithm.


12 popular Google Doodle games you can still play

PCWorld

A few times each year, the Google Doodle team cranks out some high-quality games to celebrate anniversaries and special events, or to raise awareness of ongoing issues. Usually these games are meant to last anywhere from 2 to 20 minutes, though some last an hour or more. In no particular order, here are some of the best Google Doodle games you can still play. Pac-Man is a timeless classic, so it only makes sense to kick off our list with this world-famous 1980s arcade game. On May 21, 2010, Google released this fun Doodle to celebrate Pac-Man's 30th anniversary.


Data science is a team sport: How to choose the right players

#artificialintelligence

Building deep and ongoing data science capabilities isn't an easy process: it takes the right people, processes, and technology. Finding the right people for the right roles, as employers and job seekers alike can attest, is an ongoing challenge. In this special feature, ZDNet examines how advances in AI, visualization, and cloud technology are shaping modern data analytics, and how businesses are addressing data governance and a potential data science skills gap. "The people part is probably the least well-understood aspect of this entire equation," John Thompson, global head of advanced analytics & AI at CSL Behring, said during a virtual panel discussion on Thursday. As the head of analytics at one of the leading international biotechnology companies, Thompson oversees data science teams that tackle a wide range of initiatives.