Reinforcement Learning w/ Keras OpenAI: Actor-Critic Models

#artificialintelligence

Last time in our Keras/OpenAI tutorial, we discussed a very fundamental algorithm in reinforcement learning: the DQN. The Deep Q-Network is actually a fairly recent development that arrived on the scene only a couple of years back, so it is quite incredible if you were able to understand and implement this algorithm having just gotten a start in the field. As with the original post, let's take a quick moment to appreciate how incredible the results we achieved are: in a continuous output space scenario and starting with absolutely no knowledge of what "winning" entails, we were able to explore our environment and "complete" the trials. Put yourself in the situation of this simulation. This would essentially be like asking you to play a game without a rulebook or specific end goal, and demanding that you continue to play until you win (almost seems a bit cruel).


Elon Musk's Research Venture Has Trained AI To Teach Itself

#artificialintelligence

As part of its effort to find better ways to develop and train "safe artificial general intelligence," OpenAI has been releasing its own versions of reinforcement learning algorithms. They call these OpenAI Baselines, and the most recent additions to these algorithms are two baselines that are meant to enhance machine learning performance by making it more efficient. The first is a baseline implementation called Actor Critic using Kronecker-factored Trust Region (ACKTR). Developed by researchers from the University of Toronto (UofT) and New York University (NYU), ACKTR improves on the way AI policies perform deep reinforcement learning -- learning that is accomplished only by trial and error, and obtained only through raw observation. In a paper published online, the UofT and NYU researchers used simulated robots and Atari games to test how ACKTR learns control policies.


Nonlinear Computation in Deep Linear Networks

#artificialintelligence

We've shown that deep linear networks -- as implemented using floating-point arithmetic -- are not actually linear and can perform nonlinear computation. We used evolution strategies to find parameters in linear networks that exploit this trait, letting us solve non-trivial problems. Neural networks consist of stacks of a linear layer followed by a nonlinearity like tanh or a rectified linear unit. Without the nonlinearity, consecutive linear layers would, in theory, be mathematically equivalent to a single linear layer. So it's a surprise that floating-point arithmetic is nonlinear enough to yield trainable deep networks. Numbers used by computers aren't perfect mathematical objects, but approximate representations using finite numbers of bits.
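
The two claims above can be illustrated with a minimal sketch. It is not taken from the post: the matrices, the helper names (`matvec`, `matmul`, `scaled_identity`) and the choice of Python's 64-bit floats with underflow as the source of nonlinearity are all assumptions made for illustration. The first part checks that two stacked linear layers give the same output as their single-layer product; the second shows a map that is the identity in exact arithmetic but violates f(c * x) = c * f(x) once an intermediate value underflows to zero.

```python
# Part 1: two stacked linear layers collapse into one (toy matrices, assumed).
def matvec(W, x):
    """Multiply matrix W (list of rows) by vector x."""
    return [sum(w * xi for w, xi in zip(row, x)) for row in W]

def matmul(A, B):
    """Multiply matrix A by matrix B (both lists of rows)."""
    cols = list(zip(*B))
    return [[sum(a * b for a, b in zip(row, col)) for col in cols] for row in A]

W1 = [[1.0, 2.0], [0.0, -1.0], [3.0, 1.0]]   # first layer, 3x2
W2 = [[2.0, 0.0, 1.0], [-1.0, 1.0, 0.0]]     # second layer, 2x3
x = [0.5, -2.0]

stacked = matvec(W2, matvec(W1, x))          # layer by layer
collapsed = matvec(matmul(W2, W1), x)        # single equivalent layer

# Part 2: a "linear" map that is not linear in floating point.
# In exact arithmetic (x * s) * (1 / s) == x for every x.
SCALE = 2.0 ** -1000

def scaled_identity(x):
    """Scale down, then back up; tiny inputs underflow to 0.0 in between."""
    return (x * SCALE) * (2.0 ** 1000)

c = 2.0 ** -100
homogeneous = scaled_identity(c * 1.0) == c * scaled_identity(1.0)

print(stacked, collapsed)      # the two outputs coincide
print(scaled_identity(1.0))    # survives the round trip
print(scaled_identity(c))      # lost to underflow
print(homogeneous)             # homogeneity fails
```

The inputs here were picked so that every intermediate value in the first part is exactly representable, which is why the comparison can use plain equality; with arbitrary weights a tolerance would be needed.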


Attacking Machine Learning with Adversarial Examples

#artificialintelligence

Adversarial examples are inputs to machine learning models that an attacker has intentionally designed to cause the model to make a mistake; they're like optical illusions for machines. In this post we'll show how adversarial examples work across different mediums, and will discuss why securing systems against them can be difficult. At OpenAI, we think adversarial examples are a good aspect of security to work on because they represent a concrete problem in AI safety that can be addressed in the short term, and because fixing them is difficult enough that it requires a serious research effort. To get an idea of what adversarial examples look like, consider this demonstration from Explaining and Harnessing Adversarial Examples: starting with an image of a panda, the attacker adds a small perturbation that has been calculated to make the image be recognized as a gibbon with high confidence. An adversarial input, overlaid on a typical image, can cause a classifier to miscategorize a panda as a gibbon.
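
As a rough illustration of how a small, calculated perturbation can flip a prediction, the sketch below applies the fast gradient sign method (the attack introduced in Explaining and Harnessing Adversarial Examples) to a toy logistic-regression classifier. The weights, the input and the step size `epsilon` are made-up values, and the classifier is a hypothetical stand-in, not the image model from the panda demonstration.

```python
import math

def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))

def predict(weights, bias, x):
    """Probability that x belongs to class 1 under logistic regression."""
    return sigmoid(sum(w * xi for w, xi in zip(weights, x)) + bias)

def sign(g):
    return (g > 0) - (g < 0)

def fgsm(weights, bias, x, label, epsilon):
    """Fast gradient sign method for a logistic-regression model.

    For cross-entropy loss the gradient with respect to the input is
    (p - label) * weights, so each feature moves by +/- epsilon in the
    direction that increases the loss.
    """
    p = predict(weights, bias, x)
    grad = [(p - label) * w for w in weights]
    return [xi + epsilon * sign(g) for xi, g in zip(x, grad)]

# Toy model and input (assumed values, chosen for illustration).
weights = [2.0, -3.0, 1.0]
bias = 0.0
x = [0.2, -0.1, 0.1]

x_adv = fgsm(weights, bias, x, label=1, epsilon=0.2)

p_clean = predict(weights, bias, x)      # above 0.5: classified as class 1
p_adv = predict(weights, bias, x_adv)    # below 0.5: classification flipped
max_change = max(abs(a - b) for a, b in zip(x_adv, x))

print(p_clean, p_adv, max_change)
```

No feature changes by more than `epsilon`, yet the predicted class flips, which is the same effect the panda and gibbon example shows at image scale.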


tensorflow/agents

@machinelearnbot

This project provides optimized infrastructure for reinforcement learning. It extends the OpenAI gym interface to multiple parallel environments and allows agents to be implemented in TensorFlow and perform batched computation. As a starting point, we provide BatchPPO, an optimized implementation of Proximal Policy Optimization. The algorithm to use is defined in the configuration, and the pendulum configuration uses the included PPO implementation. Check out more pre-defined configurations in agents/scripts/configs.py.
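
The idea of extending a gym-style interface to several environments at once can be sketched in plain Python. This is an assumption-laden toy, not the tensorflow/agents API: `CounterEnv` and `ToyBatchEnv` are hypothetical names, and the real project additionally runs environments in parallel and performs the batched computation in TensorFlow.

```python
class CounterEnv:
    """Hypothetical toy environment with a gym-style reset/step interface."""

    def __init__(self, horizon):
        self.horizon = horizon
        self.t = 0

    def reset(self):
        self.t = 0
        return self.t

    def step(self, action):
        self.t += 1
        reward = float(action)           # reward simply echoes the action
        done = self.t >= self.horizon    # episode ends after `horizon` steps
        return self.t, reward, done, {}


class ToyBatchEnv:
    """Steps several environments together, one action per environment."""

    def __init__(self, envs):
        self.envs = list(envs)

    def reset(self):
        return [env.reset() for env in self.envs]

    def step(self, actions):
        results = [env.step(a) for env, a in zip(self.envs, actions)]
        # Transpose per-environment tuples into per-field batches.
        observations, rewards, dones, infos = map(list, zip(*results))
        return observations, rewards, dones, infos


batch = ToyBatchEnv([CounterEnv(horizon=2), CounterEnv(horizon=3)])
first_obs = batch.reset()
obs, rewards, dones, _ = batch.step([1, 0])
obs, rewards, dones, _ = batch.step([0, 1])
print(first_obs, obs, rewards, dones)
```

An agent written against such a wrapper sees one batch of observations per step and returns one batch of actions, which is what makes batched computation possible.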


Musk warns 'it begins' as Putin claims the AI-leading nation rules the world - AI News

#artificialintelligence

Elon Musk has issued a warning as Russian president Vladimir Putin claims the nation which leads in AI "will become the ruler of the world." Musk, co-chairman of OpenAI, has long warned of dire consequences for mishandling AI development. OpenAI itself is a non-profit research company that aims to promote and develop friendly AI in a way that benefits humanity. As with any major technological advancement, however, there will undoubtedly be those who aim to weaponise it, and to do so before their rivals. Based on Putin's comments to Russia-based publication RT, it sounds as if the nation is among them.


How Open Source Machine Learning Is Accelerating Adoption - Disruption Hub

#artificialintelligence

As of last month Alphabet Inc.'s AI division, Google DeepMind, has open-sourced their new machine learning platform DeepMind Lab. Artificial Intelligence is the technology of the moment, constantly debated and attracting massive attention from investors. Despite warnings from influential figures including Professor Stephen Hawking, Google's decision to open up their software to other developers is part of a mass movement to advance the capabilities of AI. Facebook open sourced its own deep learning software last year, and Elon Musk's non-profit organisation OpenAI recently released Universe, an open software platform that can be used to train AI systems. So, why have Google, OpenAI and others made these platforms public, and how will this affect the adoption of Artificial Intelligence and machine learning as a whole?


Elon Musk's 'Dota 2' experiment is disrupting esports in a big way

#artificialintelligence

Elon Musk's artificial intelligence research company, OpenAI, is developing a self-learning bot for one of the most complex esports titles: 'Dota 2.' It has already become the ultimate challenge for players, but for top esports pros, it is also a major opportunity.



Elon Musk's Dota 2 AI beats the professionals at their own game

#artificialintelligence

Last week was the high point of the Dota 2 competitive year: it was the week of The International, Valve's biggest tournament. On Saturday, Team Liquid walked away with more than $10 million after defeating Newbee 3-0 in the grand final. Right now, one of the requirements to be a good Dota 2 player is that you've got to be a living, breathing human. The game does include some basic computer-controlled bots to practice against, but any seasoned player of the game should have no trouble prevailing over these bots, even on their hardest "Unfair" difficulty (though the Unfair Viper bot is a legendary jerk that's utterly miserable to play against). Last Friday, however, we got a hint of a new, altogether more threatening kind of computer-controlled player: an AI-controlled bot built by Elon Musk's OpenAI.