 Reinforcement Learning


Download Free Books - Programming, Computer Science and IT - Read Online

#artificialintelligence

With Hands-On Machine Learning for Algorithmic Trading, create your own algorithmic design process to apply probabilistic machine learning approaches to trading decisions. Develop neural networks for algorithmic trading to perform time series forecasting and smart analytics. With Hands-On Blockchain with Hyperledger, write your own chaincode/smart contracts using Golang on a Hyperledger network. With Mastering Blockchain – Second Edition, build powerful applications using Ethereum to secure transactions and create smart contracts. Explore cryptography, mine cryptocurrencies, and solve scalability issues with this comprehensive guide.


Science at Uber: Applying Artificial Intelligence at Uber

#artificialintelligence

At Uber, we take advanced research work and use it to solve real world problems. In our Science at Uber video series, Uber employees talk about how we apply data science, artificial intelligence, machine learning, and other innovative technologies in our daily work. Zoubin Ghahramani, Chief Scientist at Uber, understands that movement requires intelligence, and draws a parallel between biological and artificial systems. His organization, Uber AI, develops artificial intelligence to advance Uber's core business needs. Research into reinforcement learning, deep learning, probabilistic modeling, and evolutionary algorithms makes Uber's products work more efficiently.


AI Could Help Data Centers Run Far More Efficiently

#artificialintelligence

A system by researchers at the Massachusetts Institute of Technology learns how to allocate data processing operations across thousands of servers most efficiently. Massachusetts Institute of Technology (MIT) researchers have created a system that automatically learns how to optimally allocate data processing workloads across thousands of servers as a means of boosting data center efficiency. The Decima scheduler leverages reinforcement learning to make scheduling decisions for specific workloads in specific server clusters. Decima tests multiple incoming workload allocation strategies across the servers to find the best trade-off between the use of computational resources and fast processing speeds. Decima's completion speed is about 20% to 30% faster than the best handwritten scheduling algorithms, the researchers say.


How Artificial Intelligence Learns Through Machine Learning Algorithms

#artificialintelligence

Artificial intelligence (AI) and machine learning (ML) solutions are taking the enterprise sector by storm. With their capability to vastly optimize operations through smart automation, machine learning algorithms are now instrumental for many online services. Artificial intelligence solutions are being gradually adopted by enterprises as they are starting to see the benefits offered by the technology. However, there are a few pitfalls to its adoption. In business intelligence settings, AI is usually used for deriving insights from large amounts of user data.


Neural Policy Gradient Methods: Global Optimality and Rates of Convergence

arXiv.org Machine Learning

Policy gradient methods with actor-critic schemes demonstrate tremendous empirical successes, especially when the actors and critics are parameterized by neural networks. However, it remains less clear whether such "neural" policy gradient methods converge to globally optimal policies and whether they even converge at all. We answer both the questions affirmatively in the overparameterized regime. In detail, we prove that neural natural policy gradient converges to a globally optimal policy at a sublinear rate. Also, we show that neural vanilla policy gradient converges sublinearly to a stationary point. Meanwhile, by relating the suboptimality of the stationary points to the representation power of neural actor and critic classes, we prove the global optimality of all stationary points under mild regularity conditions. Particularly, we show that a key to the global optimality and convergence is the "compatibility" between the actor and critic, which is ensured by sharing neural architectures and random initializations across the actor and critic. To the best of our knowledge, our analysis establishes the first global optimality and convergence guarantees for neural policy gradient methods.
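
To make the distinction between vanilla and natural policy gradient concrete, here is a minimal sketch on a three-armed bandit with a tabular softmax policy. This is a deliberately simplified stand-in, not the neural actor-critic setting analyzed in the paper: the rewards, step size, and function names are assumptions chosen for illustration, and the gradients are computed exactly rather than estimated by a critic.

```python
import numpy as np

def softmax(theta):
    # Numerically stable softmax over arm preferences.
    z = np.exp(theta - theta.max())
    return z / z.sum()

def run_policy_gradient(rewards, steps, lr, natural):
    """Exact-gradient ascent on J(theta) = sum_k pi_k * r_k (hypothetical toy setup)."""
    rewards = np.asarray(rewards, dtype=float)
    theta = np.zeros_like(rewards)
    history = []
    for _ in range(steps):
        pi = softmax(theta)
        J = pi @ rewards
        advantage = rewards - J
        # Vanilla gradient: dJ/dtheta_k = pi_k * (r_k - J).
        # Natural gradient: preconditioning with the Fisher pseudo-inverse
        # removes the pi_k factor (up to a constant shift that softmax ignores).
        direction = advantage if natural else pi * advantage
        theta = theta + lr * direction
        history.append(J)
    return softmax(theta), np.array(history)

rewards = [0.1, 0.5, 0.9]
pi_vanilla, J_vanilla = run_policy_gradient(rewards, steps=200, lr=0.2, natural=False)
pi_natural, J_natural = run_policy_gradient(rewards, steps=200, lr=0.2, natural=True)
```

In this toy problem both variants end up concentrating probability on the best arm. The sketch says nothing about the convergence rates proved in the paper.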


Solving Discounted Stochastic Two-Player Games with Near-Optimal Time and Sample Complexity

arXiv.org Machine Learning

In this paper, we settle the sampling complexity of solving discounted two-player turn-based zero-sum stochastic games up to polylogarithmic factors. Given a stochastic game with discount factor $\gamma\in(0,1)$, we provide an algorithm that computes an $\epsilon$-optimal strategy with high probability given $\tilde{O}((1 - \gamma)^{-3} \epsilon^{-2})$ samples from the transition function for each state-action pair. Our algorithm runs in time nearly linear in the number of samples and uses space nearly linear in the number of state-action pairs. As stochastic games generalize Markov decision processes (MDPs), our runtime and sample complexities are optimal due to Azar et al. (2013). We achieve our results by showing how to generalize near-optimal Q-learning based algorithms for MDPs, in particular Sidford et al. (2018), to two-player strategy computation algorithms. This overcomes limitations of standard Q-learning and strategy iteration or alternating minimization based approaches and, we hope, will pave the way for future reinforcement learning results by facilitating the extension of MDP results to multi-agent settings with little loss.
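
The problem setting can be illustrated with a much simpler baseline than the paper's algorithm: draw a fixed number of samples per state-action pair from the transition function, build an empirical model, and run value iteration in which the maximizing player takes a max and the minimizing player takes a min. This is plain model-based value iteration, not the variance-reduced Q-learning method the abstract describes. The function name, the `owner` encoding, and the tiny two-state game are assumptions made for this sketch, and `samples_per_pair` is left as a free parameter rather than set from the $\tilde{O}((1 - \gamma)^{-3} \epsilon^{-2})$ bound.

```python
import numpy as np

def solve_turn_based_game(P, R, owner, gamma, samples_per_pair, iters, seed=0):
    """Sample-based value iteration for a turn-based zero-sum game (illustrative only).

    P: (S, A, S) true transition probabilities, used only as a sampler.
    R: (S, A) rewards paid to the max player.
    owner: (S,) array, +1 where the max player moves, -1 where the min player moves.
    """
    rng = np.random.default_rng(seed)
    S, A, _ = P.shape
    # Build an empirical transition model from samples for each state-action pair.
    P_hat = np.zeros((S, A, S))
    for s in range(S):
        for a in range(A):
            nxt = rng.choice(S, size=samples_per_pair, p=P[s, a])
            P_hat[s, a] = np.bincount(nxt, minlength=S) / samples_per_pair
    V = np.zeros(S)
    for _ in range(iters):
        Q = R + gamma * (P_hat @ V)  # shape (S, A)
        V = np.where(owner > 0, Q.max(axis=1), Q.min(axis=1))
    Q = R + gamma * (P_hat @ V)
    policy = np.where(owner > 0, Q.argmax(axis=1), Q.argmin(axis=1))
    return V, policy

# Hypothetical two-state game with deterministic transitions.
# State 0 belongs to the max player, state 1 to the min player.
P = np.zeros((2, 2, 2))
P[0, 0, 0] = 1.0  # state 0, action 0: stay in state 0
P[0, 1, 1] = 1.0  # state 0, action 1: move to state 1
P[1, 0, 0] = 1.0  # state 1, action 0: move to state 0
P[1, 1, 1] = 1.0  # state 1, action 1: stay in state 1
R = np.array([[0.5, 1.0],
              [0.0, 1.0]])
owner = np.array([1, -1])
V, policy = solve_turn_based_game(P, R, owner, gamma=0.5, samples_per_pair=10, iters=100)
```

Because the example game is deterministic, the empirical model is exact and the iteration converges to the game value. With stochastic transitions the result would depend on `samples_per_pair`.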


Networked Control of Nonlinear Systems under Partial Observation Using Continuous Deep Q-Learning

arXiv.org Machine Learning

In this paper, we propose a design of a model-free networked controller for a nonlinear plant whose mathematical model is unknown. In a networked control system, the controller and plant are located away from each other and exchange data over a network, which causes network delays that may fluctuate randomly due to network routing. So, in this paper, we assume that the current network delay is not known but the maximum value of fluctuating network delays is known beforehand. Moreover, we also assume that the sensor cannot observe all state variables of the plant. Under these assumptions, we apply continuous deep Q-learning to the design of the networked controller. Then, we introduce an extended state consisting of a sequence of past control inputs and outputs as inputs to the deep neural network. By simulation, it is shown that, using the extended state, the controller can learn a control policy robust to the fluctuation of the network delays under the partial observation.
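
The extended state described above can be sketched as a fixed-length history buffer whose contents are flattened into one network input vector. The class name, the zero padding at start-up, and the `history_len` parameter are assumptions for illustration. The abstract does not say how the history length is chosen, though it would presumably relate to the known maximum delay.

```python
from collections import deque
import numpy as np

class ExtendedState:
    """Keeps the last `history_len` plant outputs and control inputs (hypothetical helper)."""

    def __init__(self, obs_dim, act_dim, history_len):
        # Pad with zeros so the vector has a fixed size from the first step.
        self.obs_hist = deque((np.zeros(obs_dim) for _ in range(history_len)),
                              maxlen=history_len)
        self.act_hist = deque((np.zeros(act_dim) for _ in range(history_len)),
                              maxlen=history_len)

    def push(self, obs, act):
        # Appending to a full deque silently drops the oldest entry.
        self.obs_hist.append(np.atleast_1d(np.asarray(obs, dtype=float)))
        self.act_hist.append(np.atleast_1d(np.asarray(act, dtype=float)))

    def vector(self):
        # Oldest-to-newest outputs followed by oldest-to-newest inputs.
        return np.concatenate(list(self.obs_hist) + list(self.act_hist))

es = ExtendedState(obs_dim=2, act_dim=1, history_len=3)
es.push([1.0, 2.0], [0.5])
es.push([3.0, 4.0], [0.6])
x = es.vector()  # would be fed to the Q-network in place of the unobserved full state
```

Ordering outputs before inputs is arbitrary. What matters for the network is that the layout stays the same at every step.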


Discover how machine learning can solve finance industry challenges by Jannes Klaas

#artificialintelligence

What are the different ML approaches in finance? Which approach do you prefer for mapping and resolving a problem and why? JK: Depending on the task, there are a lot of different methods. So, no single approach clearly dominates. The first question to ask here is whether you want to do supervised, unsupervised, or reinforcement learning.


EMI: Exploration with Mutual Information

#artificialintelligence

Reinforcement learning can be hard when the reward signal is sparse. In these scenarios, the exploration strategy becomes essential: a good exploration strategy not only helps the agent gain a faster and better understanding of the world but also makes it robust to changes in the environment. In this article, we discuss a novel exploration method, namely Exploration with Mutual Information (EMI), proposed by Kim et al. at ICML 2019. In a nutshell, EMI learns representations for both observations (states) and actions in the expectation that we can have a linear dynamics model on these representations. EMI then computes the intrinsic reward as the prediction error under the linear dynamics model.
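
The last step, intrinsic reward as prediction error under a linear dynamics model, can be sketched as follows. The embeddings here are random arrays standing in for learned representations, and the linear model is fitted by ordinary least squares. Both are assumptions for illustration, and the mutual-information objective that EMI uses to learn the embeddings is not shown.

```python
import numpy as np

def fit_linear_dynamics(phi, psi, phi_next):
    """Least-squares fit of phi_next ~ [phi, psi] @ W (illustrative stand-in)."""
    X = np.hstack([phi, psi])
    W, *_ = np.linalg.lstsq(X, phi_next, rcond=None)
    return W

def intrinsic_reward(W, phi, psi, phi_next):
    # Squared prediction error of the linear model in embedding space.
    pred = np.hstack([phi, psi]) @ W
    return np.sum((phi_next - pred) ** 2, axis=1)

# Placeholder embeddings: 50 transitions, 3-d state and 2-d action embeddings.
rng = np.random.default_rng(0)
phi = rng.normal(size=(50, 3))
psi = rng.normal(size=(50, 2))
A = rng.normal(size=(3, 3))
B = rng.normal(size=(2, 3))
phi_next = phi @ A + psi @ B  # transitions that follow the linear model exactly

W = fit_linear_dynamics(phi, psi, phi_next)
r_seen = intrinsic_reward(W, phi, psi, phi_next)
# A transition that deviates from the linear model receives a larger bonus.
r_odd = intrinsic_reward(W, phi[:1], psi[:1], phi_next[:1] + 1.0)
```

Transitions the linear model already predicts well earn almost no bonus, while surprising ones earn more, which is what steers the agent toward unexplored behavior.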


Python Machine Learning Projects -- A DigitalOcean eBook

#artificialintelligence

As machine learning is increasingly leveraged to find patterns, conduct analysis, and make decisions -- sometimes without final input from humans who may be impacted by these findings -- it is crucial to invest in bringing more stakeholders into the fold. This book of Python projects in machine learning tries to do just that: to equip the developers of today and tomorrow with tools they can use to better understand, evaluate, and shape machine learning to help ensure that it is serving us all. This book will set you up with a Python programming environment if you don't have one already, then provide you with a conceptual understanding of machine learning in the chapter "An Introduction to Machine Learning." What follows next are three Python machine learning projects. They will help you create a machine learning classifier, build a neural network to recognize handwritten digits, and give you a background in deep reinforcement learning through building a bot for Atari.