 Undirected Networks


Making data science accessible - Markov Chains

@machinelearnbot

A Markov chain is a random process in which the next state depends only on the current state. For example, if you choose between red and blue twice, the process is Markovian if each decision has nothing to do with your previous choice (see diagram below). How can Markov chains help us? To start with, we need to define some basic terminology. The changes of state within the system are called transitions, and the probabilities associated with the various state changes are called transition probabilities.
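As a rough illustration of transitions and transition probabilities, here is a minimal Python sketch of a two-state chain over the red/blue choice. The probabilities in `TRANSITIONS` and the helper names `step` and `simulate` are made up for this example and do not come from the article. Unlike the fully independent choices in the example above, the next colour here is allowed to depend on the current one, which is still Markovian.

```python
import random

# Hypothetical transition probabilities (assumed for illustration only).
# Each inner dict is one row of the transition matrix and sums to 1.
TRANSITIONS = {
    "red": {"red": 0.7, "blue": 0.3},
    "blue": {"red": 0.4, "blue": 0.6},
}


def step(state, rng):
    """Draw the next state using only the current state (the Markov property)."""
    targets = list(TRANSITIONS[state])
    weights = [TRANSITIONS[state][t] for t in targets]
    return rng.choices(targets, weights=weights, k=1)[0]


def simulate(start, n_steps, seed=0):
    """Return the visited states: the start state followed by n_steps transitions."""
    rng = random.Random(seed)
    path = [start]
    for _ in range(n_steps):
        path.append(step(path[-1], rng))
    return path


path = simulate("red", 10)
print(path)
```

Setting both rows to the same probabilities would recover the independent-choice case described in the text.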


Fractional Langevin Monte Carlo: Exploring Lévy Driven Stochastic Differential Equations for Markov Chain Monte Carlo

arXiv.org Machine Learning

Along with the recent advances in scalable Markov Chain Monte Carlo methods, sampling techniques that are based on Langevin diffusions have started receiving increasing attention. These so-called Langevin Monte Carlo (LMC) methods are based on diffusions driven by a Brownian motion, which gives rise to Gaussian proposal distributions in the resulting algorithms. Even though these approaches have proven successful in many applications, their performance can be limited by the light-tailed nature of the Gaussian proposals. In this study, we extend classical LMC and develop a novel Fractional LMC (FLMC) framework that is based on a family of heavy-tailed distributions, called $\alpha$-stable Lévy distributions. As opposed to classical approaches, the proposed approach can possess large jumps while targeting the correct distribution, which would be beneficial for efficient exploration of the state space. We develop novel computational methods that can scale up to large-scale problems and we provide formal convergence analysis of the proposed scheme. Our experiments support our theory: FLMC can provide superior performance in multi-modal settings, improved convergence rates, and robustness to algorithm parameters.
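For orientation, the classical LMC baseline that the abstract starts from can be sketched in a few lines of Python. This is the standard unadjusted Langevin update with Gaussian noise, not the paper's FLMC scheme. The standard normal target, the step size, the burn-in length, and the function names are all assumptions chosen for this example.

```python
import math
import random


def grad_log_density(x):
    # Assumed target: standard normal, log p(x) = -x^2 / 2 + const,
    # so the gradient of the log density is -x.
    return -x


def langevin_samples(n_samples, step_size=0.1, burn_in=1000, seed=0):
    """Unadjusted Langevin update: x <- x + h * grad log p(x) + sqrt(2h) * N(0, 1)."""
    rng = random.Random(seed)
    x = 0.0
    samples = []
    for i in range(burn_in + n_samples):
        noise = rng.gauss(0.0, 1.0)  # Gaussian (light-tailed) driving noise
        x = x + step_size * grad_log_density(x) + math.sqrt(2.0 * step_size) * noise
        if i >= burn_in:
            samples.append(x)
    return samples


samples = langevin_samples(20000)
mean = sum(samples) / len(samples)
var = sum((s - mean) ** 2 for s in samples) / len(samples)
print(mean, var)
```

Because this sketch has no accept/reject correction, the discretisation introduces a small bias: the sample variance comes out slightly above the target's value of 1 for a finite step size.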


Markov Models: Understanding Markov Models and Unsupervised Machine Learning in Python with Real-World Applications

#artificialintelligence

Would you like to unlock the mysteries of Data Science? Are you yearning to understand how to make educated predictions on the weather, horse races, your unborn baby's facial features, or your boss's next black mood? Would you like a guide to explain these and many other "phenomenons" in clear, easy-to-understand language? If the answer is 'yes', then you'll want to download this book today! It's never been easier to make predictions and smart analysis with the use of Markov Models.


How Apple reinvigorated its AI aspirations in under a year

Engadget

At its WWDC 2017 keynote on Monday, Apple showed off the fruits of its AI research labors. We saw a Siri assistant that's smart enough to interpret your intentions, an updated Metal 2 graphics suite designed for machine learning and a Photos app that can do everything its Google rival does without an internet connection. Being at the front of the AI pack is a new position for Apple to find itself in. Despite setting off the AI arms race when it introduced Siri in 2010, Apple has long lagged behind its competitors in this field. It's amazing what a year of intense R&D can do. Well, technically, it's been three years of R&D, but Apple had a bit of trouble getting out of its own way for the first two.


Probabilistic programming 2: Markov Chains

#artificialintelligence

This is part two of a blog post on probabilistic programming. The first part of the blog can be found here. Markov chains are mathematical constructs with a wide range of applications in physics, mathematical biology, speech recognition, statistics and many other fields. The simplest way to think about them is to consider the above animation. A person (the circle) is trying to find out where their friend lives in a neighbourhood block.


Efficient Reinforcement Learning via Initial Pure Exploration

arXiv.org Machine Learning

In several realistic situations, an interactive learning agent can practice and refine its strategy before going on to be evaluated. For instance, consider a student preparing for a series of tests. She would typically take a few practice tests to know which areas she needs to improve upon. Based on the scores she obtains in these practice tests, she would formulate a strategy for maximizing her scores in the actual tests. We treat this scenario in the context of an agent exploring a fixed-horizon episodic Markov Decision Process (MDP), where the agent can practice on the MDP for some number of episodes (not necessarily known in advance) before starting to incur regret for its actions. During practice, the agent's goal must be to maximize the probability of following an optimal policy. This is akin to the problem of Pure Exploration (PE). We extend the PE problem of Multi-Armed Bandits (MAB) to MDPs and propose a Bayesian algorithm called Posterior Sampling for Pure Exploration (PSPE), which is similar to its bandit counterpart. We show that the Bayesian simple regret converges at an optimal exponential rate when using PSPE. When the agent starts being evaluated, its goal would be to minimize the cumulative regret incurred. This is akin to the problem of Reinforcement Learning (RL). The agent uses the Posterior Sampling for Reinforcement Learning algorithm (PSRL) initialized with the posteriors of the practice phase. We hypothesize that this PSPE + PSRL combination is an optimal strategy for minimizing regret in RL problems with an initial practice phase. We show empirical results which prove that having a lower simple regret at the end of the practice phase results in having lower cumulative regret during evaluation.


Anytime Monte Carlo

arXiv.org Machine Learning

A Monte Carlo algorithm typically simulates some prescribed number of samples, taking some random real time to complete the computations necessary. This work considers the converse: to impose a real-time budget on the computation, so that the number of samples simulated is random. To complicate matters, the real time taken for each simulation may depend on the sample produced, so that the samples themselves are not independent of their number, and a length bias with respect to compute time is apparent. This is especially problematic when a Markov chain Monte Carlo (MCMC) algorithm is used and the final state of the Markov chain (rather than an average over all states) is required. The length bias does not diminish with the compute budget in this case. It occurs, for example, in sequential Monte Carlo (SMC) algorithms. We propose an anytime framework to address the concern, using a continuous-time Markov jump process to study the progress of the computation in real time. We show that the length bias can be eliminated for any MCMC algorithm by using a multiple chain construction. The utility of this construction is demonstrated on a large-scale SMC-squared implementation, using four billion particles distributed across a cluster of 128 graphics processing units on the Amazon EC2 service. The anytime framework imposes a real-time budget on the MCMC move steps within SMC-squared, ensuring that all processors are simultaneously ready for the resampling step, demonstrably reducing wait times and providing substantial control over the total compute budget.
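The length bias mentioned in the abstract can be illustrated with a toy simulation. The sketch below is not the paper's framework. It assumes a hypothetical cost model in which each draw is 1 or 10 with equal probability and takes that many time units to compute, then records which draw is being computed when a fixed time budget expires. The function name and all numbers are invented for this example.

```python
import random


def sample_in_progress_at_deadline(budget, rng):
    """Simulate draws until the time budget expires; return the draw being computed then."""
    clock = 0.0
    while True:
        x = rng.choice([1, 10])  # each value is drawn with probability 1/2
        clock += x               # assumed cost model: computing x takes x time units
        if clock > budget:
            return x


rng = random.Random(0)
trials = 5000
hits = sum(sample_in_progress_at_deadline(200.5, rng) == 10 for _ in range(trials))
fraction_slow = hits / trials
print(fraction_slow)
```

Although the slow value is drawn only half the time, it is the one in progress at the deadline far more often. For a long budget the size-biased probability is 10 / (1 + 10), roughly 0.91, and enlarging the budget does not shrink this effect, in line with the abstract's remark that the bias does not diminish with the compute budget.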


Elements of machine learning

@machinelearnbot

The official title of this free book, available in PDF format, is Machine Learning Cheat Sheet. See the table of contents screenshot below. Chapters 17 to 28 (the most interesting ones in my opinion) seem like a work in progress - I'm sure the authors intend to make them a bit bigger. For a more modern and applied book, get Dr Granville's book on data science. And here's the detailed table of contents:


Fast rates for online learning in Linearly Solvable Markov Decision Processes

arXiv.org Machine Learning

We study the problem of online learning in a class of Markov decision processes known as linearly solvable MDPs. In the stationary version of this problem, a learner interacts with its environment by directly controlling the state transitions, attempting to balance a fixed state-dependent cost and a certain smooth cost penalizing extreme control inputs. In the current paper, we consider an online setting where the state costs may change arbitrarily between consecutive rounds, and the learner only observes the costs at the end of each respective round. We are interested in constructing algorithms for the learner that guarantee small regret against the best stationary control policy chosen in full knowledge of the cost sequence. Our main result is showing that the smoothness of the control cost enables the simple algorithm of following the leader to achieve a regret of order $\log^2 T$ after $T$ rounds, vastly improving on the best known regret bound of order $T^{3/4}$ for this setting.


Chatbots from first principles

#artificialintelligence

Words then point to shared ideas in our minds [Gärdenfors, 2014].
5. Language as shared convention. Wittgenstein and his language games, Philosophical Investigations, 1958. [Figure labels: A, B; block, pillar, slab, beam] Two people building something. Eventually turns into a shared convention for a community.
6. Our brains map community conventions to personal sensations and actions. [Figure labels: sensation, representation, action] • When someone says "beam," we map that to our experience with beams. See: Benjamin Bergen, Steven Pinker, Mark Johnson, Jerome Feldman, and Murray Shanahan. This is what it means for language to be grounded.
We negotiate language and meaning as we go. Modified from Gärdenfors (2014), which was based on Winter (1998). [Figure labels: "A fishing pole is a stick, string and hook"; "You can catch fish with a fishing pole"; "Get me some fish"; Acknowledgement; Break] • Levels of discourse • Complicated to go up and down the pyramid
9. Conversation has its own rules (pragmatics) • Conversational maxims: Grice (1975, 1978) • Breaking these rules is a way to communicate more than the meaning of the words. Maxim of Quantity: Say only what is not implied. What did she mean by that?