A Minimax Optimal Algorithm for Crowdsourcing
Bonald, Thomas, Combes, Richard
We consider the problem of accurately estimating the reliability of workers based on noisy labels they provide, which is a fundamental question in crowdsourcing. We propose a novel lower bound on the minimax estimation error which applies to any estimation procedure. We further propose Triangular Estimation (TE), an algorithm for estimating the reliability of workers. TE has low complexity, may be implemented in a streaming setting when labels are provided by workers in real time, and does not rely on an iterative procedure. We prove that TE is minimax optimal and matches our lower bound. We conclude by assessing the performance of TE and other state-of-the-art algorithms on both synthetic and real-world data.
Universal consistency and minimax rates for online Mondrian Forests
Mourtada, Jaouad, Gaïffas, Stéphane, Scornet, Erwan
We establish the consistency of an algorithm of Mondrian Forests~\cite{lakshminarayanan2014mondrianforests,lakshminarayanan2016mondrianuncertainty}, a randomized classification algorithm that can be implemented online. First, we amend the original Mondrian Forest algorithm proposed in~\cite{lakshminarayanan2014mondrianforests}, which considers a \emph{fixed} lifetime parameter. Indeed, the fact that this parameter is fixed hinders the statistical consistency of the original procedure. Our modified Mondrian Forest algorithm grows trees with increasing lifetime parameters $\lambda_n$, and uses an alternative updating rule that allows it to work in an online fashion as well. Second, we provide a theoretical analysis establishing simple conditions for consistency. Our theoretical analysis also exhibits a surprising fact: our algorithm achieves the minimax rate (optimal rate) for the estimation of a Lipschitz regression function, which is a strong extension of previous results~\cite{arlot2014purf_bias} to an \emph{arbitrary dimension}.
On the Optimization Landscape of Tensor Decompositions
Non-convex optimization with local search heuristics has been widely used in machine learning, achieving many state-of-the-art results. It becomes increasingly important to understand why these heuristics can work for such NP-hard problems on typical data. The landscape of many objective functions in learning has been conjectured to have the geometric property that ``all local optima are (approximately) global optima'', and thus they can be solved efficiently by local search algorithms. However, establishing such a property can be very difficult. In this paper, we analyze the optimization landscape of the random over-complete tensor decomposition problem, which has many applications in unsupervised learning, especially in learning latent variable models. In practice, it can be efficiently solved by gradient ascent on a non-convex objective. We show that for any small constant $\epsilon > 0$, among the set of points with function values $(1+\epsilon)$-factor larger than the expectation of the function, all the local maxima are approximate global maxima. Previously, the best-known result only characterized the geometry in small neighborhoods around the true components. Our result implies that even with an initialization that is barely better than a random guess, the gradient ascent algorithm is guaranteed to solve this problem. Our main technique uses the Kac-Rice formula and random matrix theory. To the best of our knowledge, this is the first time the Kac-Rice formula has been successfully applied to counting the number of local minima of a highly-structured random polynomial with dependent coefficients.
Subset Selection under Noise
Qian, Chao, Shi, Jing-Cheng, Yu, Yang, Tang, Ke, Zhou, Zhi-Hua
The problem of selecting the best $k$-element subset from a universe is involved in many applications. While previous studies assumed a noise-free environment or a noisy monotone submodular objective function, this paper considers a more realistic and general situation where the evaluation of a subset is a noisy monotone function (not necessarily submodular), with both multiplicative and additive noises. To understand the impact of the noise, we first show the approximation ratio of the greedy algorithm and POSS, two powerful algorithms for noise-free subset selection, in noisy environments. We then propose to incorporate a noise-aware strategy into POSS, resulting in the new PONSS algorithm. We prove that PONSS can achieve a better approximation ratio under some assumptions, such as an i.i.d. noise distribution. The empirical results on influence maximization and sparse regression problems show the superior performance of PONSS.
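To make the setting concrete, here is a minimal sketch of the plain greedy algorithm run against a noisy evaluation oracle. It is not POSS or PONSS, and it is not taken from the paper: the function names, the uniform multiplicative noise model, and the toy coverage objective are all assumptions chosen for illustration.

```python
import random


def noisy_greedy(universe, k, f, noise=0.0, rng=None):
    """Greedy k-subset selection that only sees noisy evaluations of f.

    Assumption (illustrative): each evaluation is f(S) * (1 + u), with u
    drawn uniformly from [-noise, noise], i.e. multiplicative noise.
    """
    rng = rng or random.Random(0)

    def noisy_f(subset):
        return f(subset) * (1.0 + rng.uniform(-noise, noise))

    selected = set()
    for _ in range(min(k, len(universe))):
        candidates = [v for v in universe if v not in selected]
        # Add the element whose noisy evaluation looks best right now.
        best = max(candidates, key=lambda v: noisy_f(selected | {v}))
        selected.add(best)
    return selected


def coverage(sets):
    """Toy monotone objective: number of items covered by the chosen sets."""
    def f(subset):
        covered = set()
        for i in subset:
            covered |= sets[i]
        return len(covered)
    return f


# Hypothetical toy instance: four candidate sets over items 1..6.
toy_sets = {0: {1, 2, 3}, 1: {4, 5}, 2: {6}, 3: {1, 2}}
print(noisy_greedy([0, 1, 2, 3], 2, coverage(toy_sets)))
```

With `noise=0.0` this reduces to the usual noise-free greedy algorithm. Raising `noise` shows how a single misleading evaluation can push greedy toward a worse element, which is the kind of degradation the approximation-ratio analysis quantifies.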
Dynamic Importance Sampling for Anytime Bounds of the Partition Function
Lou, Qi, Dechter, Rina, Ihler, Alexander T.
Computing the partition function is a key inference task in many graphical models. In this paper, we propose a dynamic importance sampling scheme that provides anytime finite-sample bounds for the partition function. Our algorithm balances the advantages of the three major inference strategies, heuristic search, variational bounds, and Monte Carlo methods, blending sampling with search to refine a variationally defined proposal. Our algorithm combines and generalizes recent work on anytime search and probabilistic bounds of the partition function. By using an intelligently chosen weighted average over the samples, we construct an unbiased estimator of the partition function with strong finite-sample confidence intervals that inherit both the rapid early improvement rate of sampling and the long-term benefits of an improved proposal from search. This gives significantly improved anytime behavior, and more flexible trade-offs between memory, time, and solution quality. We demonstrate the effectiveness of our approach empirically on real-world problem instances taken from recent UAI competitions.
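The unbiased-estimator idea can be illustrated with ordinary importance sampling. The sketch below is not the dynamic scheme from the paper: it uses a fixed uniform proposal, no search, and no confidence bounds. The three-variable chain model, the coupling value, and all function names are assumptions made for the example.

```python
import itertools
import math
import random


def unnormalized_weight(x, coupling=0.5):
    """Unnormalized probability of a binary chain: neighbours that agree are rewarded."""
    agreements = sum(a == b for a, b in zip(x, x[1:]))
    return math.exp(coupling * agreements)


def exact_partition_function(n_vars, coupling=0.5):
    """Brute-force Z by enumerating all 2^n_vars states (feasible only for tiny models)."""
    states = itertools.product((0, 1), repeat=n_vars)
    return sum(unnormalized_weight(x, coupling) for x in states)


def importance_sampling_estimate(n_vars, n_samples, coupling=0.5, seed=0):
    """Unbiased estimate of Z: average of f(x) / q(x) over samples x drawn from q.

    Assumption (illustrative): q is uniform over all 2^n_vars states.
    """
    rng = random.Random(seed)
    q = 0.5 ** n_vars
    total = 0.0
    for _ in range(n_samples):
        x = tuple(rng.randint(0, 1) for _ in range(n_vars))
        total += unnormalized_weight(x, coupling) / q
    return total / n_samples


print(exact_partition_function(3))
print(importance_sampling_estimate(3, 20000))
```

A better proposal lowers the variance of the weights `f(x) / q(x)`. In the abstract's terms, that is the role search plays: it refines the variationally defined proposal as sampling proceeds.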
Learning Chordal Markov Networks via Branch and Bound
Rantanen, Kari, Hyttinen, Antti, Järvisalo, Matti
We present a new algorithmic approach for the task of finding a chordal Markov network structure that maximizes a given scoring function. The algorithm is based on branch and bound and integrates dynamic programming both for domain pruning and for obtaining strong bounds for search-space pruning. Empirically, we show that the approach dominates, in terms of running times, a recent integer programming approach (and thereby also a recent constraint optimization approach) for the problem.
Overview of Udacity Artificial Intelligence Engineer Nanodegree, Term 1
After finishing the Udacity Deep Learning Foundation, I felt that I had a good introduction to Deep Learning, but to understand things, I must dig deeper. Besides, I had guaranteed admission to the Self-Driving Car Engineer, Artificial Intelligence, or Robotics Nanodegree programs. Before I turn to the advanced Udacity courses, I want to mention one thing at the beginning. If I could give advice to myself, I would select another introductory course on Deep Learning -- the Deep Learning Specialization by Andrew Ng. First of all, his way of mentoring is unique and he can explain complex things in the clearest and most understandable way.
Where's my Depth First Search Machine Learning? – Towards Data Science
My first thought after reading that sentence was: "Why didn't Amazon recommend me that book when I was buying Camille's?" Then I started reading the Three-Body Problem by Cixin Liu. In the first part of that book they mention another book called Silent Spring, which, according to Three-Body, seems to have been censored during the Cultural Revolution. Without spoiling the story, Silent Spring is a very important element of the story, to the point that I later realised the first part of the book is actually called Silent Spring. Inside Three-Body, the book Silent Spring is only mentioned by the characters here and there, and we realise it's important to know about it once we have gone through half the book. While it's not essential to have read Silent Spring in order to understand Three-Body, it seems like quite an interesting book to have. So while reading this I also wondered: "Why didn't Amazon recommend me that book when I was buying the Three-Body Problem?" Finally, I've just started reading Nobel Prize winner Svetlana Alexievich's The Unwomanly Face of War, which is an account of the Soviet women who fought during WWII.
[R] On Monte Carlo Tree Search and Reinforcement Learning • r/MachineLearning
Fuelled by successes in Computer Go, Monte Carlo tree search (MCTS) has achieved widespread adoption within the games community. Its links to traditional reinforcement learning (RL) methods have been outlined in the past; however, the use of RL techniques within tree search has not been thoroughly studied yet. In this paper we re-examine in depth this close relation between the two fields; our goal is to improve the cross-awareness between the two communities. We show that a straightforward adaptation of RL semantics within tree search can lead to a wealth of new algorithms, for which the traditional MCTS is only one of the variants. We confirm that planning methods inspired by RL in conjunction with online search demonstrate encouraging results on several classic board games and in arcade video game competitions, where our algorithm recently ranked first. Our study promotes a unified view of learning, planning, and search.
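For readers unfamiliar with the baseline, traditional MCTS is commonly paired with UCB1-style child selection (UCT). The sketch below shows only that selection rule, as a point of reference. It is not one of the RL-inspired variants from the paper, and the function name and the `(total_value, visits)` representation are assumptions.

```python
import math


def ucb1_select(children, exploration=math.sqrt(2)):
    """Return the index of the child to descend into.

    children: list of (total_value, visits) pairs, one per child node
    (a hypothetical representation chosen for this sketch).
    """
    total_visits = sum(visits for _, visits in children)
    best_index, best_score = None, -math.inf
    for i, (total_value, visits) in enumerate(children):
        if visits == 0:
            # Unvisited children are tried first.
            return i
        mean_value = total_value / visits
        bonus = exploration * math.sqrt(math.log(total_visits) / visits)
        if mean_value + bonus > best_score:
            best_index, best_score = i, mean_value + bonus
    return best_index


print(ucb1_select([(9.0, 10), (1.0, 10)]))
```

The mean term plays the role of a value estimate and the bonus term drives exploration. Replacing the plain average with a different value update is the kind of substitution RL semantics make available.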
Learning From Scratch by Thinking Fast and Slow with Deep Learning and Tree Search
According to dual process theory, human reasoning consists of two different kinds of thinking. System 1 is a fast, unconscious and automatic mode of thought, also known as intuition. System 2 is a slow, conscious, explicit and rule-based mode of reasoning that is believed to be an evolutionarily recent process. When learning to complete a challenging planning task, such as playing a board game, humans exploit both processes: strong intuitions allow for more effective analytic reasoning by rapidly selecting interesting lines of play for consideration. Repeated deep study gradually improves intuitions.