
Minimax Rates and Efficient Algorithms for Noisy Sorting

arXiv.org Machine Learning

There has been a recent surge of interest in studying permutation-based models for ranking from pairwise comparison data. Despite being structurally richer and more robust than parametric ranking models, permutation-based models are less well understood statistically and generally lack efficient learning algorithms. In this work, we study a prototype of permutation-based ranking models, namely, the noisy sorting model. We establish the optimal rates of learning the model under two sampling procedures. Furthermore, we provide a fast algorithm to achieve near-optimal rates if the observations are sampled independently. Along the way, we discover properties of the symmetric group which are of theoretical interest.
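As a rough illustration of the setting, the sketch below simulates pairwise comparisons under one common formulation of noisy sorting, in which each comparison agrees with the true order with probability 1/2 + lam. The abstract does not spell out the model, so that formulation, the parameter name `lam`, and the helpers `sample_comparisons` and `rank_by_wins` are assumptions made for illustration. Ranking by number of wins is a simple baseline, not the fast algorithm the paper proposes.

```python
import random

def sample_comparisons(perm, lam, rng):
    """Simulate one noisy comparison per pair of items.

    perm[i] is the true rank of item i (lower is better).
    Each comparison reports the true order with probability 0.5 + lam
    (assumed noise model). Returns wins, where wins[i][j] == 1 means
    item i was reported to beat item j.
    """
    n = len(perm)
    wins = [[0] * n for _ in range(n)]
    for i in range(n):
        for j in range(i + 1, n):
            correct = rng.random() < 0.5 + lam
            i_better = perm[i] < perm[j]
            if i_better == correct:
                wins[i][j] = 1  # i reported as the winner
            else:
                wins[j][i] = 1  # j reported as the winner
    return wins

def rank_by_wins(wins):
    """Baseline estimator: order items by their number of reported wins."""
    n = len(wins)
    order = sorted(range(n), key=lambda i: -sum(wins[i]))
    est = [0] * n
    for r, i in enumerate(order):
        est[i] = r
    return est

true_perm = [2, 0, 3, 1]
# lam = 0.5 makes every comparison correct, so the baseline recovers true_perm.
estimate = rank_by_wins(sample_comparisons(true_perm, 0.5, random.Random(0)))
```

With smaller values of `lam` the comparisons become noisier and the win-count baseline can misplace items.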


Finding Robust Solutions to Stable Marriage

arXiv.org Artificial Intelligence

We study the notion of robustness in stable matching problems. We first define robustness by introducing (a, b)-supermatches. An (a, b)-supermatch is a stable matching in which, if any a pairs break up, it is possible to find another stable matching by changing the partners of those a pairs and the partners of at most b other pairs. In this context, we define the most robust stable matching as a (1, b)-supermatch where b is minimum. We first show that checking whether a given stable matching is a (1, b)-supermatch can be done in polynomial time. Next, we use this procedure to design a constraint programming model, a local search approach, and a genetic algorithm to find the most robust stable matching. Our empirical evaluation on large instances shows that local search outperforms the other approaches.
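The supermatch definition builds on ordinary stability, which can be checked by looking for a blocking pair. The sketch below shows only that standard check; it is not the paper's polynomial-time (1, b)-supermatch procedure, and the function name `is_stable` and the dictionary-based preference format are assumptions chosen for illustration.

```python
def is_stable(matching, men_prefs, women_prefs):
    """Return True if the matching has no blocking pair.

    matching maps each man to his partner. men_prefs and women_prefs map
    each person to a complete preference list, most preferred first.
    """
    # rank[w][m] is the position of m in w's list (lower is better).
    rank = {w: {m: i for i, m in enumerate(prefs)}
            for w, prefs in women_prefs.items()}
    partner_of = {w: m for m, w in matching.items()}
    for m, prefs in men_prefs.items():
        for w in prefs:
            if w == matching[m]:
                break  # every later woman is worse than m's current partner
            # m prefers w to his partner; check whether w also prefers m.
            if rank[w][m] < rank[w][partner_of[w]]:
                return False
    return True

men_prefs = {"m1": ["w1", "w2"], "m2": ["w1", "w2"]}
women_prefs = {"w1": ["m1", "m2"], "w2": ["m1", "m2"]}
stable_matching = {"m1": "w1", "m2": "w2"}
unstable_matching = {"m1": "w2", "m2": "w1"}  # m1 and w1 form a blocking pair
```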


Minimax Lower Bounds for Noisy Matrix Completion Under Sparse Factor Models

arXiv.org Machine Learning

This paper examines fundamental error characteristics for a general class of matrix completion problems, where the matrix of interest is a product of two a priori unknown matrices, one of which is sparse, and the observations are noisy. Our main contributions come in the form of minimax lower bounds for the expected per-element squared error for this problem under several common noise models. Specifically, we analyze scenarios where the corruptions are characterized by additive Gaussian noise or additive heavier-tailed (Laplace) noise, Poisson-distributed observations, and highly-quantized (e.g., one-bit) observations, as instances of our general result. Our results establish that the error bounds derived in (Soni et al., 2016) for complexity-regularized maximum likelihood estimators achieve, up to multiplicative constants and logarithmic factors, the minimax error rates in each of these noise scenarios, provided that the nominal number of observations is large enough, and the sparse factor has (on average) at least one non-zero per column.


Artificial Intelligence (AI) application to utilize search cues in combinatorial optimization - Xiao-Feng Xie, Ph.D.

#artificialintelligence

This method uses AI techniques to solve a problem from pure mathematics: finding narrow admissible tuples. The original problem is formulated as a combinatorial optimization problem. In particular, we show how to exploit the local search structure to formulate the problem landscape, which yields dramatic reductions in the search space and non-trivial elimination of search barriers, and then how to realize intelligent search strategies for effectively escaping from local minima. Experimental results demonstrate that the proposed method efficiently finds the best known solutions.


A Minimax Optimal Algorithm for Crowdsourcing

arXiv.org Machine Learning

We consider the problem of accurately estimating the reliability of workers based on noisy labels they provide, which is a fundamental question in crowdsourcing. We propose a novel lower bound on the minimax estimation error which applies to any estimation procedure. We further propose Triangular Estimation (TE), an algorithm for estimating the reliability of workers. TE has low complexity, may be implemented in a streaming setting when labels are provided by workers in real time, and does not rely on an iterative procedure. We further prove that TE is minimax optimal and matches our lower bound. We conclude by assessing the performance of TE and other state-of-the-art algorithms on both synthetic and real-world data sets.


A brief history of Google's most important local search updates

@machinelearnbot

Deciphering the Google algorithm can sometimes feel like an exercise in futility. The search engine giant has made many changes over the years, keeping digital marketers on their toes and continually moving the goalposts on SEO best practices. Google's continuous updating can hit local businesses as hard as anyone. Every tweak and modification to its algorithm could adversely impact their search ranking or even prevent them from appearing on the first page of search results for targeted queries. What makes things really tricky is the fact that Google sometimes does not telegraph the changes it makes or how they'll impact organizations.


Residual-Guided Look-Ahead in AND/OR Search for Graphical Models

Journal of Artificial Intelligence Research

We introduce the concept of local bucket error for the mini-bucket heuristics and show how it can be used to improve the power of AND/OR search for combinatorial optimization tasks in graphical models (e.g. MAP/MPE or weighted CSPs). The local bucket error illuminates how the heuristic errors are distributed in the search space, guided by the mini-bucket heuristic. We present and analyze methods for compiling the local bucket-errors (exactly and approximately) and show that they can be used to yield an effective tool for balancing look-ahead overhead during search. This can be especially instrumental when memory is restricted, accommodating the generation of only weak compiled heuristics. We illustrate the impact of the proposed schemes in an extensive empirical evaluation for both finding exact solutions and anytime suboptimal solutions.


Minimax Estimation of Bandable Precision Matrices

arXiv.org Machine Learning

The inverse covariance matrix provides considerable insight for understanding statistical models in the multivariate setting. In particular, when the distribution over variables is assumed to be multivariate normal, the sparsity pattern in the inverse covariance matrix, commonly referred to as the precision matrix, corresponds to the adjacency matrix representation of the Gauss-Markov graph, which encodes conditional independence statements between variables. Minimax results under the spectral norm have previously been established for covariance matrices, both sparse and banded, and for sparse precision matrices. We establish minimax estimation bounds for estimating banded precision matrices under the spectral norm. Our results greatly improve upon the existing bounds; in particular, we find that the minimax rate for estimating banded precision matrices matches that of estimating banded covariance matrices. The key insight in our analysis is that we are able to obtain barely-noisy estimates of $k \times k$ subblocks of the precision matrix by inverting slightly wider blocks of the empirical covariance matrix along the diagonal. Our theoretical results are complemented by experiments demonstrating the sharpness of our bounds.


On the Statistical Efficiency of Compositional Nonparametric Prediction

arXiv.org Machine Learning

In this paper, we propose a compositional nonparametric method in which a model is expressed as a labeled binary tree of $2k+1$ nodes, where each node is either a summation, a multiplication, or the application of one of the $q$ basis functions to one of the $p$ covariates. We show that in order to recover a labeled binary tree from a given dataset, the sufficient number of samples is $O(k\log(pq)+\log(k!))$, and the necessary number of samples is $\Omega(k\log (pq)-\log(k!))$. We further propose a greedy algorithm for regression in order to validate our theoretical findings through synthetic experiments.
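To get a feel for the gap between the two bounds, the sketch below evaluates the expressions inside the O(·) and Ω(·) for given k, p, and q. It ignores the unspecified constant factors, so the numbers indicate scaling only, not actual sample counts. The function name `sample_bound_terms` and the use of natural logarithms are assumptions made for illustration.

```python
import math

def sample_bound_terms(k, p, q):
    """Return (k*log(pq) + log(k!), k*log(pq) - log(k!)) using natural logs.

    These are the terms inside the sufficient (upper) and necessary (lower)
    sample-complexity bounds, with constant factors omitted.
    """
    log_k_factorial = math.lgamma(k + 1)  # lgamma(k + 1) equals log(k!)
    main_term = k * math.log(p * q)
    return main_term + log_k_factorial, main_term - log_k_factorial

# Example: k = 5 operations, p = 10 covariates, q = 4 basis functions.
upper_term, lower_term = sample_bound_terms(5, 10, 4)
```

The two terms differ by 2 log(k!), so they agree up to constants whenever k log(pq) dominates log(k!).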


6 ways IoT will make local search for SMBs scalable

@machinelearnbot

In an age of artificial intelligence (AI), the Internet of Things (IoT) may seem like yesterday's news, but, of all the technologies currently developing, it has the greatest potential for near-term changes that affect local search. While it remains murky how AI will benefit agencies, IoT is reaching a critical point in adoption and maturing to a stage where it provides actionable data. Or, as Brian Buntz with the Internet of Things Institute stated, "The IoT is about to shift into ludicrous mode." The growth of the IoT is spurred by decreasing costs of hardware, such as sensors, together with the ease and availability of wireless connectivity. IoT devices already outnumber smartphones by about four times, and growth is expected to accelerate further with Cisco estimates topping 50 billion devices by 2020. The amount of data generated by these devices is enormous.