Fisher-Rao Metric, Geometry, and Complexity of Neural Networks
Liang, Tengyuan, Poggio, Tomaso, Rakhlin, Alexander, Stokes, James
We study the relationship between geometry and capacity measures for deep neural networks from an invariance viewpoint. We introduce a new notion of capacity --- the Fisher-Rao norm --- that possesses desirable invariance properties and is motivated by Information Geometry. We discover an analytical characterization of the new capacity measure, through which we establish norm-comparison inequalities and further show that the new measure serves as an umbrella for several existing norm-based complexity measures. We discuss upper bounds on the generalization error induced by the proposed measure. Extensive numerical experiments on CIFAR-10 support our theoretical findings. Our theoretical analysis rests on a key structural lemma about partial derivatives of multi-layer rectifier networks.
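For concreteness, a Fisher-Rao norm of this type takes the following standard Information-Geometry form (a schematic statement under the usual conventions; the paper's precise definition should be consulted for the exact loss and expectation):

$$\|\theta\|_{\mathrm{fr}}^2 \;=\; \theta^\top I(\theta)\,\theta, \qquad I(\theta) \;=\; \mathbb{E}\!\left[\nabla_\theta \ell(f_\theta(x), y)\,\nabla_\theta \ell(f_\theta(x), y)^\top\right],$$

where $\theta$ collects the network parameters, $\ell$ is the loss, $I(\theta)$ is the Fisher information matrix, and the expectation is taken over the data. Because $I(\theta)$ transforms as a metric tensor under reparametrization, norms built from it inherit the kind of invariance emphasized above.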
Weighted Message Passing and Minimum Energy Flow for Heterogeneous Stochastic Block Models with Side Information
Cai, T. Tony, Liang, Tengyuan, Rakhlin, Alexander
We study the misclassification error for community detection in general heterogeneous stochastic block models (SBM) with noisy or partial label information. We establish a connection between the misclassification rate and the notion of minimum energy on the local neighborhood of the SBM. We develop an optimally weighted message passing algorithm to reconstruct labels for SBM based on the minimum energy flow and the eigenvectors of a certain Markov transition matrix. The general SBM considered in this paper allows for unequal-size communities, degree heterogeneity, and different connection probabilities among blocks. We focus on how to optimally weight the messages so as to minimize the misclassification rate.
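A minimal sketch of a weighted message-passing update of this flavor is given below (illustrative only: the damping, the unit default weights, and the clamping of revealed labels are placeholders, not the optimally weighted scheme of the paper):

```python
import numpy as np

def weighted_message_passing(adj, weights, revealed, n_nodes, n_iter=20):
    """Generic weighted message-passing sketch for label reconstruction.

    adj      : list of neighbor lists, adj[u] = neighbors of node u
    weights  : dict mapping a directed edge (w, u) to the weight of
               w's message into u (defaults to 1.0 if absent)
    revealed : dict mapping a node to its revealed label in {+1, -1}
    """
    belief = np.zeros(n_nodes)
    for v, label in revealed.items():
        belief[v] = label
    for _ in range(n_iter):
        agg = np.array([sum(weights.get((w, u), 1.0) * belief[w]
                            for w in adj[u]) for u in range(n_nodes)])
        belief = 0.5 * belief + 0.5 * agg   # damped update for stability
        for v, label in revealed.items():   # revealed labels stay clamped
            belief[v] = label
    return np.sign(belief)

# Illustrative usage: a 4-node path with the two endpoint labels revealed.
adj = [[1], [0, 2], [1, 3], [2]]
print(weighted_message_passing(adj, {}, {0: 1.0, 3: -1.0}, n_nodes=4))
```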
Inference via Message Passing on Partially Labeled Stochastic Block Models
Cai, T. Tony, Liang, Tengyuan, Rakhlin, Alexander
We study the community detection and recovery problem in partially-labeled stochastic block models (SBM). We develop a fast linearized message-passing algorithm to reconstruct labels for SBM (with $n$ nodes, $k$ blocks, and $p, q$ intra- and inter-block connection probabilities) when a fraction $\delta$ of the node labels is revealed. The signal-to-noise ratio ${\sf SNR}(n,k,p,q,\delta)$ is shown to characterize the fundamental limitations of inference via local algorithms. On the one hand, when ${\sf SNR}>1$, the linearized message-passing algorithm provides a statistical inference guarantee with misclassification rate at most $\exp(-({\sf SNR}-1)/2)$, thus interpolating smoothly between strong and weak consistency. This exponential dependence improves upon the known error rate $({\sf SNR}-1)^{-1}$ in the literature on weak recovery. On the other hand, when ${\sf SNR}<1$ (for $k=2$) and ${\sf SNR}<1/4$ (for general growing $k$), we prove that local algorithms suffer an error rate of at least $\frac{1}{2} - \sqrt{\delta \cdot {\sf SNR}}$, which is only slightly better than a random guess for small $\delta$.
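To make the improvement concrete with purely illustrative numbers plugged into the two rates above: at ${\sf SNR}=7$, the exponential bound gives $\exp(-({\sf SNR}-1)/2)=e^{-3}\approx 0.05$, while the previously known rate gives $({\sf SNR}-1)^{-1}=1/6\approx 0.17$, and the gap widens rapidly as ${\sf SNR}$ grows.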
Computational and Statistical Boundaries for Submatrix Localization in a Large Noisy Matrix
Cai, T. Tony, Liang, Tengyuan, Rakhlin, Alexander
The interplay between computational efficiency and statistical accuracy in high-dimensional inference has drawn increasing attention in the literature. In this paper, we study computational and statistical boundaries for submatrix localization. Given one observation of a signal submatrix (or multiple non-overlapping signal submatrices) of magnitude $\lambda$ and size $k_m \times k_n$, contaminated with a noise matrix of size $m \times n$ and noise level $\sigma$, we establish two transition thresholds for the signal-to-noise ratio $\lambda/\sigma$ in terms of $m$, $n$, $k_m$, and $k_n$. The first threshold, $\sf SNR_c$, corresponds to the computational boundary. Below this threshold, it is shown that no polynomial time algorithm can succeed in identifying the submatrix, under the \textit{hidden clique hypothesis}. We introduce adaptive linear time spectral algorithms that identify the submatrix with high probability when the signal strength is above the threshold $\sf SNR_c$. The second threshold, $\sf SNR_s$, captures the statistical boundary, below which no method can succeed with probability going to one in the minimax sense. The exhaustive search method successfully finds the submatrix above this threshold. The results reveal an interesting phenomenon: $\sf SNR_c$ is always significantly larger than $\sf SNR_s$, which implies an essential gap between statistical optimality and computational efficiency for submatrix localization.
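The following is a minimal spectral baseline in the spirit of such algorithms (illustrative only: it uses a plain SVD with $k_m$ and $k_n$ assumed known, and is not the paper's adaptive linear time procedure):

```python
import numpy as np

def spectral_localize(X, k_m, k_n):
    """Locate one elevated k_m x k_n submatrix inside the observed matrix X
    by thresholding the leading singular vectors (minimal sketch)."""
    U, s, Vt = np.linalg.svd(X, full_matrices=False)
    u, v = U[:, 0], Vt[0, :]                      # leading singular vectors
    rows = np.sort(np.argsort(-np.abs(u))[:k_m])  # k_m largest |entries|
    cols = np.sort(np.argsort(-np.abs(v))[:k_n])  # k_n largest |entries|
    return rows, cols

# Illustrative usage: a 10 x 10 block of magnitude 3 hidden in 100 x 100
# standard Gaussian noise, i.e. lambda/sigma = 3.
rng = np.random.default_rng(0)
X = rng.standard_normal((100, 100))
X[np.ix_(range(10), range(10))] += 3.0
print(spectral_localize(X, 10, 10))
```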
Geometric Inference for General High-Dimensional Linear Inverse Problems
Cai, T. Tony, Liang, Tengyuan, Rakhlin, Alexander
This paper presents a unified geometric framework for the statistical analysis of a general ill-posed linear inverse model which includes as special cases noisy compressed sensing, sign vector recovery, trace regression, orthogonal matrix estimation, and noisy matrix completion. We propose computationally feasible convex programs for statistical inference, including estimation, confidence intervals, and hypothesis testing. A theoretical framework is developed to characterize the local estimation rate of convergence and to provide statistical inference guarantees. Our results build on local conic geometry and duality. The difficulty of statistical inference is captured by the geometric characterization of the local tangent cone through the Gaussian width and the Sudakov minoration estimate.
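Schematically, convex programs of this kind follow the generic template of structure-inducing norm minimization under a data-fidelity constraint (a standard form given here for orientation, not the paper's exact formulation):

$$\hat{\beta} \;\in\; \underset{\beta}{\arg\min}\; \|\beta\|_{\mathcal{K}} \quad \text{subject to} \quad \|y - \mathcal{X}(\beta)\|_2 \le \delta,$$

where $\mathcal{X}$ is the linear operator of the inverse problem and $\|\cdot\|_{\mathcal{K}}$ is chosen to match the structure, e.g., the $\ell_1$ norm for compressed sensing or the nuclear norm for matrix completion; the Gaussian width of the local tangent cone at the truth then governs the estimation rate.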
Learning with Square Loss: Localization through Offset Rademacher Complexity
Liang, Tengyuan, Rakhlin, Alexander, Sridharan, Karthik
We consider regression with square loss and general classes of functions without the boundedness assumption. We introduce a notion of offset Rademacher complexity that provides a transparent way to study localization both in expectation and in high probability. For any (possibly non-convex) class, the excess loss of a two-step estimator is shown to be upper bounded by this offset complexity through a novel geometric inequality. In the convex case, the estimator reduces to an empirical risk minimizer. The method recovers the results of \citep{RakSriTsy15} for the bounded case while also providing guarantees without the boundedness assumption.
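For reference, the offset Rademacher complexity of a class $\mathcal{F}$ takes the form (stated schematically, with offset parameter $c>0$ and i.i.d. Rademacher signs $\epsilon_1,\dots,\epsilon_n$):

$$\mathcal{R}^{\mathrm{off}}_n(\mathcal{F}; c) \;=\; \mathbb{E}_{\epsilon}\, \sup_{f \in \mathcal{F}} \frac{1}{n}\sum_{i=1}^{n} \left[\epsilon_i f(x_i) - c\, f(x_i)^2\right].$$

The negative quadratic term automatically penalizes functions with large values, which is what delivers localization without an explicit restriction to a small ball around the minimizer.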
On Zeroth-Order Stochastic Convex Optimization via Random Walks
Liang, Tengyuan, Narayanan, Hariharan, Rakhlin, Alexander
We propose a method for zeroth-order stochastic convex optimization that attains a suboptimality rate of $\tilde{\mathcal{O}}(n^{7}T^{-1/2})$ after $T$ queries for a convex bounded function $f:{\mathbb R}^n\to{\mathbb R}$. The method is based on a random walk (the \emph{Ball Walk}) on the epigraph of the function. The randomized approach circumvents the problem of gradient estimation, and appears to be less sensitive to noisy function evaluations than noiseless zeroth-order methods.
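A drastically simplified sketch of the idea follows (illustrative only: the step size, the box truncation, and the greedy tracking of the best point seen are placeholders, not the paper's analyzed procedure). The walk lives on the epigraph $\{(x, t) : f(x) \le t\}$, so only function evaluations are needed:

```python
import numpy as np

def ball_walk_epigraph(f, x0, t0, radius=0.05, n_steps=2000, bound=1.0):
    """Ball Walk sketch on the truncated epigraph {(x, t) : f(x) <= t}.
    Each step proposes a uniform point in a small ball around the current
    state (x, t) and accepts it only if it stays in the epigraph; no
    gradients of f are ever computed."""
    rng = np.random.default_rng(0)
    n = len(x0)
    x, t = np.asarray(x0, dtype=float), float(t0)
    best_x, best_t = x.copy(), t
    for _ in range(n_steps):
        # Uniform proposal in an (n + 1)-dimensional ball: random direction
        # scaled by radius * U^{1/(n+1)}.
        d = rng.standard_normal(n + 1)
        d *= radius * rng.random() ** (1.0 / (n + 1)) / np.linalg.norm(d)
        x_new, t_new = x + d[:n], t + d[n]
        if f(x_new) <= t_new and np.all(np.abs(x_new) <= bound):
            x, t = x_new, t_new
            if t < best_t:
                best_x, best_t = x.copy(), t
    return best_x, best_t

# Illustrative usage on a simple convex function.
print(ball_walk_epigraph(lambda x: float(np.sum(x ** 2)), [0.5, -0.5], 1.0))
```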