
Collaborating Authors: Narayanan, Hariharan


Structural Risk Minimization for $C^{1,1}(\mathbb{R}^d)$ Regression

arXiv.org Machine Learning

One means of fitting functions to high-dimensional data is by providing smoothness constraints. Recently, the following smooth function approximation problem was proposed by \citet*{herbert2014computing}: given a finite set $E \subset \mathbb{R}^d$ and a function $f: E \rightarrow \mathbb{R}$, interpolate the given information with a function $\widehat{f} \in \dot{C}^{1, 1}(\mathbb{R}^d)$ (the class of first-order differentiable functions with Lipschitz gradients) such that $\widehat{f}(a) = f(a)$ for all $a \in E$, and the value of $\mathrm{Lip}(\nabla \widehat{f})$ is minimal. An algorithm is provided that constructs such an approximating function $\widehat{f}$ and estimates the optimal Lipschitz constant $\mathrm{Lip}(\nabla \widehat{f})$ in the noiseless setting. We address statistical aspects of reconstructing the approximating function $\widehat{f}$ from the closely related class $C^{1, 1}(\mathbb{R}^d)$ given noisy samples. We observe independent and identically distributed samples $y(a) = f(a) + \xi(a)$ for $a \in E$, where $\xi(a)$ is a noise term and the set $E \subset \mathbb{R}^d$ is fixed and known. We obtain uniform bounds relating the empirical risk and true risk over the class $\mathcal{F}_{\widetilde{M}} = \{f \in C^{1, 1}(\mathbb{R}^d) \mid \mathrm{Lip}(\nabla f) \leq \widetilde{M}\}$, where the quantity $\widetilde{M}$ grows with the number of samples at a rate governed by the metric entropy of the class $C^{1, 1}(\mathbb{R}^d)$. Finally, we provide an implementation using Vaidya's algorithm, supporting our results via numerical experiments on simulated data.
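
To make the observation model concrete, here is a minimal Python sketch, under assumptions of my own (a quadratic choice of $f$, Gaussian noise, and the helper name empirical_risk): it draws noisy samples $y(a) = f(a) + \xi(a)$ on a fixed design $E$ and evaluates the empirical squared-error risk of a candidate function. The paper's estimator additionally constrains $\mathrm{Lip}(\nabla \widehat{f}) \leq \widetilde{M}$ and is computed via a convex program solved with Vaidya's algorithm, none of which is attempted here.

import numpy as np

def empirical_risk(f_hat, E, y):
    # empirical squared-error risk of a candidate f_hat on the fixed design E
    preds = np.array([f_hat(a) for a in E])
    return float(np.mean((preds - y) ** 2))

# noisy observations y(a) = f(a) + xi(a) at fixed design points in R^d
rng = np.random.default_rng(0)
d, n = 3, 200
E = rng.normal(size=(n, d))
f = lambda a: 0.5 * float(a @ a)           # a C^{1,1} function with Lip(grad f) = 1
y = np.array([f(a) for a in E]) + 0.1 * rng.normal(size=n)
print(empirical_risk(f, E, y))             # empirical risk of the true f under noise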


John's Walk

arXiv.org Machine Learning

We present an affine-invariant random walk for drawing uniform random samples from a convex body $\mathcal{K} \subset \mathbb{R}^n$ for which the maximum volume inscribed ellipsoid, known as John's ellipsoid, may be computed. As a special case, we consider a polytope $\mathcal{P} = \{x \in \mathbb{R}^n \mid Ax \leq 1\}$ with $A \in \mathbb{R}^{m \times n}$. Our algorithm makes steps using uniform sampling from John's ellipsoid of the symmetrization of $\mathcal{K}$ at the current point. We show that from a warm start, the random walk mixes in $\widetilde{O}(n^7)$ steps, where the log factors depend only on constants determined by the warm start and error parameters (and not on the dimension or the number of constraints defining the body). This sampling algorithm thus offers an improvement over the affine-invariant Dikin Walk for polytopes (which mixes in $\widetilde{O}(mn)$ steps from a warm start) for applications in which $m \gg n$. Furthermore, we describe an $\widetilde{O}(mn^{\omega+1} + n^{2\omega+2})$ algorithm for finding a suitably approximate John's ellipsoid for a symmetric polytope based on Vaidya's algorithm, and show that the mixing time is retained using these approximate ellipsoids (where $\omega < 2.373$ is the current bound on the fast matrix multiplication exponent).
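
The two ingredients above can be sketched as follows, under assumptions of my own choosing (CVXPY for the maximum volume inscribed ellipsoid, a fixed step radius r, and a standard Metropolis filter); the paper's walk, its dimension-dependent step size, and the approximate-ellipsoid computation via Vaidya's algorithm differ in detail.

import numpy as np
import cvxpy as cp

def john_ellipsoid(A, b):
    # maximum volume inscribed ellipsoid {B u + c : ||u|| <= 1} of {x : A x <= b}
    m, n = A.shape
    B, c = cp.Variable((n, n), PSD=True), cp.Variable(n)
    cons = [cp.norm(B @ A[i]) + A[i] @ c <= b[i] for i in range(m)]
    cp.Problem(cp.Maximize(cp.log_det(B)), cons).solve()
    return B.value, c.value

def john_walk_step(A, b, x, rng, r=0.5):
    # symmetrization of P about x:  P ∩ (2x - P) = {z : A z <= b, -A z <= b - 2 A x};
    # its John ellipsoid is centred at x by symmetry, so the returned centre is dropped
    Bx, _ = john_ellipsoid(np.vstack([A, -A]), np.concatenate([b, b - 2 * A @ x]))
    u = rng.normal(size=len(x))
    u *= rng.random() ** (1.0 / len(x)) / np.linalg.norm(u)    # uniform point in the unit ball
    y = x + r * Bx @ u                       # propose uniformly from the scaled ellipsoid at x
    if np.any(A @ y > b):
        return x                             # proposal left the polytope: stay put
    By, _ = john_ellipsoid(np.vstack([A, -A]), np.concatenate([b, b - 2 * A @ y]))
    if np.linalg.norm(np.linalg.solve(By, x - y)) > r:
        return x                             # x is not reachable from y: reject
    accept = min(1.0, np.linalg.det(Bx) / np.linalg.det(By))   # Metropolis filter for uniformity
    return y if rng.random() < accept else x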


Manifold Learning Using Kernel Density Estimation and Local Principal Components Analysis

arXiv.org Machine Learning

We consider the problem of recovering a $d$-dimensional manifold $\mathcal{M} \subset \mathbb{R}^n$ when provided with noiseless samples from $\mathcal{M}$. There are many algorithms (e.g., Isomap) that are used in practice to fit manifolds and thus reduce the dimensionality of a given data set. Ideally, the estimate $\mathcal{M}_\mathrm{put}$ of $\mathcal{M}$ should be an actual manifold of a certain smoothness; furthermore, $\mathcal{M}_\mathrm{put}$ should be arbitrarily close to $\mathcal{M}$ in Hausdorff distance given a large enough sample. Generally speaking, existing manifold learning algorithms do not meet these criteria. Fefferman, Mitter, and Narayanan (2016) have developed an algorithm whose output is provably a manifold. The key idea is to define an approximate squared-distance function (asdf) to $\mathcal{M}$. Then, $\mathcal{M}_\mathrm{put}$ is given by the set of points where the gradient of the asdf is orthogonal to the subspace spanned by the eigenvectors corresponding to the largest $n - d$ eigenvalues of the Hessian of the asdf. As long as the asdf meets certain regularity conditions, $\mathcal{M}_\mathrm{put}$ is a manifold that is arbitrarily close in Hausdorff distance to $\mathcal{M}$. In this paper, we define two asdfs that can be calculated from the data and show that they meet the required regularity conditions. The first asdf is based on kernel density estimation, and the second is based on estimation of tangent spaces using local principal components analysis.
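
The local-PCA ingredient can be illustrated in a few lines: estimate the tangent space at a query point from the top principal directions of its nearest neighbours, and use the squared distance to the resulting affine approximation as a crude approximate squared-distance function. This is a sketch only (the helper names, the neighbourhood size k, and the plain nearest-neighbour window are my choices); the asdfs analysed in the paper are constructed so as to provably satisfy the required regularity conditions.

import numpy as np

def local_pca_tangent(X, x0, k=20, d=2):
    # tangent space at x0 estimated from the k nearest samples (rows of X)
    nbrs = X[np.argsort(np.linalg.norm(X - x0, axis=1))[:k]]
    mu = nbrs.mean(axis=0)
    _, _, Vt = np.linalg.svd(nbrs - mu, full_matrices=False)
    return mu, Vt[:d]                        # rows of Vt[:d] span the estimated tangent space

def asdf_local_pca(x, X, k=20, d=2):
    # illustrative asdf: squared distance from x to the local affine (tangent) approximation
    mu, T = local_pca_tangent(X, x, k=k, d=d)
    v = x - mu
    return float(v @ v - (T @ v) @ (T @ v))  # squared norm of the component normal to the tangent

# usage: noiseless samples from a unit circle embedded in R^3
theta = np.linspace(0, 2 * np.pi, 400, endpoint=False)
X = np.stack([np.cos(theta), np.sin(theta), np.zeros_like(theta)], axis=1)
print(asdf_local_pca(np.array([1.1, 0.0, 0.2]), X, k=20, d=1))   # roughly the squared distance to the circle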


On Zeroth-Order Stochastic Convex Optimization via Random Walks

arXiv.org Machine Learning

We propose a method for zeroth-order stochastic convex optimization that attains the suboptimality rate of $\tilde{\mathcal{O}}(n^{7}T^{-1/2})$ after $T$ queries for a bounded convex function $f:{\mathbb R}^n\to{\mathbb R}$. The method is based on a random walk (the \emph{Ball Walk}) on the epigraph of the function. The randomized approach circumvents the problem of gradient estimation, and appears to be less sensitive to noisy function evaluations than zeroth-order methods designed for the noiseless setting.
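
A minimal sketch of the idea, assuming a Euclidean ball as the feasible region, a fixed step size, and that returning the best point visited is an acceptable simplification; the paper's Ball Walk, its step-size schedule, and the analysis yielding the $\tilde{\mathcal{O}}(n^{7}T^{-1/2})$ rate are more involved.

import numpy as np

def ball_walk_epigraph(f, x0, t0, t_max, n_steps=10000, delta=0.1, R=3.0, rng=None):
    # random walk on the truncated epigraph {(x, t) : f(x) <= t <= t_max, ||x|| <= R},
    # where f is a (possibly noisy) zeroth-order oracle
    rng = np.random.default_rng() if rng is None else rng
    n = len(x0)
    x, t = np.asarray(x0, float), float(t0)
    best_x, best_t = x.copy(), t
    for _ in range(n_steps):
        u = rng.normal(size=n + 1)
        u *= delta * rng.random() ** (1.0 / (n + 1)) / np.linalg.norm(u)   # uniform step in a ball
        xp, tp = x + u[:n], t + u[n]
        if np.linalg.norm(xp) <= R and tp <= t_max and f(xp) <= tp:
            x, t = xp, tp                    # move only if the step stays inside the epigraph
            if t < best_t:
                best_x, best_t = x.copy(), t
    return best_x, best_t

# usage: minimise a convex quadratic from noisy function evaluations only
rng = np.random.default_rng(1)
f_noisy = lambda z: float(z @ z) + 0.01 * rng.normal()
print(ball_walk_epigraph(f_noisy, x0=np.ones(5), t0=10.0, t_max=10.0, rng=rng))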


Efficient Sampling from Time-Varying Log-Concave Distributions

arXiv.org Machine Learning

We propose a computationally efficient random walk on a convex body which rapidly mixes and closely tracks a time-varying log-concave distribution. We develop general theoretical guarantees on the required number of steps; this number can be calculated on the fly according to the distance from and the shape of the next distribution. We then illustrate the technique on several examples. Within the context of exponential families, the proposed method produces samples from a posterior distribution which is updated as data arrive in a streaming fashion. The sampling technique can be used to track time-varying truncated distributions, as well as to obtain samples from a changing mixture model, fitted in a streaming fashion to data. In the setting of linear optimization, the proposed method has oracle complexity with best known dependence on the dimension for certain geometries. In the context of online learning and repeated games, the algorithm is an efficient method for implementing no-regret mixture forecasting strategies. Remarkably, in some of these examples, only one step of the random walk is needed to track the next distribution.
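
As a toy illustration of the streaming exponential-family example, the sketch below warm-starts a few Metropolis random-walk steps toward each updated posterior, reusing the previous sample when new data arrive. The particular walk, step size, number of steps per update, and the Gaussian-mean model are assumptions of this sketch rather than the method analysed in the paper, whose guarantees concern a random walk on a convex body.

import numpy as np

def track_streaming_posterior(data_stream, x0, prior_var=1.0, noise_var=1.0,
                              steps_per_update=5, step=0.3, rng=None):
    # Gaussian-mean model: after observing y_1, ..., y_t the log-posterior of the mean is
    #   log pi_t(x) = -x^2 / (2 prior_var) - sum_s (y_s - x)^2 / (2 noise_var) + const
    rng = np.random.default_rng() if rng is None else rng
    x, seen, samples = float(x0), [], []
    log_pi = lambda z: -z ** 2 / (2 * prior_var) - sum((y - z) ** 2 for y in seen) / (2 * noise_var)
    for y_new in data_stream:
        seen.append(y_new)                   # the posterior is updated as data arrive
        for _ in range(steps_per_update):    # a few warm-started Metropolis steps
            prop = x + step * rng.normal()
            if np.log(rng.random()) < log_pi(prop) - log_pi(x):
                x = prop
        samples.append(x)
    return samples

# usage: a stream of observations with true mean 2.0
rng = np.random.default_rng(0)
print(track_streaming_posterior(2.0 + rng.normal(size=50), x0=0.0, rng=rng)[-5:])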


Sample Complexity of Testing the Manifold Hypothesis

Neural Information Processing Systems

The hypothesis that high dimensional data tends to lie in the vicinity of a low dimensional manifold is the basis of a collection of methodologies termed Manifold Learning. In this paper, we study statistical aspects of the question of fitting a manifold with a nearly optimal least squared error. Given upper bounds on the dimension, volume, and curvature, we show that Empirical Risk Minimization can produce a nearly optimal manifold using a number of random samples that is {\it independent} of the ambient dimension of the space in which data lie. We obtain an upper bound on the required number of samples that depends polynomially on the curvature, exponentially on the intrinsic dimension, and linearly on the intrinsic volume. For constant error, we prove a matching minimax lower bound on the sample complexity that shows that this dependence on intrinsic dimension, volume and curvature is unavoidable. Whether the known lower bound of $O(\frac{k}{\epsilon^2} + \frac{\log \frac{1}{\delta}}{\epsilon^2})$ for the sample complexity of Empirical Risk Minimization on $k$-means applied to data in a unit ball of arbitrary dimension is tight has been an open question since 1997 \cite{bart2}. Here $\epsilon$ is the desired bound on the error and $\delta$ is a bound on the probability of failure. We improve the best currently known upper bound \cite{pontil} of $O(\frac{k^2}{\epsilon^2} + \frac{\log \frac{1}{\delta}}{\epsilon^2})$ to $O\left(\frac{k}{\epsilon^2}\left(\min\left(k, \frac{\log^4 \frac{k}{\epsilon}}{\epsilon^2}\right)\right) + \frac{\log \frac{1}{\delta}}{\epsilon^2}\right)$. Based on these results, we devise a simple algorithm for $k$-means and another that uses a family of convex programs to fit a piecewise linear curve of a specified length to high dimensional data, where the sample complexity is independent of the ambient dimension.
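
For the $k$-means part of the statement, the risk whose empirical minimization is being analysed is the average squared distance to the nearest of the $k$ centers; the short sketch below computes it (the names and the brute-force distance computation are mine, and the minimization over centers itself is not shown).

import numpy as np

def kmeans_empirical_risk(X, centers):
    # mean squared distance from each sample (row of X) to its nearest center
    d2 = ((X[:, None, :] - centers[None, :, :]) ** 2).sum(axis=2)   # shape (n_samples, k)
    return float(d2.min(axis=1).mean())

# usage: data in a unit ball of arbitrary dimension and k candidate centers
rng = np.random.default_rng(0)
X = rng.normal(size=(500, 20))
X /= np.maximum(1.0, np.linalg.norm(X, axis=1))[:, None]            # project into the unit ball
centers = X[rng.choice(len(X), size=5, replace=False)]
print(kmeans_empirical_risk(X, centers))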


Random Walk Approach to Regret Minimization

Neural Information Processing Systems

We propose a computationally efficient random walk on a convex body which rapidly mixes to a time-varying Gibbs distribution. In the setting of online convex optimization and repeated games, the algorithm yields low regret and presents a novel efficient method for implementing mixture forecasting strategies.
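
A toy version of the forecasting loop, under assumptions of my own (a unit-ball decision set, a plain Metropolis random walk, and a handful of steps per round); the efficiency and regret guarantees of the paper come from the specific walk and its mixing analysis, which this sketch does not reproduce.

import numpy as np

def gibbs_forecaster(losses, dim, eta=1.0, step=0.1, walk_steps=3, rng=None):
    # each round, play a point obtained by a few random-walk steps toward the Gibbs
    # density pi_t(x) ∝ exp(-eta * sum_{s<t} f_s(x)) on the unit ball, warm-started
    # from the previous round's point
    rng = np.random.default_rng() if rng is None else rng
    x, past, plays = np.zeros(dim), [], []
    log_pi = lambda z: (-eta * sum(f(z) for f in past)
                        if np.linalg.norm(z) <= 1.0 else -np.inf)
    for f_t in losses:
        for _ in range(walk_steps):
            prop = x + step * rng.normal(size=dim)
            if np.log(rng.random()) < log_pi(prop) - log_pi(x):
                x = prop
        plays.append(x.copy())
        past.append(f_t)                     # observe this round's convex loss
    return plays

# usage: linear losses f_t(x) = <g_t, x> with random g_t
rng = np.random.default_rng(0)
losses = [lambda z, g=rng.normal(size=4): float(g @ z) for _ in range(100)]
print(np.round(gibbs_forecaster(losses, dim=4, rng=rng)[-1], 3))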


On the Relation Between Low Density Separation, Spectral Clustering and Graph Cuts

Neural Information Processing Systems

One of the intuitions underlying many graph-based methods for clustering and semi-supervised learning is that class or cluster boundaries pass through areas of low probability density. In this paper we provide some formal analysis of that notion for a probability distribution. We introduce a notion of weighted boundary volume, which measures the length of the class/cluster boundary weighted by the density of the underlying probability distribution. We show that the sizes of the cuts of certain commonly used data adjacency graphs converge to this continuous weighted volume of the boundary.
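
The graph-cut quantity in question is easy to write down for a Gaussian-kernel adjacency graph; the sketch below computes the cut between two labelled groups of points (the kernel, the bandwidth, and the absence of the normalization under which the convergence statement is made precise are all simplifications of this sketch).

import numpy as np

def gaussian_graph_cut(X, labels, sigma=0.5):
    # sum of adjacency weights w_ij = exp(-||x_i - x_j||^2 / (2 sigma^2)) over
    # pairs (i, j) lying on opposite sides of the class/cluster boundary
    D2 = ((X[:, None, :] - X[None, :, :]) ** 2).sum(axis=2)
    W = np.exp(-D2 / (2.0 * sigma ** 2))
    across = labels[:, None] != labels[None, :]
    return float(W[across].sum() / 2.0)      # each boundary edge counted once

# usage: two well-separated Gaussian blobs in the plane
rng = np.random.default_rng(0)
X = np.vstack([rng.normal(0.0, 1.0, size=(100, 2)), rng.normal([6.0, 0.0], 1.0, size=(100, 2))])
labels = np.array([0] * 100 + [1] * 100)
print(gaussian_graph_cut(X, labels, sigma=0.5))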