AITopics | Dasgupta, Sanjoy

Dasgupta, Sanjoy

Incremental Clustering: The Case for Extra Clusters

Ackerman, Margareta, Dasgupta, Sanjoy

Neural Information Processing Systems, Dec-31-2014

The explosion in the amount of data available for analysis often necessitates a transition from batch to incremental clustering methods, which process one element at a time and typically store only a small subset of the data. In this paper, we initiate the formal analysis of incremental clustering methods focusing on the types of cluster structure that they are able to detect. We find that the incremental setting is strictly weaker than the batch model, proving that a fundamental class of cluster structures that can readily be detected in the batch setting is impossible to identify using any incremental method. Furthermore, we show how the limitations of incremental clustering can be overcome by allowing additional clusters.
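To make the incremental setting concrete, here is a minimal sketch of one well-known incremental method, sequential k-means, which sees one point at a time and stores only k centers and their counts. Choosing this particular method, the function name `sequential_kmeans`, and the toy stream are assumptions for illustration; the abstract does not say which methods the paper analyzes.

```python
def sequential_kmeans(stream, k):
    """Sequential k-means sketch: one pass, O(k) memory.

    The first k points seed the centers (an illustrative choice). Each later
    point moves its nearest center toward it by 1 / (points assigned so far).
    """
    centers = []  # current center coordinates
    counts = []   # number of points assigned to each center
    for x in stream:
        if len(centers) < k:
            centers.append([float(v) for v in x])
            counts.append(1)
            continue
        # Index of the nearest center under squared Euclidean distance.
        j = min(
            range(k),
            key=lambda i: sum((a - b) ** 2 for a, b in zip(centers[i], x)),
        )
        counts[j] += 1
        # Running-mean update: the center stays the mean of its assigned points.
        centers[j] = [c + (v - c) / counts[j] for c, v in zip(centers[j], x)]
    return centers


# Toy one-dimensional stream with two well-separated groups.
stream = [(0.0,), (10.0,), (1.0,), (11.0,), (-1.0,), (9.0,)]
centers = sequential_kmeans(stream, k=2)
```

The method never revisits earlier points, which is the restriction the abstract contrasts with the batch model.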

algorithm, artificial intelligence, machine learning

Neural Information Processing Systems

Country: North America > United States > California > San Diego County (0.14)

Technology: Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning > Clustering (1.00)

Rates of Convergence for Nearest Neighbor Classification

Chaudhuri, Kamalika, Dasgupta, Sanjoy

arXiv.org Machine Learning, Jul-1-2014

Nearest neighbor methods are a popular class of nonparametric estimators with several desirable properties, such as adaptivity to different distance scales in different regions of space. Prior work on convergence rates for nearest neighbor classification has not fully reflected these subtle properties. We analyze the behavior of these estimators in metric spaces and provide finite-sample, distribution-dependent rates of convergence under minimal assumptions. As a by-product, we are able to establish the universal consistency of nearest neighbor in a broader range of data spaces than was previously known. We illustrate our upper and lower bounds by introducing smoothness classes that are customized for nearest neighbor classification.
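As a reminder of the estimator under discussion, the following is a minimal sketch of k-nearest-neighbor classification in a general metric space, where the caller supplies the distance function. The names `knn_classify` and `metric`, the majority-vote rule, and the toy data are assumptions for illustration, not details from the paper.

```python
from collections import Counter


def knn_classify(query, points, labels, k, metric):
    """Predict a label for `query` by majority vote over its k nearest points.

    `metric` is any function returning a distance between two items, so the
    points need not be vectors.
    """
    # Sort training indices by distance to the query and keep the k closest.
    nearest = sorted(range(len(points)), key=lambda i: metric(query, points[i]))[:k]
    votes = Counter(labels[i] for i in nearest)
    return votes.most_common(1)[0][0]


# Toy example on the real line with absolute difference as the metric.
points = [0.0, 1.0, 2.0, 10.0, 11.0, 12.0]
labels = ["a", "a", "a", "b", "b", "b"]


def abs_diff(u, v):
    return abs(u - v)


pred_low = knn_classify(1.5, points, labels, k=3, metric=abs_diff)
pred_high = knn_classify(10.4, points, labels, k=3, metric=abs_diff)
```

Passing the metric as a parameter mirrors the abstract's metric-space setting: nothing in the procedure depends on coordinates.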

artificial intelligence, machine learning, probability

arXiv.org Machine Learning

1407.0067

Country: North America > United States > California (0.14)

Genre: Research Report (0.40)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning > Case-Based Reasoning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning > Nearest Neighbor Methods (0.47)

Consistent procedures for cluster tree estimation and pruning

Chaudhuri, Kamalika, Dasgupta, Sanjoy, Kpotufe, Samory, von Luxburg, Ulrike

arXiv.org Machine Learning, Jun-5-2014

For a density $f$ on ${\mathbb R}^d$, a high-density cluster is any connected component of $\{x: f(x) \geq \lambda\}$, for some $\lambda > 0$. The set of all high-density clusters forms a hierarchy called the cluster tree of $f$. We present two procedures for estimating the cluster tree given samples from $f$. The first is a robust variant of the single linkage algorithm for hierarchical clustering. The second is based on the $k$-nearest neighbor graph of the samples. We give finite-sample convergence rates for these algorithms which also imply consistency, and we derive lower bounds on the sample complexity of cluster tree estimation. Finally, we study a tree pruning procedure that guarantees, under milder conditions than usual, to remove clusters that are spurious while recovering those that are salient.

artificial intelligence, cluster tree, machine learning

arXiv.org Machine Learning

1406.1546

Genre: Research Report (0.50)

Technology: Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning > Clustering (0.67)

Moment-based Uniform Deviation Bounds for $k$-means and Friends

Telgarsky, Matus J., Dasgupta, Sanjoy

Neural Information Processing Systems, Dec-31-2013

Suppose $k$ centers are fit to $m$ points by heuristically minimizing the $k$-means cost; what is the corresponding fit over the source distribution? This question is resolved here for distributions with $p\geq 4$ bounded moments; in particular, the difference between the sample cost and distribution cost decays with $m$ and $p$ as $m^{\min\{-1/4, -1/2+2/p\}}$. The essential technical contribution is a mechanism to uniformly control deviations in the face of unbounded parameter sets, cost functions, and source distributions. To further demonstrate this mechanism, a soft clustering variant of $k$-means cost is also considered, namely the log likelihood of a Gaussian mixture, subject to the constraint that all covariance matrices have bounded spectrum. Lastly, a rate with refined constants is provided for $k$-means instances possessing some cluster structure.
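The stated rate $m^{\min\{-1/4, -1/2+2/p\}}$ is easy to misread, so here is a small worked computation of the exponent for a few values of $p$. The function name `rate_exponent` and the sample values of $p$ are assumptions for illustration; only the formula comes from the abstract.

```python
from fractions import Fraction


def rate_exponent(p):
    """Exponent e in the stated rate m**e, where e = min(-1/4, -1/2 + 2/p)."""
    if p < 4:
        raise ValueError("the stated rate assumes p >= 4 bounded moments")
    # Exact rational arithmetic avoids floating-point noise in the comparison.
    return min(Fraction(-1, 4), Fraction(-1, 2) + Fraction(2) / Fraction(p))


# Exponents for a few moment counts (illustrative values of p).
exponents = {p: rate_exponent(p) for p in (4, 8, 16, 64)}
```

The two terms inside the minimum coincide at $p = 8$, so the exponent stays at $-1/4$ for $4 \leq p \leq 8$ and approaches $-1/2$ as $p$ grows.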

artificial intelligence, deviation, survey article

Neural Information Processing Systems

Country: Asia (0.14)

Technology: Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (1.00)

The Fast Convergence of Incremental PCA

Balsubramani, Akshay, Dasgupta, Sanjoy, Freund, Yoav

Neural Information Processing Systems, Dec-31-2013

We prove the first finite-sample convergence rates for any incremental PCA algorithm using sub-quadratic time and memory per iteration. The algorithm analyzed is Oja's learning rule, an efficient and well-known scheme for estimating the top principal component. Our analysis of this non-convex problem yields expected and high-probability convergence rates of $\tilde{O}(1/n)$ through a novel technique. We relate our guarantees to existing rates for stochastic gradient descent on strongly convex functions, and extend those results. We also include experiments which demonstrate convergence behaviors predicted by our analysis.
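For readers unfamiliar with Oja's learning rule, the following is a minimal sketch of the standard textbook update in plain Python: each sample nudges the current direction estimate, which is then renormalized. The step-size schedule `c / t`, the function name `oja_top_component`, and the synthetic data are assumptions for illustration and are not taken from the paper.

```python
import math
import random


def oja_top_component(samples, c=1.0, seed=0):
    """Estimate the top principal direction of zero-mean samples with Oja's rule.

    Update per sample x: v <- v + eta_t * (x . v) * x, followed by renormalizing.
    The step size eta_t = c / t is an illustrative assumption.
    """
    rng = random.Random(seed)
    dim = len(samples[0])
    # Random unit-norm starting direction.
    v = [rng.gauss(0.0, 1.0) for _ in range(dim)]
    norm = math.sqrt(sum(a * a for a in v))
    v = [a / norm for a in v]
    for t, x in enumerate(samples, start=1):
        eta = c / t
        proj = sum(a * b for a, b in zip(x, v))          # x . v
        v = [a + eta * proj * b for a, b in zip(v, x)]   # move toward x
        norm = math.sqrt(sum(a * a for a in v))
        v = [a / norm for a in v]                        # back to unit length
    return v


# Synthetic zero-mean data whose largest variance lies along the first axis.
data_rng = random.Random(1)
samples = [
    [data_rng.gauss(0.0, 3.0), data_rng.gauss(0.0, 0.5), data_rng.gauss(0.0, 0.5)]
    for _ in range(5000)
]
v = oja_top_component(samples)
alignment = abs(v[0])  # close to 1 when v points along the first axis
```

Each iteration touches only the current sample and the current estimate, so time and memory per step are linear in the dimension.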

algorithm, artificial intelligence, machine learning

Neural Information Processing Systems

Genre: Research Report (0.34)

Technology: Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning > Gradient Descent (0.55)

Moment-based Uniform Deviation Bounds for $k$-means and Friends

Telgarsky, Matus, Dasgupta, Sanjoy

arXiv.org Machine Learning, Nov-8-2013

artificial intelligence, probability, survey article

arXiv.org Machine Learning

1311.1903

Country: Asia (0.14)

Genre: Research Report (0.40)

Technology: Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (0.69)

Learning Polytrees

Dasgupta, Sanjoy

arXiv.org Artificial Intelligence, Jan-23-2013

We consider the task of learning the maximum-likelihood polytree from data. Our first result is a performance guarantee establishing that the optimal branching (or Chow-Liu tree), which can be computed very easily, constitutes a good approximation to the best polytree. We then show that it is not possible to do very much better, since the learning problem is NP-hard even to approximately solve within some constant factor.

artificial intelligence, machine learning, node

arXiv.org Artificial Intelligence

1301.6688

Country: North America > United States > California (0.28)

Technology: Information Technology > Artificial Intelligence > Machine Learning > Learning Graphical Models > Directed Networks > Bayesian Learning (1.00)

A Two-round Variant of EM for Gaussian Mixtures

Dasgupta, Sanjoy, Schulman, Leonard

arXiv.org Machine Learning, Jan-16-2013

Given a set of possible models (e.g., Bayesian network structures) and a data sample, in the unsupervised model selection problem the task is to choose the most accurate model with respect to the domain joint probability distribution. In contrast to this, in supervised model selection it is a priori known that the chosen model will be used in the future for prediction tasks involving more "focused" predictive distributions. Although focused predictive distributions can be produced from the joint probability distribution by marginalization, in practice the best model in the unsupervised sense does not necessarily perform well in supervised domains. In particular, the standard marginal likelihood score is a criterion for the unsupervised task, and, although frequently used for supervised model selection also, does not perform well in such tasks. In this paper we study the performance of the marginal likelihood score empirically in supervised Bayesian network selection tasks by using a large number of publicly available classification data sets, and compare the results to those obtained by alternative model selection criteria, including empirical cross-validation methods, an approximation of a supervised marginal likelihood measure, and a supervised version of Dawid's prequential (predictive sequential) principle. The results demonstrate that the marginal likelihood score does not perform well for supervised model selection, while the best results are obtained by using Dawid's prequential approach.

artificial intelligence, bayesian inference, gaussian

arXiv.org Machine Learning

1301.3850

Country: North America > United States > California (0.14)

Genre: Research Report > New Finding (0.66)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning > Uncertainty > Bayesian Inference (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Learning Graphical Models > Directed Networks > Bayesian Learning (1.00)

Experiments with Random Projection

Dasgupta, Sanjoy

arXiv.org Machine Learning, Jan-16-2013

Recent theoretical work has identified random projection as a promising dimensionality reduction technique for learning mixtures of Gaussians. Here we summarize these results and illustrate them by a wide variety of experiments on synthetic and real data.
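As a rough illustration of random projection as a dimensionality reduction step, the sketch below maps points from D dimensions to d dimensions with a random Gaussian matrix and compares one pairwise distance before and after. The Gaussian-matrix construction is a common variant chosen here for simplicity and may differ from the projection used in the paper; the function names, dimensions, and data are likewise assumptions.

```python
import math
import random


def random_projection_matrix(d, D, seed=0):
    """Return a d x D matrix with i.i.d. N(0, 1/d) entries (assumed variant)."""
    rng = random.Random(seed)
    scale = 1.0 / math.sqrt(d)
    return [[rng.gauss(0.0, 1.0) * scale for _ in range(D)] for _ in range(d)]


def project(matrix, x):
    """Apply the linear map: one dot product per output coordinate."""
    return [sum(r * xi for r, xi in zip(row, x)) for row in matrix]


def distance(a, b):
    """Euclidean distance between two equal-length vectors."""
    return math.sqrt(sum((ai - bi) ** 2 for ai, bi in zip(a, b)))


# Two synthetic points in D = 200 dimensions, projected down to d = 50.
D, d = 200, 50
data_rng = random.Random(1)
x = [data_rng.gauss(0.0, 1.0) for _ in range(D)]
y = [data_rng.gauss(0.0, 1.0) for _ in range(D)]
R = random_projection_matrix(d, D)
# Ratio of projected to original distance; concentrates near 1 as d grows.
ratio = distance(project(R, x), project(R, y)) / distance(x, y)
```

The `1 / sqrt(d)` scaling makes squared distances correct in expectation, which is why the ratio stays near 1.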

artificial intelligence, gaussian, machine learning

arXiv.org Machine Learning

1301.3849

Country: North America > United States > California (0.14)

Genre: Research Report (0.82)

Technology:

Information Technology > Data Science (1.00)
Information Technology > Artificial Intelligence > Representation & Reasoning (0.93)
Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (0.67)

A concentration theorem for projections

Dasgupta, Sanjoy, Hsu, Daniel, Verma, Nakul

arXiv.org Machine Learning, Jun-27-2012

Suppose $X \in \mathbb{R}^D$ has mean zero and finite second moments. We show that there is a precise sense in which almost all linear projections of $X$ into $\mathbb{R}^d$ (for $d < D$) look like a scale-mixture of spherical Gaussians: specifically, a mixture of distributions $N(0, \sigma^2 I_d)$ where the weight of the particular $\sigma$ component is $P(\|X\|^2 = \sigma^2 D)$. The extent of this effect depends upon the ratio of $d$ to $D$, and upon a particular coefficient of eccentricity of $X$'s distribution. We explore this result in a variety of experiments.

artificial intelligence, machine learning, projection

arXiv.org Machine Learning

1206.6813

Country: North America (0.14)

Genre: Research Report (0.50)

Technology: Information Technology > Artificial Intelligence > Machine Learning (1.00)
