
 Ravikumar, Pradeep


Sample Complexity of Nonparametric Semi-Supervised Learning

arXiv.org Machine Learning

We study the sample complexity of semi-supervised learning (SSL) and introduce new assumptions based on the mismatch between a mixture model learned from unlabeled data and the true mixture model induced by the (unknown) class conditional distributions. Under these assumptions, we establish an $\Omega(K\log K)$ labeled sample complexity bound without imposing parametric assumptions, where $K$ is the number of classes. Our results suggest that even in nonparametric settings it is possible to learn a near-optimal classifier using only a few labeled samples. Unlike previous theoretical work which focuses on binary classification, we consider general multiclass classification ($K>2$), which requires solving a difficult permutation learning problem. This permutation defines a classifier whose classification error is controlled by the Wasserstein distance between mixing measures, and we provide finite-sample results characterizing the behaviour of the excess risk of this classifier. Finally, we describe three algorithms for computing these estimators based on a connection to bipartite graph matching, and perform experiments to illustrate the superiority of the MLE over the majority vote estimator.
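The permutation-learning step described above can be illustrated with a small sketch: given cluster assignments from a mixture model fit on unlabeled data and a handful of labeled samples, a majority-vote rule and an assignment-based (Hungarian) matching each produce a cluster-to-class map. This is only an illustrative sketch under assumed inputs, not the paper's estimators; the count matrix and the scipy-based matching are assumptions.

```python
import numpy as np
from scipy.optimize import linear_sum_assignment

def match_clusters_to_classes(cluster_ids, class_labels, K):
    """Estimate a cluster -> class map from a few labeled samples.

    cluster_ids: cluster assignments (from a mixture fit on unlabeled data)
                 for the labeled subset, integers in [0, K).
    class_labels: true class labels for the same subset, integers in [0, K).
    """
    # Count co-occurrences of clusters and classes.
    counts = np.zeros((K, K))
    for c, y in zip(cluster_ids, class_labels):
        counts[c, y] += 1
    # Majority-vote estimator: each cluster takes its most frequent class
    # (two clusters may end up mapped to the same class).
    majority = counts.argmax(axis=1)
    # Matching estimator: a bipartite assignment maximizing total agreement,
    # which forces a proper permutation of the K classes.
    rows, cols = linear_sum_assignment(-counts)
    perm = np.empty(K, dtype=int)
    perm[rows] = cols
    return majority, perm
```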


On Adversarial Risk and Training

arXiv.org Machine Learning

In this work we formally define the notions of adversarial perturbations, adversarial risk and adversarial training and analyze their properties. Our analysis provides several interesting insights into adversarial risk, adversarial training, and their relation to the classification risk and "traditional" training. We also show that adversarial training can result in models with better classification accuracy and in more explainable models than traditional training. Although adversarial training is computationally expensive, our results and insights suggest that one should prefer adversarial training over traditional risk minimization for learning complex models from data.
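As a concrete, heavily simplified illustration of adversarial training, the sketch below minimizes an empirical adversarial risk for a linear logistic model, where the inner maximization over $\ell_\infty$-bounded perturbations is carried out in closed form (a signed-gradient step, exact for linear models). The model, loss, and step sizes are assumptions for illustration, not the constructions analyzed in the paper.

```python
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def adversarial_train(X, y, eps=0.1, lr=0.05, epochs=200):
    """Toy adversarial training for a linear logistic model with labels in {-1, +1}.

    Inner max over ||delta||_inf <= eps: for a linear score w.x the worst-case
    perturbation is delta = -eps * y * sign(w), which is applied exactly.
    Outer min: a gradient step on the resulting adversarial logistic loss.
    """
    n, d = X.shape
    w = np.zeros(d)
    for _ in range(epochs):
        X_adv = X - eps * y[:, None] * np.sign(w)[None, :]
        margins = y * (X_adv @ w)
        grad = -(X_adv * (y * sigmoid(-margins))[:, None]).mean(axis=0)
        w -= lr * grad
    return w
```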


Binary Classification with Karmic, Threshold-Quasi-Concave Metrics

arXiv.org Machine Learning

Complex performance measures, beyond the popular measure of accuracy, are increasingly being used in the context of binary classification. These complex performance measures are typically not even decomposable, that is, the loss evaluated on a batch of samples cannot typically be expressed as a sum or average of losses evaluated at individual samples, which in turn requires new theoretical and methodological developments beyond standard treatments of supervised learning. In this paper, we advance the understanding of binary classification for complex performance measures by identifying two key properties: a so-called Karmic property, and a more technical threshold-quasi-concavity property, which we show is milder than existing structural assumptions imposed on performance measures. Under these properties, we show that the Bayes optimal classifier is a threshold function of the conditional probability of the positive class. We then leverage this result to develop a computationally practical plug-in classifier, via a novel threshold estimator, and further provide a novel statistical analysis of the classification error with respect to complex performance measures.
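The plug-in strategy implied by the threshold characterization can be sketched as follows: estimate the conditional probability of the positive class, then choose the threshold that maximizes the target metric on held-out data. The logistic-regression probability model, the grid search, and the F1 metric below are illustrative assumptions; the paper proposes a more refined threshold estimator and analysis.

```python
import numpy as np
from sklearn.linear_model import LogisticRegression
from sklearn.metrics import f1_score

def plugin_classifier(X_train, y_train, X_val, y_val):
    """Plug-in classifier: estimate P(Y=1|x), then tune a threshold for a
    complex metric (here F1) on a validation split."""
    model = LogisticRegression().fit(X_train, y_train)
    p_val = model.predict_proba(X_val)[:, 1]
    # Grid search over thresholds; a quasi-concave utility in the threshold
    # would also admit faster, bisection-style search.
    thresholds = np.linspace(0.01, 0.99, 99)
    scores = [f1_score(y_val, (p_val >= t).astype(int)) for t in thresholds]
    t_star = thresholds[int(np.argmax(scores))]
    return lambda X: (model.predict_proba(X)[:, 1] >= t_star).astype(int)
```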


Robust Nonparametric Regression under Huber's $\epsilon$-contamination Model

arXiv.org Machine Learning

We consider the non-parametric regression problem under Huber's $\epsilon$-contamination model, in which an $\epsilon$ fraction of observations are subject to arbitrary adversarial noise. We first show that a simple local binning median step can effectively remove the adversarial noise, and that this median estimator is minimax optimal up to absolute constants over the H\"{o}lder function class with smoothness parameters smaller than or equal to 1. Furthermore, when the underlying function has higher smoothness, we show that local binning median can be used as a preprocessing step to remove the adversarial noise, after which any non-parametric estimator can be applied on top of the medians. In particular, we show that local median binning followed by kernel smoothing or local polynomial regression achieves minimax optimality over H\"{o}lder and Sobolev classes with arbitrary smoothness parameters. Our main proof technique is a decoupled analysis of the adversarial noise and the stochastic noise, which can potentially be applied to other robust estimation problems. We also provide numerical results to verify the effectiveness of our proposed methods.
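A minimal sketch of the two-stage procedure, under the simplifying assumptions of a one-dimensional design on [0, 1], equal-width bins, and a Gaussian kernel for the second-stage smoother (bin counts and bandwidth below are illustrative choices):

```python
import numpy as np

def binned_median(x, y, n_bins):
    """Stage 1: local binning median. Returns bin centers and per-bin medians,
    which suppresses a small fraction of arbitrarily corrupted responses."""
    edges = np.linspace(0.0, 1.0, n_bins + 1)
    centers = 0.5 * (edges[:-1] + edges[1:])
    medians = np.full(n_bins, np.nan)
    for b in range(n_bins):
        mask = (x >= edges[b]) & (x < edges[b + 1])
        if mask.any():
            medians[b] = np.median(y[mask])
    keep = ~np.isnan(medians)
    return centers[keep], medians[keep]

def kernel_smooth(centers, medians, x_eval, bandwidth):
    """Stage 2: Nadaraya-Watson kernel smoothing applied to the bin medians."""
    diffs = (x_eval[:, None] - centers[None, :]) / bandwidth
    weights = np.exp(-0.5 * diffs ** 2)
    return (weights * medians[None, :]).sum(axis=1) / weights.sum(axis=1)
```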


DAGs with NO TEARS: Smooth Optimization for Structure Learning

arXiv.org Machine Learning

Estimating the structure of directed acyclic graphs (DAGs, also known as Bayesian networks) is a challenging problem since the search space of DAGs is combinatorial and scales superexponentially with the number of nodes. Existing approaches rely on various local heuristics for enforcing the acyclicity constraint and are not well-suited to solution by general-purpose optimization packages. In this paper, we introduce a fundamentally different strategy: we formulate the structure learning problem as a smooth, constrained optimization problem over real matrices that avoids this combinatorial constraint entirely. This is achieved by a novel characterization of acyclicity that is not only smooth but also exact. The resulting nonconvex, constrained program involves smooth functions whose gradients are easy to compute and involve only elementary matrix operations. By using existing black-box optimization routines, our method performs a global search for an optimal DAG, can be implemented in about 50 lines of Python, and outperforms existing methods without imposing any structural constraints.
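The smooth acyclicity characterization can be illustrated numerically. The sketch below uses the trace-of-matrix-exponential form $h(W) = \mathrm{tr}(e^{W \circ W}) - d$ associated with the published NO TEARS method, together with its gradient; the scipy-based evaluation and the toy graphs are assumptions for illustration, and the sketch omits the augmented-Lagrangian solver that actually performs the structure learning.

```python
import numpy as np
from scipy.linalg import expm

def notears_h(W):
    """Smooth acyclicity function h(W) = tr(exp(W * W)) - d, where * is the
    elementwise product; h(W) = 0 exactly when W is the weighted adjacency
    matrix of a DAG."""
    d = W.shape[0]
    return np.trace(expm(W * W)) - d

def notears_h_grad(W):
    """Gradient of h, usable inside a constrained/augmented-Lagrangian solver:
    grad h(W) = exp(W * W)^T * 2W (elementwise in the final product)."""
    return expm(W * W).T * 2 * W

# Tiny example: a 3-node chain is acyclic; adding a back edge creates a cycle.
W_dag = np.array([[0., 1., 0.], [0., 0., 1.], [0., 0., 0.]])
W_cyc = W_dag + np.array([[0., 0., 0.], [0., 0., 0.], [1., 0., 0.]])
print(notears_h(W_dag))   # ~0.0
print(notears_h(W_cyc))   # > 0
```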


Robust Estimation via Robust Gradient Estimation

arXiv.org Machine Learning

We provide a new computationally-efficient class of estimators for risk minimization. We show that these estimators are robust under general statistical models, both in the classical Huber $\epsilon$-contamination model and in heavy-tailed settings. Our workhorse is a novel robust variant of gradient descent, and we provide conditions under which our gradient descent variant provides accurate estimators in a general convex risk minimization problem. We provide specific consequences of our theory for linear regression, logistic regression and for estimation of the canonical parameters in an exponential family. These results provide some of the first computationally tractable and provably robust estimators for these canonical statistical models. Finally, we study the empirical performance of our proposed methods on synthetic and real datasets, and find that our methods convincingly outperform a variety of baselines.
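A simplified sketch of the robust gradient descent idea: at each iteration the per-sample gradients are aggregated with a robust estimator instead of the sample mean. The coordinate-wise median used below is only a stand-in for the paper's robust gradient estimator, and the least-squares model and step sizes are assumptions for illustration.

```python
import numpy as np

def robust_gd_linear_regression(X, y, lr=0.1, iters=200):
    """Gradient descent for least squares in which the mean gradient is
    replaced by a coordinate-wise median of per-sample gradients
    (a simplified robust aggregator)."""
    n, d = X.shape
    theta = np.zeros(d)
    for _ in range(iters):
        residuals = X @ theta - y                    # shape (n,)
        per_sample_grads = residuals[:, None] * X    # shape (n, d)
        robust_grad = np.median(per_sample_grads, axis=0)
        theta -= lr * robust_grad
    return theta
```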


D2KE: From Distance to Kernel and Embedding

arXiv.org Machine Learning

For many machine learning problem settings, particularly those with structured inputs such as sequences or sets of objects, a distance measure between inputs can be specified more naturally than a feature representation. However, most standard machine learning models are designed for inputs with a vector feature representation. In this work, we consider the estimation of a function $f:\mathcal{X} \rightarrow \mathbb{R}$ based solely on a dissimilarity measure $d:\mathcal{X}\times\mathcal{X} \rightarrow \mathbb{R}$ between inputs. In particular, we propose a general framework to derive a family of \emph{positive definite kernels} from a given dissimilarity measure, which subsumes the widely-used \emph{representative-set method} as a special case and relates to the well-known \emph{distance substitution kernel} in a limiting case. We show that functions in the corresponding Reproducing Kernel Hilbert Space (RKHS) are Lipschitz-continuous w.r.t. the given distance metric. We provide a tractable algorithm to estimate a function from this RKHS, and show that it enjoys better generalizability than Nearest-Neighbor estimates. Our approach draws from the literature on Random Features, but instead of deriving feature maps from an existing kernel, we construct novel kernels from a random feature map that we specify given the distance measure. We conduct classification experiments with such disparate domains as strings, time series, and sets of vectors, where our proposed framework compares favorably to existing distance-based learning methods such as $k$-nearest-neighbors, distance-substitution kernels, pseudo-Euclidean embedding, and the representative-set method.
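The random-feature construction can be sketched generically: sample random reference objects $\omega$ from some distribution over the input space and map each input $x$ to features of the form $\exp(-\gamma\, d(x, \omega))$. The exact feature map and the distribution over random objects in the paper differ from this simplification; the Hausdorff distance over random point sets below is purely an assumed toy example.

```python
import numpy as np

def d2ke_features(objects, references, dist, gamma=1.0):
    """Map structured inputs to a finite-dimensional feature vector via a
    dissimilarity measure: phi_j(x) = exp(-gamma * d(x, omega_j)) for randomly
    sampled reference objects omega_j. The implied kernel is the inner product
    of these feature maps (a simplified D2KE-style sketch)."""
    R = len(references)
    feats = np.empty((len(objects), R))
    for i, x in enumerate(objects):
        for j, omega in enumerate(references):
            feats[i, j] = np.exp(-gamma * dist(x, omega))
    return feats / np.sqrt(R)

# Toy usage: sets of 2-D points, with the Hausdorff distance as dissimilarity.
def hausdorff(A, B):
    d_ab = max(min(np.linalg.norm(a - b) for b in B) for a in A)
    d_ba = max(min(np.linalg.norm(a - b) for a in A) for b in B)
    return max(d_ab, d_ba)

rng = np.random.default_rng(0)
data = [rng.normal(size=(5, 2)) for _ in range(10)]
refs = [rng.normal(size=(3, 2)) for _ in range(20)]
Phi = d2ke_features(data, refs, hausdorff)   # feed Phi to any linear model
```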


Identifiability of Nonparametric Mixture Models and Bayes Optimal Clustering

arXiv.org Machine Learning

Motivated by problems in data clustering, we establish general conditions under which families of nonparametric mixture models are identifiable by introducing a novel framework for clustering overfitted \emph{parametric} (i.e. misspecified) mixture models. These conditions generalize existing conditions in the literature, and are flexible enough to include, for example, mixtures of Gaussian mixtures. In contrast to the recent literature on estimating nonparametric mixtures, we allow for general nonparametric mixture components, and instead impose regularity assumptions on the underlying mixing measure. As our primary application, we apply these results to partition-based clustering, generalizing the well-known notion of a Bayes optimal partition from classical model-based clustering to nonparametric settings. Furthermore, this framework is constructive in that it yields a practical algorithm for learning identified mixtures, which is illustrated through several examples. The key conceptual device in the analysis is the convex, metric geometry of probability distributions on metric spaces and its connection to optimal transport and the Wasserstein convergence of mixing measures. The result is a flexible framework for nonparametric clustering with formal consistency guarantees.
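The constructive side of the framework, clustering via an overfitted parametric mixture, can be sketched in a simplified form: fit a Gaussian mixture with more components than clusters, merge the components into K groups, and assign each point to the group of its most likely component. Merging by agglomerative clustering of the component means is only a stand-in for the paper's merging criterion, and the sklearn-based pipeline is an assumption for illustration.

```python
import numpy as np
from sklearn.mixture import GaussianMixture
from sklearn.cluster import AgglomerativeClustering

def overfitted_mixture_clustering(X, n_clusters, n_components=20):
    """Cluster by (1) fitting an overfitted Gaussian mixture and then
    (2) merging its components into n_clusters groups; points inherit the
    group of their most likely component."""
    gmm = GaussianMixture(n_components=n_components, random_state=0).fit(X)
    component_of_point = gmm.predict(X)
    merge = AgglomerativeClustering(n_clusters=n_clusters)
    group_of_component = merge.fit_predict(gmm.means_)
    return group_of_component[component_of_point]
```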


Online Classification with Complex Metrics

arXiv.org Machine Learning

We present a framework and analysis of consistent binary classification for complex and non-decomposable performance metrics such as the F-measure and the Jaccard measure. The proposed framework is general, as it applies to both batch and online learning, and to both linear and non-linear models. Our work follows recent results showing that the Bayes optimal classifier for many complex metrics is given by a thresholding of the conditional probability of the positive class. This manuscript extends this thresholding characterization, showing that the utility is strictly locally quasi-concave with respect to the threshold for a wide range of models and performance metrics. This, in turn, motivates simple normalized gradient ascent updates for threshold estimation. We present a finite-sample regret analysis for the resulting procedure. In particular, the risk in the batch case converges to the Bayes risk at the same rate as that of the underlying conditional probability estimation, and the risk of the proposed online algorithm converges at a rate that depends on the conditional probability estimation risk. For instance, in the special case where the conditional probability model is logistic regression, our procedure achieves $O(\frac{1}{\sqrt{n}})$ sample complexity, both for batch and online training. Empirical evaluation shows that the proposed algorithms outperform alternatives in practice, with comparable or better prediction performance and reduced run time for various metrics and datasets.
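The online threshold update can be sketched for the F-measure as follows: maintain a sliding window of (score, label) pairs and take normalized (sign-based) gradient-ascent steps on the threshold using a finite-difference estimate of the metric's slope. The window size, step size, and finite-difference scheme are assumptions; the update analyzed in the paper is more general.

```python
import numpy as np

def f1_at_threshold(p, y, t):
    """Empirical F-measure of thresholding scores p at t, with labels y in {0,1}."""
    pred = (p >= t).astype(int)
    tp = np.sum((pred == 1) & (y == 1))
    fp = np.sum((pred == 1) & (y == 0))
    fn = np.sum((pred == 0) & (y == 1))
    return 2 * tp / (2 * tp + fp + fn) if (2 * tp + fp + fn) > 0 else 0.0

def online_threshold(p_stream, y_stream, step=0.02, delta=0.05, window=200):
    """Normalized (sign-based) gradient ascent on the threshold, using a
    finite-difference slope estimate over a sliding window of recent data."""
    t = 0.5
    buf_p, buf_y = [], []
    for p, y in zip(p_stream, y_stream):
        buf_p.append(p)
        buf_y.append(y)
        if len(buf_p) > window:
            buf_p.pop(0)
            buf_y.pop(0)
        bp, by = np.asarray(buf_p), np.asarray(buf_y)
        slope = f1_at_threshold(bp, by, t + delta) - f1_at_threshold(bp, by, t - delta)
        t = float(np.clip(t + step * np.sign(slope), 0.01, 0.99))
    return t
```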


A Voting-Based System for Ethical Decision Making

AAAI Conferences

We present a general approach to automating ethical decisions, drawing on machine learning and computational social choice. In a nutshell, we propose to learn a model of societal preferences, and, when faced with a specific ethical dilemma at runtime, efficiently aggregate those preferences to identify a desirable choice. We provide a concrete algorithm that instantiates our approach; some of its crucial steps are informed by a new theory of swap-dominance efficient voting rules. Finally, we implement and evaluate a system for ethical decision making in the autonomous vehicle domain, using preference data collected from 1.3 million people through the Moral Machine website.
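A minimal sketch of the runtime aggregation step, under the simplifying assumption that each individual's learned preferences over the available alternatives are summarized as a ranking and aggregated with a positional (Borda) rule. The paper's aggregation is based on swap-dominance efficient voting rules; Borda here is only an illustrative stand-in, and the example alternatives are hypothetical.

```python
from collections import defaultdict

def borda_aggregate(rankings):
    """Aggregate individual rankings (each ordered from most to least preferred)
    with the Borda rule and return the winning alternative. A stand-in for the
    swap-dominance efficient rules used in the paper."""
    scores = defaultdict(int)
    for ranking in rankings:
        m = len(ranking)
        for position, alternative in enumerate(ranking):
            scores[alternative] += m - 1 - position
    return max(scores, key=scores.get)

# Example: three voters' (learned) rankings over outcomes A, B, C of a dilemma.
print(borda_aggregate([["A", "B", "C"], ["B", "A", "C"], ["A", "C", "B"]]))  # "A"
```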