Ravikumar, Pradeep
Evaluations and Methods for Explanation through Robustness Analysis
Hsieh, Cheng-Yu, Yeh, Chih-Kuan, Liu, Xuanqing, Ravikumar, Pradeep, Kim, Seungyeon, Kumar, Sanjiv, Hsieh, Cho-Jui
Among the many ways of interpreting a machine learning model, measuring the importance of a set of features tied to a prediction is probably one of the most intuitive ways to explain a model. In this paper, we establish a link between a set of features and a prediction with a new evaluation criterion, robustness analysis, which measures the minimum distortion distance of an adversarial perturbation. By measuring the tolerance level for an adversarial attack, we can extract the set of features that provides the most robust support for a prediction, and, by mounting a targeted adversarial attack, we can also extract a set of features that contrasts the current prediction with a target class. Applying this methodology to various prediction tasks across multiple domains, we observe that the derived explanations indeed capture the significant feature sets, both qualitatively and quantitatively.
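To make the criterion concrete, here is a minimal numpy sketch of the general recipe, not the paper's actual algorithm: estimate the smallest perturbation radius, restricted to a candidate feature set, that flips the prediction, then greedily grow the set of features whose perturbation flips it most easily. The `model.predict` interface and the random-direction attack are illustrative assumptions.

```python
import numpy as np

def min_distortion(model, x, feature_set,
                   eps_grid=np.linspace(0.05, 1.0, 20), n_trials=100):
    """Estimate the smallest perturbation norm, restricted to feature_set,
    that flips the prediction, via random-direction sampling.
    `model.predict` (x -> class label) is an assumed interface."""
    y0 = model.predict(x)
    mask = np.zeros_like(x)
    mask[list(feature_set)] = 1.0
    for eps in eps_grid:                      # smallest flipping radius wins
        for _ in range(n_trials):
            delta = np.random.randn(*x.shape) * mask
            delta *= eps / (np.linalg.norm(delta) + 1e-12)
            if model.predict(x + delta) != y0:
                return eps
    return np.inf                             # robust at all tested radii

def greedy_robust_support(model, x, k):
    """Greedily pick the k features whose perturbation flips the prediction
    at the smallest radius, i.e. the least adversarially tolerant set."""
    chosen = []
    for _ in range(k):
        rest = [f for f in range(x.size) if f not in chosen]
        chosen.append(min(rest, key=lambda f: min_distortion(model, x, chosen + [f])))
    return chosen
```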
Class-Weighted Classification: Trade-offs and Robust Approaches
Xu, Ziyu, Dan, Chen, Khim, Justin, Ravikumar, Pradeep
We address imbalanced classification, the problem in which a label may have low marginal probability relative to other labels, by weighting losses according to the correct class. First, we examine the convergence rates of the expected excess weighted risk of plug-in classifiers in which the weighting for the plug-in classifier and for the risk may differ. This leads to irreducible errors that do not converge to the weighted Bayes risk, which motivates our consideration of robust risks. We define a robust risk that minimizes risk over a set of weightings, and show excess risk bounds for this problem. We then show that particular choices of the weighting set lead to a special instance of conditional value at risk (CVaR) from stochastic programming, which we call label conditional value at risk (LCVaR). We further generalize this weighting to derive a new robust risk problem that we call label heterogeneous conditional value at risk (LHCVaR). Finally, we empirically demonstrate the efficacy of LCVaR and LHCVaR in improving class-conditional risks.
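As a rough illustration of the CVaR-over-labels idea, via the standard CVaR variational formula rather than the paper's exact LCVaR derivation, the robust risk can be computed from per-class risks by minimizing over a scalar dual variable whose optimum lies at one of the per-class risk values:

```python
import numpy as np

def label_cvar(per_class_risk, class_probs, alpha):
    """CVaR over class-conditional risks via the variational formula
    CVaR_a = min_l { l + E[(R_y - l)_+] / a }; with finitely many classes
    the optimal l is one of the per-class risk values."""
    per_class_risk = np.asarray(per_class_risk, dtype=float)
    vals = [lam + np.dot(class_probs, np.maximum(per_class_risk - lam, 0.0)) / alpha
            for lam in per_class_risk]
    return min(vals)

# alpha = 1 recovers the ordinary class-weighted average risk;
# small alpha focuses the objective on the worst-off classes.
```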
Learning Sparse Nonparametric DAGs
Zheng, Xun, Dan, Chen, Aragam, Bryon, Ravikumar, Pradeep, Xing, Eric P.
We develop a framework for learning sparse nonparametric directed acyclic graphs (DAGs) from data. Our approach is based on a recent algebraic characterization of DAGs that led to the first fully continuous optimization for score-based learning of DAG models parametrized by a linear structural equation model (SEM). We extend this algebraic characterization to nonparametric SEMs by leveraging nonparametric sparsity based on partial derivatives, resulting in a continuous optimization problem that can be applied to a variety of nonparametric and semiparametric models, including GLMs, additive noise models, and index models as special cases. We also explore the use of neural networks and orthogonal basis expansions to model nonlinearities in general nonparametric models. An extensive empirical study confirms the necessity of nonlinear dependencies and the advantage of continuous optimization for score-based learning.
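The algebraic characterization referred to is, to our reading, the NOTEARS-style constraint that a weighted adjacency matrix W encodes a DAG exactly when a smooth trace-of-matrix-exponential function vanishes; a minimal sketch of that constraint:

```python
import numpy as np
from scipy.linalg import expm

def acyclicity(W):
    """NOTEARS-style acyclicity score: h(W) = tr(exp(W * W)) - d is zero
    iff the weighted adjacency matrix W has no directed cycles
    (the elementwise square keeps entries nonnegative)."""
    d = W.shape[0]
    return np.trace(expm(W * W)) - d

# h(W), with gradient expm(W * W).T * 2W, can be added as a penalty to any
# smooth score, turning combinatorial DAG search into continuous optimization.
```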
A Unified Approach to Robust Mean Estimation
Prasad, Adarsh, Balakrishnan, Sivaraman, Ravikumar, Pradeep
Modern datasets that arise in various branches of science and engineering are characterized by their ever-increasing scale and richness. This is spurred in part by easier access to computing, the internet, and various sensor-based technologies that enable the collection of such varied datasets. On the flip side, these large and rich datasets are no longer carefully curated, are often collected in a decentralized, distributed fashion, and are consequently plagued with the complexities of heterogeneity, adversarial manipulation, and outliers. The analysis of these huge datasets is thus fraught with methodological challenges. Understanding the fundamental challenges and tradeoffs in handling such "dirty data" is precisely the premise of the field of robust statistics. Here, the aforementioned complexities are largely formalized under two different models of robustness: (1) the heavy-tailed model, where the sampling distribution can have thick tails, for instance, only low-order moments of the distribution are assumed to be finite; and (2) the ε-contamination model, where the sampling distribution is modeled as a well-behaved distribution contaminated by an ε fraction of arbitrary outliers. In each case, classical estimators of the distribution (based, for instance, on the maximum likelihood estimator) can behave considerably worse (potentially arbitrarily worse) than in standard settings where the data is better behaved and satisfies various regularity properties. In particular, these classical estimators can be extremely sensitive to the tails of the distribution or to the outliers, so the broad goal in robust statistics is to construct estimators that improve on the classical ones by reducing their sensitivity to outliers.
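For a feel of what robustness to heavy tails looks like in code, here is the textbook median-of-means estimator, a standard building block in this literature rather than the paper's unified estimator:

```python
import numpy as np

def median_of_means(x, k, seed=0):
    """Textbook median-of-means: shuffle, split into k blocks, average each
    block, and return the median of the block means. Far less sensitive to
    heavy tails and stray outliers than the empirical mean."""
    x = np.random.default_rng(seed).permutation(x)
    return np.median([b.mean() for b in np.array_split(x, k)])
```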
Adaptive Hard Thresholding for Near-optimal Consistent Robust Regression
Suggala, Arun Sai, Bhatia, Kush, Ravikumar, Pradeep, Jain, Prateek
We study the problem of robust linear regression with response-variable corruptions. We consider the oblivious adversary model, where the adversary corrupts a fraction of the responses in complete ignorance of the data. We provide a nearly-linear-time estimator that consistently estimates the true regression vector, even with a $1-o(1)$ fraction of corruptions. Existing results in this setting either do not guarantee consistent estimates or can handle only a small fraction of corruptions. We also extend our estimator to robust sparse linear regression and show that similar guarantees hold in that setting. Finally, we apply our estimator to the problem of linear regression with heavy-tailed noise and show that it consistently estimates the regression vector even when the noise has unbounded variance (e.g., the Cauchy distribution), a regime where most existing results do not even apply. Our estimator is based on a novel variant of outlier removal via hard thresholding, in which the threshold is chosen adaptively and crucially relies on randomness to escape bad fixed points of the non-convex hard thresholding operation.
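A minimal sketch of the general hard-thresholding scheme this builds on, with a fixed rather than adaptive threshold, so it illustrates the family of methods and not the paper's estimator:

```python
import numpy as np

def robust_regression_ht(X, y, frac_keep=0.8, n_iters=50):
    """Alternating scheme: least squares on the currently trusted points,
    then re-trust the points with the smallest residuals. The kept fraction
    is a fixed threshold, standing in for the paper's adaptive choice."""
    n = len(y)
    keep = np.arange(n)                       # start by trusting everything
    w = np.zeros(X.shape[1])
    for _ in range(n_iters):
        w, *_ = np.linalg.lstsq(X[keep], y[keep], rcond=None)
        resid = np.abs(y - X @ w)
        keep = np.argsort(resid)[: int(frac_keep * n)]
    return w
```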
How Sensitive are Sensitivity-Based Explanations?
Yeh, Chih-Kuan, Hsieh, Cheng-Yu, Suggala, Arun Sai, Inouye, David, Ravikumar, Pradeep
We propose a simple objective evaluation measure for explanations of a complex black-box machine learning model. While most such model explanations have largely been evaluated via qualitative measures, such as how humans might qualitatively perceive the explanations, it is vital to also consider objective measures such as the one we propose in this paper. Our evaluation measure, which we naturally call sensitivity, is simple: it characterizes how an explanation changes as we vary the test input; depending on how we measure these changes and how we vary the input, we arrive at different notions of sensitivity. We also provide a calculus for deriving the sensitivity of complex explanations in terms of that of simpler explanations, which allows easy computation of sensitivities for yet-to-be-proposed explanations. One advantage of an objective evaluation measure is that we can optimize the explanation with respect to the measure: we show that (1) any given explanation can be simply modified to improve its sensitivity with just a modest deviation from the original explanation, and (2) gradient-based explanations of an adversarially trained network are less sensitive. Perhaps surprisingly, our experiments show that explanations optimized to have lower sensitivity can be more faithful to the model predictions.
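A Monte Carlo sketch of one such notion, the maximum change in an attribution under small input perturbations; the `explainer` callable (input to attribution vector) is an assumed interface, and this is an illustration rather than the paper's definition:

```python
import numpy as np

def max_sensitivity(explainer, x, radius=0.1, n_samples=100, seed=0):
    """Monte Carlo estimate of max-sensitivity: the largest change in the
    explanation over random perturbations of norm at most `radius`."""
    rng = np.random.default_rng(seed)
    base = explainer(x)
    worst = 0.0
    for _ in range(n_samples):
        delta = rng.uniform(-1.0, 1.0, size=x.shape)
        delta *= radius / (np.linalg.norm(delta) + 1e-12)
        worst = max(worst, np.linalg.norm(explainer(x + delta) - base))
    return worst
```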
Towards Aggregating Weighted Feature Attributions
Bhatt, Umang, Ravikumar, Pradeep, Moura, Jose M. F.
Current approaches for explaining machine learning models fall into two distinct classes: antecedent event influence and value attribution. The former leverages training instances to describe how much influence a training point exerts on a test point, while the latter attempts to attribute value to the features most pertinent to a given prediction. In this work, we discuss an algorithm, AVA: Aggregate Valuation of Antecedents, that fuses these two explanation classes into a new approach to feature attribution that not only retrieves local explanations but also captures global patterns learned by a model. Our experiments convincingly favor weighting and aggregating feature attributions via AVA.
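One hypothetical reading of the fusion step, offered purely as illustration and not as the paper's exact algorithm: weight each antecedent training point's feature-attribution vector by its influence on the test point, then aggregate.

```python
import numpy as np

def aggregate_attributions(attributions, influences):
    """Influence-weighted aggregation: rows of `attributions` (shape (n, d))
    are per-antecedent feature attributions; `influences` (shape (n,)) are
    the training points' influence scores on the test point."""
    w = np.maximum(np.asarray(influences, dtype=float), 0.0)
    w /= w.sum() + 1e-12                      # normalize influence weights
    return w @ attributions                   # (n,) @ (n, d) -> (d,)
```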
A Voting-Based System for Ethical Decision Making
Noothigattu, Ritesh, Gaikwad, Snehalkumar 'Neil' S., Awad, Edmond, Dsouza, Sohan, Rahwan, Iyad, Ravikumar, Pradeep, Procaccia, Ariel D.
The problem of ethical decision making, which has long been a grand challenge for AI [23], has recently caught the public imagination. Perhaps its best-known manifestation is a modern variant of the classic trolley problem [10]: An autonomous vehicle has a brake failure, leading to an accident with inevitably tragic consequences; due to the vehicle's superior perception and computation capabilities, it can make an informed decision. Should it stay its course and hit a wall, killing its three passengers, one of whom is a young girl? Or swerve and kill a male athlete and his dog, who are crossing the street on a red light? A notable paper by Bonnefon et al. [2] has shed some light on how people address such questions, and even former US President Barack Obama has weighed in.
Word Mover's Embedding: From Word2Vec to Document Embedding
Wu, Lingfei, Yen, Ian E. H., Xu, Kun, Xu, Fangli, Balakrishnan, Avinash, Chen, Pin-Yu, Ravikumar, Pradeep, Witbrock, Michael J.
While the celebrated Word2Vec technique yields semantically rich representations for individual words, there has been relatively little success in extending it to generate unsupervised sentence or document embeddings. Recent work has demonstrated that a distance measure between documents called the Word Mover's Distance (WMD), which aligns semantically similar words, yields unprecedented KNN classification accuracy. However, WMD is expensive to compute, and it is hard to extend its use beyond a KNN classifier. In this paper, we propose the Word Mover's Embedding (WME), a novel approach to building an unsupervised document (sentence) embedding from pre-trained word embeddings. In our experiments on 9 benchmark text classification datasets and 22 textual similarity tasks, the proposed technique consistently matches or outperforms state-of-the-art techniques, with significantly higher accuracy on shorter documents.
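A simplified sketch of the embedding idea: one feature per random reference document, through a distance-based feature map. The relaxed WMD lower bound stands in for exact WMD, and the kernel form and sampling scheme here are assumptions for illustration, not the paper's exact construction.

```python
import numpy as np

def relaxed_wmd(A, B):
    """Relaxed Word Mover's Distance lower bound: every word vector in one
    document moves wholly to its nearest word vector in the other; take the
    larger of the two one-sided costs. A, B: (n_words, dim) embeddings."""
    d = np.linalg.norm(A[:, None, :] - B[None, :, :], axis=-1)
    return max(d.min(axis=1).mean(), d.min(axis=0).mean())

def wme_features(doc, random_docs, gamma=1.0):
    """One embedding coordinate per random reference document, via the
    distance-based feature map exp(-gamma * distance)."""
    return np.array([np.exp(-gamma * relaxed_wmd(doc, w)) for w in random_docs])
```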
Learning Tensor Latent Features
Chang, Sung-En, Zheng, Xun, Yen, Ian E. H., Ravikumar, Pradeep, Yu, Rose
Compared to classic latent factor models [14], latent feature models have two main benefits: (1) interpretability: the binary codes directly reveal whether certain features exist in the data, thus providing more interpretable latent profiles [25]; (2) scalability: compared with real-valued codes, binary codes require fewer bits to store, cutting down the memory footprint and making the models easier to deploy in memory-constrained environments such as mobile devices. Tensor latent feature models generalize traditional matrix latent feature models to represent high-order correlation structures in the data. For example, in spatiotemporal recommender systems, the observations are user activities over different locations and times. We want to learn the latent features and codes that correspond to user, space, and time simultaneously, without assuming conditional independence of these three dimensions. In this case, we can first represent such data as a high-order tensor and assign binary codes encoding the presence or absence of rows for individual modes of the tensor. These codes can then help us answer the "when" and "where" questions regarding the learned user preferences. Besides the non-convex formulation of the maximum likelihood estimation (MLE) objective, learning latent feature models is further complicated by the combinatorial nature of the codes.
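A generic Tucker-style sketch of what this model class computes, with binary code matrices selecting entries of a real-valued core tensor; this illustrates the representation only, not the paper's learning algorithm.

```python
import numpy as np

def reconstruct(core, U, V, W):
    """Tucker-style reconstruction: binary code matrices U (users x r1),
    V (locations x r2), W (times x r3) switch entries of a real-valued
    core tensor on or off along each mode."""
    return np.einsum('abc,ia,jb,kc->ijk', core, U, V, W)

# Usage sketch: 2 latent features per mode, codes in {0, 1}.
# core = np.random.randn(2, 2, 2)
# U = np.random.binomial(1, 0.5, size=(5, 2))   # user codes
# V = np.random.binomial(1, 0.5, size=(4, 2))   # location codes
# W = np.random.binomial(1, 0.5, size=(3, 2))   # time codes
# T = reconstruct(core, U, V, W)                # (5, 4, 3) tensor
```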