AITopics | class conditional probability

class conditional probability

The Naive Bayes classifier: How it works

#artificialintelligence | Jan-25-2022, 08:31:07 GMT

Classification algorithms predict the class, or label, of a categorical target variable. A categorical variable represents qualitative data with discrete values, such as pass/fail or low/medium/high. The Naive Bayes classifier is one of the simplest classification algorithms and is often used with large text datasets, among other applications. The aim of this article is to explain how the Naive Bayes algorithm works.
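As a rough illustration of the idea, the sketch below implements a categorical Naive Bayes classifier in plain Python. It is not taken from the linked article: the function names `train_naive_bayes` and `predict_proba`, the toy weather/outcome data, and the Laplace smoothing constant `alpha` are all assumptions made for this example. Each class score is the class prior multiplied by the class conditional probability of every feature value, with the features treated as independent given the class.

```python
import math
from collections import Counter, defaultdict


def train_naive_bayes(rows, labels, alpha=1.0):
    """Count classes and per-class feature values (alpha = Laplace smoothing)."""
    class_counts = Counter(labels)
    # feature_counts[(feature_index, label)][value] -> number of occurrences
    feature_counts = defaultdict(Counter)
    # distinct values seen for each feature, needed by the smoothing term
    feature_values = defaultdict(set)
    for row, label in zip(rows, labels):
        for i, value in enumerate(row):
            feature_counts[(i, label)][value] += 1
            feature_values[i].add(value)
    return class_counts, feature_counts, feature_values, alpha


def predict_proba(model, row):
    """Return P(class | row) for every class, assuming independent features."""
    class_counts, feature_counts, feature_values, alpha = model
    total = sum(class_counts.values())
    log_scores = {}
    for label, count in class_counts.items():
        log_p = math.log(count / total)  # log prior
        for i, value in enumerate(row):
            # smoothed class conditional probability P(value | label)
            num = feature_counts[(i, label)][value] + alpha
            den = count + alpha * len(feature_values[i])
            log_p += math.log(num / den)
        log_scores[label] = log_p
    # normalize in a numerically stable way
    top = max(log_scores.values())
    scores = {k: math.exp(v - top) for k, v in log_scores.items()}
    norm = sum(scores.values())
    return {k: v / norm for k, v in scores.items()}


# Hypothetical toy data: (weather, temperature) -> pass/fail
rows = [("sunny", "hot"), ("sunny", "mild"), ("rainy", "mild"), ("rainy", "cool")]
labels = ["fail", "fail", "pass", "pass"]
model = train_naive_bayes(rows, labels)
probs = predict_proba(model, ("rainy", "cool"))
print(probs)
```

Working in log space avoids underflow when many features are multiplied together, which matters for the large text datasets mentioned above.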

class conditional probability, conditional probability, probability, (14 more...)

#artificialintelligence

Technology: Information Technology > Artificial Intelligence > Machine Learning > Learning Graphical Models > Directed Networks > Bayesian Learning (1.00)

Understanding how to explain predictions with "explanation vectors"

#artificialintelligence | Jan-21-2019, 14:28:29 GMT

In a recent post I introduced three existing approaches to explain individual predictions of any machine learning model. Following the posts on LIME and Shapley values, this one covers explanation vectors, a method presented by David Baehrens, Timon Schroeter and Stefan Harmeling in 2010. As we have seen in those posts, explaining a decision of a black box model means understanding which input features led the model to its prediction for the observation being explained. Intuitively, a feature has a lot of influence on the model's decision if small variations in its value cause large variations in the model's output, while a feature has little influence if big changes in its value barely affect the output. Since the model's output is a scalar function of its inputs, its gradient points in the direction of the greatest rate of increase of that output, so it can be used as a measure of each feature's influence.
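The gradient intuition can be sketched numerically. The example below is an illustration only, not the authors' implementation: `model_output` is a hypothetical logistic model with made-up weights, and the gradient is approximated with central finite differences, so the same helper works for any black box function that returns a scalar.

```python
import math


def model_output(x, weights=(2.0, -0.1, 0.0), bias=0.0):
    """Hypothetical black box: probability of the positive class."""
    z = bias + sum(w * xi for w, xi in zip(weights, x))
    return 1.0 / (1.0 + math.exp(-z))


def numerical_gradient(f, x, eps=1e-5):
    """Central finite-difference gradient of a scalar function f at point x."""
    grad = []
    for i in range(len(x)):
        up = list(x)
        down = list(x)
        up[i] += eps
        down[i] -= eps
        grad.append((f(up) - f(down)) / (2 * eps))
    return grad


point = [0.5, 1.0, -2.0]
gradient = numerical_gradient(model_output, point)

# Rank features by the magnitude of their local influence
ranking = sorted(range(len(point)), key=lambda i: abs(gradient[i]), reverse=True)
print(gradient, ranking)
```

In this toy setup the first feature dominates the ranking and the third has no local influence, because its assumed weight is zero. The gradient is local: it describes influence around `point` only and can differ elsewhere in the input space.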

artificial intelligence, classifier, machine learning, (18 more...)

#artificialintelligence

Technology: Information Technology > Artificial Intelligence > Machine Learning (1.00)

Multi-category Angle-based Classifier Refit

Yau, Guo Xian, Zhang, Chong

arXiv.org Machine Learning | Jul-19-2016

Classification is an important statistical learning tool. In real applications, besides high prediction accuracy, it is often desirable to estimate class conditional probabilities for new observations. For traditional problems where the number of observations is large, there exist many well-developed approaches. Recently, high-dimensional, low-sample-size problems have become increasingly common. Margin-based classifiers, such as logistic regression, are well-established methods in the literature. On the other hand, in terms of probability estimation, it is known that for binary classifiers, the commonly used methods tend to underestimate the norm of the classification function. This can lead to biased probability estimation. Remedies have been proposed in the literature. However, for the simultaneous multicategory classification framework, much less work has been done. We fill the gap in this paper. In particular, we give theoretical insights into why heavy regularization terms are often needed in high-dimensional applications, and how this can lead to bias in probability estimation. To overcome this difficulty, we propose a new refit strategy for multicategory angle-based classifiers. Our new method adds only a small computational cost to the problem, and is able to attain prediction accuracy that is as good as that of regular margin-based classifiers. On the other hand, the improvement in probability estimation can be very significant. Numerical results suggest that the new refit approach is highly competitive.

artificial intelligence, classifier, machine learning, (17 more...)

arXiv.org Machine Learning

arXiv:1607.05709

Country: North America > United States (0.28)

Genre: Research Report > New Finding (0.69)

Industry: Health & Medicine (0.93)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (1.00)
Information Technology > Artificial Intelligence > Representation & Reasoning > Uncertainty > Bayesian Inference (0.36)
Information Technology > Artificial Intelligence > Machine Learning > Learning Graphical Models > Directed Networks > Bayesian Learning (0.36)

Active Learning with Distributional Estimates

Roeder, Jens, Nadler, Boaz, Kunzmann, Kevin, Hamprecht, Fred A.

arXiv.org Machine Learning | Oct-16-2012

Active Learning (AL) is increasingly important in a broad range of applications. Two main AL principles to obtain accurate classification with few labeled data are refinement of the current decision boundary and exploration of poorly sampled regions. In this paper we derive a novel AL scheme that balances these two principles in a natural way. In contrast to many AL strategies, which are based on an estimated class conditional probability p̂(y|x), a key component of our approach is to view this quantity as a random variable, hence explicitly considering the uncertainty in its estimated value. Our main contribution is a novel mathematical framework for uncertainty-based AL, and a corresponding AL scheme, where the uncertainty in p̂(y|x) is modeled by a second-order distribution. On the practical side, we show how to approximate such second-order distributions for kernel density classification. Finally, we find that over a large number of UCI, USPS and Caltech4 datasets, our AL scheme achieves significantly better learning curves than popular AL methods such as uncertainty sampling and error reduction sampling, when all use the same kernel density classifier.

artificial intelligence, classifier, machine learning, (15 more...)

arXiv.org Machine Learning

arXiv:1210.4909

Country: North America > United States (0.49)

Genre: Research Report (0.83)

Industry: Government (0.56)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (0.90)
Information Technology > Artificial Intelligence > Representation & Reasoning > Uncertainty > Bayesian Inference (0.67)
Information Technology > Artificial Intelligence > Machine Learning > Learning Graphical Models > Directed Networks > Bayesian Learning (0.67)

An Adaptive Metric Machine for Pattern Classification

Domeniconi, Carlotta, Peng, Jing, Gunopulos, Dimitrios

Neural Information Processing Systems | Dec-31-2001

Nearest neighbor classification assumes locally constant class conditional probabilities. This assumption becomes invalid in high dimensions with finite samples due to the curse of dimensionality. Severe bias can be introduced under these conditions when using the nearest neighbor rule. We propose a locally adaptive nearest neighbor classification method to try to minimize bias. We use a Chi-squared distance analysis to compute a flexible metric for producing neighborhoods that are elongated along less relevant feature dimensions and constricted along the most influential ones. As a result, the class conditional probabilities tend to be smoother in the modified neighborhoods, whereby better classification performance can be achieved. The efficacy of our method is validated and compared against other techniques using a variety of real world data.
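To make the role of the metric concrete, here is a minimal sketch of how a nearest neighbor rule estimates class conditional probabilities and how per-feature weights reshape the neighborhood. It does not implement the paper's Chi-squared analysis: the weights are supplied by hand, and the names `weighted_distance` and `knn_class_probabilities` as well as the toy data are assumptions for illustration. A small weight on a feature stretches the neighborhood along that dimension.

```python
import math
from collections import Counter


def weighted_distance(a, b, weights):
    """Euclidean distance with a non-negative weight per feature."""
    return math.sqrt(sum(w * (ai - bi) ** 2 for ai, bi, w in zip(a, b, weights)))


def knn_class_probabilities(train, query, k, weights):
    """Estimate P(class | query) as the class fractions among the k nearest points."""
    neighbors = sorted(
        train, key=lambda item: weighted_distance(item[0], query, weights)
    )[:k]
    counts = Counter(label for _, label in neighbors)
    return {label: n / k for label, n in counts.items()}


# Hypothetical data: feature 0 separates the classes, feature 1 is noise
train = [
    ((0.0, 0.0), "a"), ((0.1, 5.0), "a"), ((0.2, -4.0), "a"),
    ((1.0, 0.1), "b"), ((1.1, 0.2), "b"), ((0.9, -0.1), "b"),
]
query = (0.3, 0.0)

uniform = knn_class_probabilities(train, query, k=3, weights=(1.0, 1.0))
adapted = knn_class_probabilities(train, query, k=3, weights=(1.0, 0.01))
print(uniform, adapted)
```

With equal weights the noisy feature pushes two class "a" points out of the neighborhood and the estimate favors "b"; down-weighting that feature elongates the neighborhood along it and the estimate becomes consistent with the relevant feature.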

error rate, neighborhood, probability, (15 more...)

Neural Information Processing Systems

Country:

North America > United States > Oklahoma > Payne County > Stillwater (0.14)
North America > United States > California > Riverside County > Riverside (0.14)
North America > United States > South Carolina > Beaufort County > Hilton Head Island (0.04)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (1.00)
