PAC Generalization Bounds for Co-training

Neural Information Processing Systems

The rule-based bootstrapping introduced by Yarowsky, and its co-training variant by Blum and Mitchell, have met with considerable empirical success. Earlier work on the theory of co-training has been only loosely related to empirically useful co-training algorithms. Here we give a new PAC-style bound on generalization error which justifies both the use of confidences -- partial rules and partial labeling of the unlabeled data -- and the use of an agreement-based objective function as suggested by Collins and Singer. Our bounds apply to the multiclass case, i.e., where instances are to be assigned one of k labels for k >= 2.


Reducing multiclass to binary by coupling probability estimates

Neural Information Processing Systems

This paper presents a method for obtaining class membership probability estimates for multiclass classification problems by coupling the probability estimates produced by binary classifiers. This is an extension for arbitrary code matrices of a method due to Hastie and Tibshirani for pairwise coupling of probability estimates. Experimental results with Boosted Naive Bayes show that our method produces calibrated class membership probability estimates, while achieving classification accuracy similar to that of loss-based decoding, a method for obtaining the most likely class that does not generate probability estimates.
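
As a concrete illustration of the coupling idea, the sketch below implements a simplified batch variant of the Hastie-Tibshirani pairwise-coupling iteration (the special case this paper generalizes to arbitrary code matrices), assuming uniform weights for all class pairs; the function name, tolerances, and example numbers are illustrative, not taken from the paper.

    import numpy as np

    def pairwise_coupling(r, n_iter=100, tol=1e-6):
        # r[i, j] is the binary estimate P(class i | class i or j, x); the diagonal is ignored.
        k = r.shape[0]
        p = np.full(k, 1.0 / k)                  # start from the uniform distribution
        for _ in range(n_iter):
            # mu[i, j] = p_i / (p_i + p_j): the pairwise probabilities implied by the current p
            mu = p[:, None] / (p[:, None] + p[None, :] + 1e-12)
            p_new = p * ((r.sum(axis=1) - r.diagonal()) /
                         np.maximum(mu.sum(axis=1) - mu.diagonal(), 1e-12))
            p_new /= p_new.sum()                 # renormalize to a probability vector
            if np.abs(p_new - p).max() < tol:
                return p_new
            p = p_new
        return p

    # Example: three classes, with pairwise estimates that favor class 0.
    r = np.array([[0.0, 0.8, 0.7],
                  [0.2, 0.0, 0.6],
                  [0.3, 0.4, 0.0]])
    print(pairwise_coupling(r))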


Prodding the ROC Curve: Constrained Optimization of Classifier Performance

Neural Information Processing Systems

When designing a two-alternative classifier, one ordinarily aims to maximize the classifier's ability to discriminate between members of the two classes. We describe a situation in a real-world business application of machine-learning prediction in which an additional constraint is placed on the nature of the solution: that the classifier achieve a specified correct acceptance or correct rejection rate (i.e., that it achieve a fixed accuracy on members of one class or the other). Our domain is predicting churn in the telecommunications industry. Churn refers to customers who switch from one service provider to another. We propose four algorithms for training a classifier subject to this domain constraint, and present results showing that each algorithm yields a reliable improvement in performance.
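
The paper's four constrained training algorithms are not reproduced here. As a minimal, hedged illustration of the constraint itself, the sketch below simply chooses a decision threshold on held-out classifier scores so that a target correct-rejection rate is met, then reads off the correct-acceptance rate obtained on the churn class at that operating point; all names and data are synthetic and hypothetical.

    import numpy as np

    def threshold_for_rejection_rate(scores_non_churn, target_tnr):
        # Take the target quantile of the non-churn scores: labeling "churn" only
        # when score >= threshold then correctly rejects at least target_tnr of
        # the non-churn class on this data.
        return np.quantile(scores_non_churn, target_tnr)

    # Hypothetical classifier scores for the two classes.
    rng = np.random.default_rng(0)
    scores_non_churn = rng.normal(0.0, 1.0, 1000)
    scores_churn = rng.normal(1.5, 1.0, 1000)
    thr = threshold_for_rejection_rate(scores_non_churn, target_tnr=0.90)
    hit_rate = (scores_churn >= thr).mean()   # correct-acceptance rate at this operating point
    print(f"threshold={thr:.2f}, churn detection rate at 90% correct rejection={hit_rate:.1%}")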




A New Discriminative Kernel From Probabilistic Models

Neural Information Processing Systems

Recently, Jaakkola and Haussler proposed a method for constructing kernel functions from probabilistic models. Their so-called "Fisher kernel" has been combined with discriminative classifiers such as SVMs and applied successfully in e.g.
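
For reference, the Fisher kernel mentioned above is built from a fitted generative model P(x | theta): the Fisher score of an example is its gradient with respect to the model parameters, and the kernel is an inner product of scores weighted by the inverse Fisher information. This is the standard statement of the Jaakkola-Haussler construction, not a detail taken from this paper:

    U_x = \nabla_\theta \log P(x \mid \theta), \qquad
    K(x, x') = U_x^\top \, I^{-1} U_{x'}, \qquad
    I = \mathbb{E}_x\!\left[ U_x U_x^\top \right]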



SMOTE: Synthetic Minority Over-sampling Technique

Journal of Artificial Intelligence Research

An approach to the construction of classifiers from imbalanced datasets is described. A dataset is imbalanced if the classification categories are not approximately equally represented. Often real-world data sets are predominately composed of "normal" examples with only a small percentage of "abnormal" or "interesting" examples. It is also the case that the cost of misclassifying an abnormal (interesting) example as a normal example is often much higher than the cost of the reverse error. Under-sampling of the majority (normal) class has been proposed as a good means of increasing the sensitivity of a classifier to the minority class. This paper shows that a combination of our method of over-sampling the minority (abnormal) class and under-sampling the majority (normal) class can achieve better classifier performance (in ROC space) than only under-sampling the majority class. This paper also shows that a combination of our method of over-sampling the minority class and under-sampling the majority class can achieve better classifier performance (in ROC space) than varying the loss ratios in Ripper or class priors in Naive Bayes. Our method of over-sampling the minority class involves creating synthetic minority class examples. Experiments are performed using C4.5, Ripper and a Naive Bayes classifier. The method is evaluated using the area under the Receiver Operating Characteristic curve (AUC) and the ROC convex hull strategy.
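
The synthetic-example step described above can be sketched in a few lines: for each new example, take a minority sample, pick one of its k nearest minority-class neighbors, and interpolate at a random point along the segment joining the two. This is a hedged sketch of the idea in a NumPy/scikit-learn setting, not the authors' reference implementation, and the helper name and defaults are illustrative.

    import numpy as np
    from sklearn.neighbors import NearestNeighbors

    def smote_oversample(X_minority, n_synthetic, k=5, seed=0):
        # Generate n_synthetic minority examples by interpolating between a
        # minority sample and one of its k nearest minority-class neighbors.
        rng = np.random.default_rng(seed)
        nn = NearestNeighbors(n_neighbors=k + 1).fit(X_minority)  # +1: each point is its own nearest neighbor
        _, neighbor_idx = nn.kneighbors(X_minority)
        synthetic = np.empty((n_synthetic, X_minority.shape[1]))
        for s in range(n_synthetic):
            i = rng.integers(len(X_minority))              # a random minority sample
            j = neighbor_idx[i, rng.integers(1, k + 1)]    # one of its k nearest minority neighbors
            gap = rng.random()                             # random point along the joining segment
            synthetic[s] = X_minority[i] + gap * (X_minority[j] - X_minority[i])
        return synthetic

    # Example: oversample a small 2-D minority class to twice its size.
    X_min = np.random.default_rng(1).normal(size=(20, 2))
    print(smote_oversample(X_min, n_synthetic=20).shape)   # (20, 2)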


Kernel Expansions with Unlabeled Examples

Neural Information Processing Systems

Modern classification applications necessitate supplementing the few available labeled examples with unlabeled examples to improve classification performance. We present a new tractable algorithm for exploiting unlabeled examples in discriminative classification. This is achieved essentially by expanding the input vectors into longer feature vectors via both labeled and unlabeled examples. The resulting classification method can be interpreted as a discriminative kernel density estimate and is readily trained via the EM algorithm, which in this case is both discriminative and achieves the optimal solution. We provide, in addition, a purely discriminative formulation of the estimation problem by appealing to the maximum entropy framework. We demonstrate that the proposed approach requires very few labeled examples for high classification accuracy.
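
A minimal sketch of the expansion idea, under stated assumptions: represent each input by its normalized kernel evaluations against all labeled and unlabeled examples, then fit an ordinary discriminative classifier on the labeled subset of the expanded representation. The RBF kernel and logistic-regression classifier below are illustrative stand-ins; the paper itself trains the expansion via EM with per-example label probabilities rather than this plain plug-in estimator.

    import numpy as np
    from sklearn.linear_model import LogisticRegression
    from sklearn.metrics.pairwise import rbf_kernel

    def expand(X, X_all, gamma=1.0):
        # Kernel values against every labeled and unlabeled example, normalized
        # to sum to one so each row behaves like a density over anchor points.
        K = rbf_kernel(X, X_all, gamma=gamma)
        return K / K.sum(axis=1, keepdims=True)

    # A few labeled examples plus a larger unlabeled pool (synthetic data).
    rng = np.random.default_rng(0)
    X_lab = np.vstack([rng.normal(-1.0, 1.0, size=(5, 2)),
                       rng.normal(+1.0, 1.0, size=(5, 2))])
    y_lab = np.array([0] * 5 + [1] * 5)
    X_unlab = rng.normal(0.0, 1.5, size=(200, 2))
    X_all = np.vstack([X_lab, X_unlab])

    clf = LogisticRegression(max_iter=1000).fit(expand(X_lab, X_all), y_lab)
    print(clf.predict(expand(rng.normal(size=(5, 2)), X_all)))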