AITopics

In this paper, we study bootstrapping algorithms for learning from unlabeled data. The general idea in bootstrapping is to use some initial labeled data to build a (possibly partial) predictive labeling procedure; then use the labeling procedure to label more data; then use the newly labeled data to build a new predictive procedure, and so on. This process can be iterated until a fixed point is reached or some other stopping criterion is met. Here we give PAC-style bounds on generalization error which can be used to formally justify certain bootstrapping algorithms. One well-known form of bootstrapping is the EM algorithm (Dempster, Laird and Rubin, 1977).
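The loop described in this abstract can be illustrated with a minimal self-training sketch. It is not the paper's algorithm: the nearest-centroid labeler on 1-D points, the `min_margin` confidence rule, and all function names are assumptions chosen only to keep the example short. The labeler is partial in the sense that it abstains on points whose margin is too small.

```python
from statistics import mean

def fit_centroids(points, labels):
    """Return {label: centroid} for 1-D points (hypothetical toy learner)."""
    return {c: mean(p for p, l in zip(points, labels) if l == c)
            for c in set(labels)}

def predict_with_margin(centroids, x):
    """Return (label, margin): nearest centroid and its gap to the runner-up.
    Assumes at least two classes."""
    ranked = sorted(centroids, key=lambda c: abs(x - centroids[c]))
    best, second = ranked[0], ranked[1]
    margin = abs(x - centroids[second]) - abs(x - centroids[best])
    return best, margin

def bootstrap(labeled, unlabeled, min_margin=1.0, max_rounds=10):
    """Iterate: fit on labeled data, label confident points, refit."""
    points = [x for x, _ in labeled]
    labels = [y for _, y in labeled]
    pool = list(unlabeled)
    for _ in range(max_rounds):
        centroids = fit_centroids(points, labels)
        confident, rest = [], []
        for x in pool:
            label, margin = predict_with_margin(centroids, x)
            # Partial labeling: abstain when the margin is too small.
            (confident if margin >= min_margin else rest).append((x, label))
        if not confident:
            break  # fixed point: no new labels were added this round
        for x, label in confident:
            points.append(x)
            labels.append(label)
        pool = [x for x, _ in rest]
    return fit_centroids(points, labels), pool

centroids, leftover = bootstrap([(0.0, "a"), (10.0, "b")],
                                [1.0, 2.0, 8.0, 9.0, 5.0])
```

In this toy run the point 5.0 is equidistant from both centroids, so it is never labeled and the loop stops once no confident points remain.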

Technology: Information Technology > Artificial Intelligence > Machine Learning > Performance Analysis > Accuracy (1.00)

Mozer, Michael C., Dodier, Robert, Colagrosso, Michael D., Guerra-Salcedo, Cesar, Wolniewicz, Richard

Prodding the ROC Curve: Constrained Optimization of Classifier Performance

When designing a two-alternative classifier, one ordinarily aims to maximize the classifier's ability to discriminate between members of the two classes. We describe a situation in a real-world business application of machine-learning prediction in which an additional constraint is placed on the nature of the solution: that the classifier achieve a specified correct acceptance or correct rejection rate (i.e., that it achieve a fixed accuracy on members of one class or the other). Our domain is predicting churn in the telecommunications industry. Churn refers to customers who switch from one service provider to another. We propose four algorithms for training a classifier subject to this domain constraint, and present results showing that each algorithm yields a reliable improvement in performance.
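To make the constraint concrete, the sketch below picks a decision threshold on classifier scores so that a target correct acceptance (CA) rate is met, then reports the resulting correct rejection (CR) rate. This is a simple post-hoc baseline, not one of the four training algorithms the abstract refers to. The function names, the convention that label 1 marks the class whose accuracy is fixed, and the toy scores are all assumptions.

```python
import math

def threshold_for_ca_rate(scores, labels, target_ca):
    """Highest threshold at which at least target_ca of class-1 examples
    score at or above it (hypothetical helper)."""
    pos = sorted((s for s, y in zip(scores, labels) if y == 1), reverse=True)
    if not pos:
        raise ValueError("need at least one class-1 example")
    k = max(1, math.ceil(target_ca * len(pos)))  # class-1 examples to accept
    return pos[k - 1]

def rates(scores, labels, threshold):
    """Return (CA rate, CR rate) when accepting scores >= threshold."""
    pos = [s for s, y in zip(scores, labels) if y == 1]
    neg = [s for s, y in zip(scores, labels) if y == 0]
    ca = sum(s >= threshold for s in pos) / len(pos)
    cr = sum(s < threshold for s in neg) / len(neg)
    return ca, cr

# Toy data (made up for illustration only).
scores = [0.9, 0.8, 0.7, 0.6, 0.4, 0.3, 0.2, 0.1]
labels = [1, 1, 0, 1, 1, 0, 0, 0]
t = threshold_for_ca_rate(scores, labels, target_ca=0.75)
ca, cr = rates(scores, labels, t)
```

With the CA rate pinned by the threshold, the remaining quantity to improve is the CR rate, which is the trade-off along the ROC curve that the title alludes to.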

algorithm, classifier, cr rate, (16 more...)

Country:

North America > United States > Colorado > Boulder County > Boulder (0.04)
North America > United States > California > San Mateo County > San Mateo (0.04)
North America > United States > New York (0.04)

Genre: Research Report > New Finding (0.34)

Industry: Telecommunications (1.00)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning > Optimization (0.65)
Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (0.46)
Information Technology > Artificial Intelligence > Machine Learning > Performance Analysis > Accuracy (0.46)

Viola, Paul, Jones, Michael

Fast and Robust Classification using Asymmetric AdaBoost and a Detector Cascade

This paper develops a new approach for extremely fast detection in domains where the distribution of positive and negative examples is highly skewed (e.g.

adaboost, classifier, detection rate, (13 more...)

Country: North America > United States > Massachusetts > Middlesex County > Cambridge (0.04)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Performance Analysis > Accuracy (0.84)
Information Technology > Artificial Intelligence > Machine Learning > Inductive Learning (0.68)
Information Technology > Artificial Intelligence > Vision > Face Recognition (0.51)

Tsuda, Koji, Kawanabe, Motoaki, Rätsch, Gunnar, Sonnenburg, Sören, Müller, Klaus-Robert

A New Discriminative Kernel From Probabilistic Models

Recently, Jaakkola and Haussler proposed a method for constructing kernel functions from probabilistic models. Their so-called "Fisher kernel" has been combined with discriminative classifiers such as SVM and applied successfully in e.g.

fisher kernel, kernel, top kernel, (13 more...)

Country:

Europe > Germany > Brandenburg > Potsdam (0.04)
Oceania > Australia > Australian Capital Territory > Canberra (0.04)
North America > United States > California > San Diego County > San Diego (0.04)
(3 more...)

Genre:

Research Report > New Finding (0.68)
Research Report > Experimental Study (0.47)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Performance Analysis > Accuracy (0.95)
Information Technology > Artificial Intelligence > Representation & Reasoning > Uncertainty (0.87)
Information Technology > Artificial Intelligence > Machine Learning > Learning Graphical Models (0.69)

Dasgupta, Sanjoy, Littman, Michael L., McAllester, David A.

PAC Generalization Bounds for Co-training

The rule-based bootstrapping introduced by Yarowsky, and its co-training variant by Blum and Mitchell, have met with considerable empirical success. Earlier work on the theory of co-training has been only loosely related to empirically useful co-training algorithms. Here we give a new PAC-style bound on generalization error which justifies both the use of confidences -- partial rules and partial labeling of the unlabeled data -- and the use of an agreement-based objective function as suggested by Collins and Singer. Our bounds apply to the multiclass case, i.e., where instances are to be assigned one of

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Performance Analysis > Accuracy (0.93)

Tsuda, Koji, Kawanabe, Motoaki, Rätsch, Gunnar, Sonnenburg, Sören, Müller, Klaus-Robert

A New Discriminative Kernel From Probabilistic Models

Recently, Jaakkola and Haussler proposed a method for constructing kernel functions from probabilistic models. Their so-called "Fisher kernel" has been combined with discriminative classifiers such as SVM and applied successfully in e.g.

artificial intelligence, kernel, machine learning, (14 more...)

Country: Europe > Germany (0.29)

Genre:

Research Report > New Finding (0.68)
Research Report > Experimental Study (0.47)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Performance Analysis > Accuracy (0.95)
Information Technology > Artificial Intelligence > Representation & Reasoning > Uncertainty (0.87)
Information Technology > Artificial Intelligence > Machine Learning > Learning Graphical Models (0.69)

Mozer, Michael C., Dodier, Robert, Colagrosso, Michael D., Guerra-Salcedo, Cesar, Wolniewicz, Richard

Prodding the ROC Curve: Constrained Optimization of Classifier Performance

When designing a two-alternative classifier, one ordinarily aims to maximize the classifier's ability to discriminate between members of the two classes. We describe a situation in a real-world business application of machine-learning prediction in which an additional constraint is placed on the nature of the solution: that the classifier achieve a specified correct acceptance or correct rejection rate (i.e., that it achieve a fixed accuracy on members of one class or the other). Our domain is predicting churn in the telecommunications industry. Churn refers to customers who switch from one service provider to another. We propose four algorithms for training a classifier subject to this domain constraint, and present results showing that each algorithm yields a reliable improvement in performance.

artificial intelligence, classifier, machine learning, (19 more...)

Country: North America > United States > Colorado (0.14)

Genre: Research Report > New Finding (0.34)

Industry: Telecommunications (1.00)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning > Optimization (0.66)
Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (0.46)
Information Technology > Artificial Intelligence > Machine Learning > Performance Analysis > Accuracy (0.46)

Viola, Paul, Jones, Michael

Fast and Robust Classification using Asymmetric AdaBoost and a Detector Cascade

This paper develops a new approach for extremely fast detection in domains where the distribution of positive and negative examples is highly skewed (e.g.

artificial intelligence, classifier, machine learning, (16 more...)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Performance Analysis > Accuracy (0.84)
Information Technology > Artificial Intelligence > Machine Learning > Inductive Learning (0.68)
Information Technology > Artificial Intelligence > Vision > Face Recognition (0.51)

Reducing multiclass to binary by coupling probability estimates

Zadrozny, B.

Although these two approaches are the most obvious, Allwein et al. [Allwein et al., 2000]

artificial intelligence, machine learning, probability estimate, (17 more...)

Country: North America > United States > California > San Diego County (0.14)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Performance Analysis > Accuracy (0.31)
Information Technology > Artificial Intelligence > Machine Learning > Inductive Learning (0.30)

Dasgupta, Sanjoy, Littman, Michael L., McAllester, David A.

PAC Generalization Bounds for Co-training

The rule-based bootstrapping introduced by Yarowsky, and its co-training variant by Blum and Mitchell, have met with considerable empirical success. Earlier work on the theory of co-training has been only loosely related to empirically useful co-training algorithms. Here we give a new PAC-style bound on generalization error which justifies both the use of confidences -- partial rules and partial labeling of the unlabeled data -- and the use of an agreement-based objective function as suggested by Collins and Singer. Our bounds apply to the multiclass case, i.e., where instances are to be assigned one of labels for

algorithm, artificial intelligence, machine learning, (17 more...)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Performance Analysis > Accuracy (0.96)