Extending F1 metric, probabilistic approach
This article explores the extension of the well-known F1 score used for assessing the performance of binary classifiers. We propose a new metric using a probabilistic interpretation of precision, recall, specificity, and negative predictive value. We describe its properties and compare it to common metrics. We then demonstrate its behavior in edge cases of the confusion matrix. Finally, the properties of the metric are tested on a binary classifier trained on a real dataset.
- North America > United States > Wisconsin (0.04)
- Europe > Poland > Lesser Poland Province > Kraków (0.04)
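The four rates named in the abstract come straight from confusion-matrix counts. As a minimal sketch, the snippet below computes them and combines all four with a harmonic mean; this combination is only one plausible probabilistic-style extension of F1 (the function names are illustrative), not necessarily the exact metric the article proposes:

```python
def rates(tp, fp, tn, fn):
    """Precision, recall, specificity and NPV from confusion-matrix counts.
    Sketch assumes all four denominators are nonzero."""
    precision = tp / (tp + fp)
    recall = tp / (tp + fn)
    specificity = tn / (tn + fp)
    npv = tn / (tn + fn)
    return precision, recall, specificity, npv

def f1(tp, fp, tn, fn):
    # Standard F1: harmonic mean of precision and recall only.
    p, r, _, _ = rates(tp, fp, tn, fn)
    return 2 * p * r / (p + r)

def harmonic_mean_4(tp, fp, tn, fn):
    # Hypothetical extension: harmonic mean of all four rates,
    # which, unlike F1, also reacts to true negatives.
    p, r, s, n = rates(tp, fp, tn, fn)
    return 4 / (1 / p + 1 / r + 1 / s + 1 / n)

# F1 ignores true negatives: these two matrices share the same F1,
# but the four-rate mean separates them.
print(f1(90, 10, 1000, 10), harmonic_mean_4(90, 10, 1000, 10))
print(f1(90, 10, 5, 10), harmonic_mean_4(90, 10, 5, 10))
```

Both calls report F1 = 0.9, while the four-rate mean drops sharply once true negatives become scarce, which is exactly the edge-case sensitivity the abstract is concerned with.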
Automatically Training a Problematic Dialogue Predictor for a Spoken Dialogue System
Gorin, A., Langkilde-Geary, I., Walker, M. A., Wright, J., Hastie, H. W.
Spoken dialogue systems promise efficient and natural access to a large variety of information sources and services from any phone. However, current spoken dialogue systems are deficient in their strategies for preventing, identifying and repairing problems that arise in the conversation. This paper reports results on automatically training a Problematic Dialogue Predictor to predict problematic human-computer dialogues using a corpus of 4692 dialogues collected with the 'How May I Help You' (SM) spoken dialogue system. The Problematic Dialogue Predictor can be immediately applied to the system's decision of whether to transfer the call to a human customer care agent, or be used as a cue to the system's dialogue manager to modify its behavior to repair problems, and perhaps even to prevent them. We show that a Problematic Dialogue Predictor using automatically-obtainable features from the first two exchanges in the dialogue can predict problematic dialogues 13.2% more accurately than the baseline.
Identifying Mislabeled Training Data
The goal of this approach is to improve classification accuracies produced by learning algorithms by improving the quality of the training data. Our approach uses a set of learning algorithms to create classifiers that serve as noise filters for the training data. We evaluate single algorithm, majority vote and consensus filters on five datasets that are prone to labeling errors. Our experiments illustrate that filtering significantly improves classification accuracy for noise levels up to 30%. An analytical and empirical evaluation of the precision of our approach shows that consensus filters are conservative at throwing away good data at the expense of retaining bad data and that majority filters are better at detecting bad data at the expense of throwing away good data. This suggests that for situations in which there is a paucity of data, consensus filters are preferable, whereas majority vote filters are preferable for situations with an abundance of data.
1. Introduction
The maximum accuracy achievable depends on the quality of the data and on the appropriateness of the chosen learning algorithm for the data. The work described here focuses on improving the quality of training data by identifying and eliminating mislabeled instances prior to applying the chosen learning algorithm, thereby increasing classification accuracy. Labeling error can occur for several reasons including subjectivity, data-entry error, or inadequacy of the information used to label each object. Subjectivity may arise when observations need to be ranked in some way such as disease severity or when the information used to label an object is different from the information to which the learning algorithm will have access. For example, when labeling pixels in image data, the analyst typically uses visual input rather than the numeric values of the feature vector corresponding to the observation. Domains in which experts disagree are natural places for subjective labeling errors (Smyth, 1996).
A third cause of labeling error arises when the information used to label each observation is inadequate. For example, in the medical domain it may not be possible to perform the tests necessary to guarantee that a diagnosis is 100% accurate. For domains in which labeling errors occur, an automated method of eliminating or correcting mislabeled observations will improve the predictive accuracy of the classifier formed from the training data. In this article we address the problem of identifying training instances that are mislabeled.
- Asia > Japan > Honshū > Kantō > Tokyo Metropolis Prefecture > Tokyo (0.14)
- North America > United States > Tennessee > Davidson County > Nashville (0.04)
- North America > United States > New York (0.04)
- (14 more...)
- Research Report > New Finding (0.93)
- Research Report > Experimental Study (0.68)
- Energy (0.68)
- Education (0.67)
- Government > Regional Government > North America Government > United States Government (0.67)
- Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (1.00)
- Information Technology > Artificial Intelligence > Machine Learning > Inductive Learning (1.00)
- Information Technology > Artificial Intelligence > Machine Learning > Performance Analysis > Accuracy (0.34)
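The majority-vote and consensus rules described in the abstract can be sketched in a few lines, assuming per-instance cross-validated predictions from the m base classifiers are already available (the function name and toy data below are illustrative, not from the paper):

```python
def mislabeled(predictions, labels, rule="majority"):
    """Flag training instances whose given label disagrees with most
    base classifiers (rule='majority') or with all of them
    (rule='consensus').

    predictions[i] holds the cross-validated predictions that the m
    base classifiers made for instance i; labels[i] is its given label.
    """
    flagged = []
    for i, (preds, y) in enumerate(zip(predictions, labels)):
        errors = sum(1 for p in preds if p != y)
        if rule == "consensus" and errors == len(preds):
            flagged.append(i)
        elif rule == "majority" and errors > len(preds) / 2:
            flagged.append(i)
    return flagged

# Three base classifiers voting on four instances:
preds = [[1, 1, 1], [0, 1, 0], [0, 0, 0], [1, 1, 0]]
labels = [1, 1, 1, 0]
print(mislabeled(preds, labels, "majority"))   # instances most voters reject
print(mislabeled(preds, labels, "consensus"))  # instances every voter rejects
```

On this toy data the majority filter flags instances 1, 2 and 3, while the consensus filter flags only instance 2; the consensus set is always a subset of the majority set, which is why the paper finds it conservative about throwing away good data.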
Committee-Based Sample Selection for Probabilistic Classifiers
Argamon-Engelson, S., Dagan, I.
In many real-world learning tasks, it is expensive to acquire a sufficient number of labeled examples for training. This paper investigates methods for reducing annotation cost by 'sample selection'. In this approach, during training the learning program examines many unlabeled examples and selects for labeling only those that are most informative at each stage. This avoids redundantly labeling examples that contribute little new information. Our work follows on previous research on Query By Committee, extending the committee-based paradigm to the context of probabilistic classification. We describe a family of empirical methods for committee-based sample selection in probabilistic classification models, which evaluate the informativeness of an example by measuring the degree of disagreement between several model variants. These variants (the committee) are drawn randomly from a probability distribution conditioned by the training set labeled so far. The method was applied to the real-world natural language processing task of stochastic part-of-speech tagging. We find that all variants of the method achieve a significant reduction in annotation cost, although their computational efficiency differs. In particular, the simplest variant, a two member committee with no parameters to tune, gives excellent results. We also show that sample selection yields a significant reduction in the size of the model used by the tagger.
- North America > United States > New York (0.04)
- Asia > Middle East > Jordan (0.04)
- Asia > Middle East > Israel > Jerusalem District > Jerusalem (0.04)
- (3 more...)
- Information Technology > Artificial Intelligence > Machine Learning > Learning Graphical Models > Directed Networks > Bayesian Learning (0.68)
- Information Technology > Artificial Intelligence > Natural Language > Text Processing (0.68)
- Information Technology > Artificial Intelligence > Machine Learning > Learning Graphical Models > Undirected Networks > Markov Models (0.47)
- Information Technology > Artificial Intelligence > Representation & Reasoning > Uncertainty > Bayesian Inference (0.46)
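A common way to score the committee disagreement the abstract describes is vote entropy over the members' predicted labels. The sketch below assumes the committee members have already been drawn and voted (the function names and the POS-tagging toy data are illustrative, not from the paper):

```python
import math

def vote_entropy(votes):
    """Disagreement of a committee: entropy of the members' vote
    distribution over labels, normalized to [0, 1] by log(#labels).
    `votes` maps each candidate label to how many members chose it."""
    total = sum(votes.values())
    ent = -sum((c / total) * math.log(c / total)
               for c in votes.values() if c > 0)
    return ent / math.log(len(votes)) if len(votes) > 1 else 0.0

def select_for_labeling(vote_lists, threshold=0.5):
    """Select indices of unlabeled examples whose committee votes
    disagree more than `threshold` -- the informative ones."""
    return [i for i, v in enumerate(vote_lists)
            if vote_entropy(v) > threshold]

# A two-member committee tagging three examples with POS tags:
votes = [
    {"NOUN": 2, "VERB": 0},  # both members agree -> uninformative
    {"NOUN": 1, "VERB": 1},  # members split      -> worth annotating
    {"NOUN": 0, "VERB": 2},  # both agree again
]
print(select_for_labeling(votes))
```

Only the split vote is selected for annotation; with a two-member committee the score degenerates to agree/disagree, which matches the abstract's observation that this simplest variant has no parameters to tune.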