A new framework for optimal classifier design

Di Martino, Matías, Hernández, Guzman, Fiori, Marcelo, Fernández, Alicia

Sep-12-2013–arXiv.org Machine Learning

Accuracy, Recall, Precision, F-measure, Kappa, AUC [García et al. (2012)], and more recently proposed measures such as Informedness and Markedness [Powers (2011)] are examples of evaluation measures. Depending on the problem and the field of application, one measure may be more suitable than another. In the Behavioral Sciences, Specificity and Sensitivity are commonly used, while in the Medical Sciences, ROC analysis is a standard for evaluation. In the Information Retrieval community and in fraud detection, on the other hand, Recall, Precision and F-measure are considered appropriate measures of effectiveness. In a learning design strategy, the best rule for a specific application is the one that achieves optimal performance on the chosen measure. In a Bayesian framework, finding the best decision rule means minimizing the overall risk, taking into account the different misclassification costs [Duda et al. (2001)]; when all misclassification costs are equal, the optimal solution, which maximizes accuracy, is obtained by selecting the class with the maximum a posteriori probability. However, in an imbalanced domain, a decision rule that seeks the minimum error rate or maximum accuracy yields solutions strongly biased toward the majority class, resulting in poor performance. This problem is particularly important in applications where the instances of one class (the majority) heavily outnumber those of the other (the minority) and misclassifying minority-class samples is costly.
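The two points made in the abstract, that the minimum-risk rule reduces to the maximum a posteriori rule under equal costs, and that accuracy can look excellent on imbalanced data while the minority class is ignored, can be illustrated with a minimal sketch. The function names (`min_risk_class`, `scores`), the cost matrix, the posterior values and the class counts are all hypothetical choices made for illustration; none of them come from the paper.

```python
from typing import Dict, Sequence


def min_risk_class(posteriors: Sequence[float],
                   cost: Sequence[Sequence[float]]) -> int:
    """Return the index of the decision with the lowest expected cost.

    Assumed convention: cost[i][j] is the cost of deciding class i
    when the true class is j; posteriors[j] is P(class j | x).
    """
    risks = [
        sum(c_ij * p_j for c_ij, p_j in zip(row, posteriors))
        for row in cost
    ]
    return min(range(len(risks)), key=risks.__getitem__)


def scores(tp: int, fp: int, fn: int, tn: int) -> Dict[str, float]:
    """Compute accuracy, precision, recall and F-measure from confusion counts.

    The minority class is treated as the positive class. Undefined ratios
    (zero denominators) are reported as 0.0, which is a convention chosen here.
    """
    total = tp + fp + fn + tn
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    f_measure = (2 * precision * recall / (precision + recall)
                 if (precision + recall) else 0.0)
    return {
        "accuracy": (tp + tn) / total,
        "precision": precision,
        "recall": recall,
        "f_measure": f_measure,
    }


# Illustrative posteriors: class 0 is the majority, class 1 the minority.
posteriors = [0.7, 0.3]

# Equal misclassification costs: minimum risk coincides with the MAP decision.
equal_cost = [[0.0, 1.0],
              [1.0, 0.0]]
map_decision = min_risk_class(posteriors, equal_cost)

# Hypothetical asymmetric costs: missing a minority sample costs 5x more.
skewed_cost = [[0.0, 5.0],
               [1.0, 0.0]]
cost_sensitive_decision = min_risk_class(posteriors, skewed_cost)

# A classifier that always predicts the majority class on an assumed
# 990-vs-10 data set: no minority sample is ever detected.
always_majority = scores(tp=0, fp=0, fn=10, tn=990)
```

With these assumed numbers, the equal-cost rule picks the majority class while the asymmetric-cost rule picks the minority class for the same posteriors. The always-majority classifier reaches an accuracy of 0.99 with recall and F-measure both equal to 0, which is the kind of bias toward the majority class the abstract describes.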

artificial intelligence, classifier, machine learning

Genre:
- Research Report (0.64)

Industry:
- Energy (0.46)

Technology:
- Information Technology > Artificial Intelligence
  - Representation & Reasoning (1.00)
  - Machine Learning
    - Performance Analysis > Accuracy (1.00)
    - Learning Graphical Models > Directed Networks
      - Bayesian Learning (0.67)
