Statistical Formulas for F Measures
The F measures are widely used to evaluate the performance of machine learning methods (see, e.g., the Wikipedia entry on the F score). This paper provides simple formulas for their standard errors and probability distributions, and for the related confidence intervals and sample size planning, based on large-sample results. We first use a real data set (Stine, Foster, and Waterman 1998) to illustrate the concept of the F measures. A purchase of one of two orange juice brands, Citrus Hill or Minimaid, is coded as Z = 1 or Z = 0, respectively, and modeled as a random variable. A score S summarizing the preference for the Citrus Hill brand is assigned to this purchase. This score S is also modeled as a random variable, since it depends on factors such as customer loyalty and price difference, which can differ from purchase to purchase.
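To make the setup concrete, the sketch below computes an F measure from binary outcomes Z and scores S. It assumes the standard F-beta definition, in which a purchase is predicted as Citrus Hill (Z = 1) when its score exceeds a cutoff. The function name `f_measure`, the cutoff of 0.5, and the toy values are illustrative assumptions, not taken from the paper or from the orange juice data.

```python
def f_measure(z, s, threshold, beta=1.0):
    """F_beta of the rule 'predict 1 when s > threshold' against binary labels z.

    Assumes the standard definition:
    F_beta = (1 + beta^2) * TP / ((1 + beta^2) * TP + beta^2 * FN + FP).
    """
    pairs = list(zip(z, s))
    tp = sum(1 for zi, si in pairs if zi == 1 and si > threshold)   # true positives
    fp = sum(1 for zi, si in pairs if zi == 0 and si > threshold)   # false positives
    fn = sum(1 for zi, si in pairs if zi == 1 and si <= threshold)  # false negatives
    b2 = beta ** 2
    denom = (1 + b2) * tp + b2 * fn + fp
    # Return 0.0 when there are no positives and no positive predictions.
    return (1 + b2) * tp / denom if denom > 0 else 0.0


# Made-up toy data: z is the purchased brand (1 = Citrus Hill), s is the score.
z = [1, 1, 1, 0, 0, 1, 0, 0]
s = [0.9, 0.8, 0.4, 0.6, 0.2, 0.7, 0.1, 0.3]

f1 = f_measure(z, s, threshold=0.5)
print(f"F1 at cutoff 0.5: {f1:.2f}")
```

With this toy data the rule yields 3 true positives, 1 false positive, and 1 false negative, so F1 = 6/8 = 0.75. Because Z and S are both treated as random variables, a value computed this way from a sample is itself an estimate, which is why standard errors and confidence intervals are of interest.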
Dec-29-2020