Kernel Density Estimation for Multiclass Quantification

Moreo, Alejandro, González, Pablo, del Coz, Juan José

Jan-2-2024–arXiv.org Machine Learning

Quantification (also called learning to quantify or class prevalence estimation) is the area of supervised machine learning concerned with estimating the percentage of instances in a population (hereafter, a bag of examples) that belong to each of the classes of interest [González et al., 2017, Esuli et al., 2023]. Quantification finds applications in many disciplines, such as the social sciences, epidemiology, and market research, in which interest lies at the aggregate level: inferring characteristics of a single individual (e.g., via classification or regression) is of little concern, since group-level information is all that is needed. Binary quantification (i.e., the setting in which the classes of interest are positive vs. negative) has been by far the most studied scenario in the quantification literature [Card and Smith, 2018, Forman, 2008, Bella et al., 2010, Esuli and Sebastiani, 2015, Hassan et al., 2020, Moreo and Sebastiani, 2021], yet many applications of quantification arise naturally in the multiclass regime, i.e., in cases with more than two mutually exclusive classes. Examples of multiclass settings are ubiquitous and include the allocation of human resources to different departments in a company [Forman, 2005], the analysis of the different phytoplankton species that may be present in a water sample [González et al., 2019], and the analysis of the various causes of death studied in verbal autopsies [King and Lu, 2008], to name a few. A more concrete example is answering questions like: "What is the percentage of tweets conveying positive, neutral, and negative opinions concerning a specific hashtag?"
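To make the target quantity concrete, the following is a minimal sketch of the class prevalence vector that a quantifier is asked to estimate for a bag. It assumes the true labels of the bag are known, which is exactly what is not available in practice; the function name `prevalence` and the toy tweet labels are hypothetical and serve only as illustration, not as any method from the paper.

```python
from collections import Counter


def prevalence(labels, classes):
    """Return the fraction of items in the bag that belong to each class.

    `labels` holds the (normally unknown) true class of every item in the bag;
    `classes` lists the mutually exclusive classes of interest.
    """
    if not labels:
        raise ValueError("the bag must contain at least one example")
    counts = Counter(labels)
    # Classes absent from the bag get prevalence 0 (Counter returns 0 for them).
    return {c: counts[c] / len(labels) for c in classes}


# Toy bag of tweet sentiment labels, invented for illustration only.
bag = ["positive", "neutral", "negative", "positive", "neutral",
       "positive", "negative", "positive", "neutral", "positive"]
classes = ["positive", "neutral", "negative"]

p = prevalence(bag, classes)
print(p)
```

The values always sum to 1 across the classes. A quantification method aims to approximate this vector for an unlabeled bag, without needing to get each individual label right.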

histogram, multiclass quantification, posterior probability
