Using Constraints to Discover Sparse and Alternative Subgroup Descriptions
arXiv.org Artificial Intelligence
Subgroup-discovery methods allow users to obtain simple descriptions of interesting regions in a dataset. Using constraints in subgroup discovery can enhance interpretability even further. In this article, we focus on two types of constraints: First, we limit the number of features used in subgroup descriptions, which makes the descriptions sparse. Second, we propose the novel optimization problem of finding alternative subgroup descriptions, which cover a similar set of data objects as a given subgroup but use different features. We describe how to integrate both constraint types into heuristic subgroup-discovery methods. Further, we propose a novel Satisfiability Modulo Theories (SMT) formulation of subgroup discovery as a white-box optimization problem, which allows solver-based search for subgroups and is open to a variety of constraint types. Additionally, we prove that both constraint types lead to an NP-hard optimization problem. Finally, we employ 27 binary-classification datasets to compare heuristic and solver-based search for unconstrained and constrained subgroup discovery. We observe that heuristic search methods often yield high-quality subgroups within a short runtime, even in scenarios with constraints.
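To make the two constraint types concrete, the following is a minimal sketch on a toy dataset, not the method from the article. It assumes several things the abstract does not state: subgroups are described by per-feature intervals, quality is measured with weighted relative accuracy (WRAcc), similarity between subgroups is the Jaccard index of their members, and an alternative description may reuse none of the original features. It also swaps in plain exhaustive search for the heuristic and SMT-based searches mentioned above. All function names are hypothetical.

```python
from itertools import combinations, product

def membership(X, bounds):
    """Boolean mask: which rows fall inside all intervals in `bounds`.
    `bounds` maps feature index -> (lower, upper); absent features are unrestricted."""
    return [all(lo <= row[j] <= hi for j, (lo, hi) in bounds.items()) for row in X]

def wracc(member, y):
    """Weighted relative accuracy (assumed quality measure)."""
    n, n_sub = len(y), sum(member)
    if n_sub == 0:
        return 0.0
    pos_sub = sum(1 for m, label in zip(member, y) if m and label == 1)
    return (n_sub / n) * (pos_sub / n_sub - sum(y) / n)

def jaccard(a, b):
    """Overlap of two membership masks (assumed similarity measure)."""
    inter = sum(1 for p, q in zip(a, b) if p and q)
    union = sum(1 for p, q in zip(a, b) if p or q)
    return inter / union if union else 1.0

def candidate_intervals(X, j):
    """All intervals whose endpoints are observed values of feature j."""
    values = sorted({row[j] for row in X})
    return [(lo, hi) for i, lo in enumerate(values) for hi in values[i:]]

def search(X, features, max_features, score):
    """Exhaustive search over descriptions using at most `max_features`
    of the allowed `features` (the sparsity constraint)."""
    best_bounds, best_score = {}, score(membership(X, {}))
    for k in range(1, max_features + 1):
        for subset in combinations(features, k):
            grids = (candidate_intervals(X, j) for j in subset)
            for intervals in product(*grids):
                bounds = dict(zip(subset, intervals))
                s = score(membership(X, bounds))
                if s > best_score:
                    best_bounds, best_score = bounds, s
    return best_bounds, best_score

# Toy data: features 0 and 1 are identical, feature 2 is their mirror image.
X = [(1, 1, 5), (2, 2, 4), (3, 3, 3), (4, 4, 2), (5, 5, 1), (6, 6, 0)]
y = [0, 0, 1, 1, 0, 0]

# Sparse subgroup: at most one feature, maximize WRAcc.
original, quality = search(X, [0, 1, 2], 1, lambda m: wracc(m, y))

# Alternative description: forbid the original features, maximize similarity.
original_members = membership(X, original)
allowed = [j for j in range(3) if j not in original]
alternative, similarity = search(X, allowed, 1, lambda m: jaccard(m, original_members))
```

On this toy data, the alternative description selects the same rows as the original subgroup through a different feature, which is only possible because the features are redundant. Exhaustive enumeration is used purely for clarity; it grows quickly with the number of features and distinct values, so it does not scale to realistic datasets.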
Jun-3-2024
- Country:
  - Africa > Sudan (0.04)
  - Asia
  - Europe
    - Poland > Lesser Poland Province
      - Kraków (0.04)
    - Czechia > Prague (0.04)
    - United Kingdom
      - England > Greater London
        - London (0.04)
      - Scotland > City of Edinburgh
        - Edinburgh (0.04)
    - Ireland > Leinster
      - County Dublin > Dublin (0.04)
    - France > Occitanie
      - Hérault > Montpellier (0.04)
    - Belgium
      - Brussels-Capital Region > Brussels (0.04)
      - Flanders > East Flanders
        - Ghent (0.04)
      - Wallonia > Walloon Brabant
        - Louvain-la-Neuve (0.04)
    - Spain
    - Germany
      - Baden-Württemberg > Karlsruhe Region
        - Karlsruhe (0.04)
      - Bavaria > Lower Franconia
        - Würzburg (0.04)
      - Berlin (0.04)
    - Hungary > Budapest
      - Budapest (0.04)
    - Portugal > Porto
      - Porto (0.04)
    - Slovenia > Upper Carniola
      - Municipality of Bled > Bled (0.04)
    - Sweden > Stockholm
      - Stockholm (0.04)
  - North America
    - Canada > Quebec
      - Montreal (0.04)
    - United States
      - District of Columbia > Washington (0.04)
      - Pennsylvania
        - Allegheny County > Pittsburgh (0.04)
        - Philadelphia County > Philadelphia (0.04)
      - Washington > King County
        - Seattle (0.04)
      - Georgia > Fulton County
        - Atlanta (0.04)
      - Texas > El Paso County
        - El Paso (0.04)
      - Illinois > Cook County
        - Chicago (0.04)
      - Massachusetts > Middlesex County
        - Cambridge (0.04)
      - Louisiana > Orleans Parish
        - New Orleans (0.04)
      - Florida > Volusia County
        - Daytona Beach (0.04)
  - South America
    - Brazil > Rio de Janeiro
      - Rio de Janeiro (0.04)
    - Chile > Santiago Metropolitan Region
      - Santiago Province > Santiago (0.04)
- Genre:
  - Research Report > New Finding (0.67)
- Technology: