ALPBench: A Benchmark for Active Learning Pipelines on Tabular Data
Margraf, Valentin, Wever, Marcel, Gilhuber, Sandra, Tavares, Gabriel Marques, Seidl, Thomas, Hüllermeier, Eyke
arXiv.org Artificial Intelligence
In settings where only a limited budget of labeled data can be afforded, active learning seeks to devise query strategies that select the most informative data points for labeling, with the aim of improving the efficiency and performance of learning algorithms. Numerous such query strategies have been proposed and compared in the active learning literature. However, the community still lacks standardized benchmarks for comparing the performance of different query strategies. This holds in particular for combining query strategies with different learning algorithms into active learning pipelines and for examining the impact of the choice of learning algorithm. To close this gap, we propose ALPBench, which facilitates the specification, execution, and performance monitoring of active learning pipelines. It has built-in measures to ensure that evaluations are reproducible, saving the exact dataset splits and hyperparameter settings of the algorithms used. In total, ALPBench consists of 86 real-world tabular classification datasets and 5 active learning settings, yielding 430 active learning problems. To demonstrate its usefulness and broad compatibility with various learning algorithms and query strategies, we conduct an exemplary study evaluating 9 query strategies paired with 8 learning algorithms in 2 different settings.
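The notion of an active learning pipeline, a query strategy paired with a learning algorithm and run under a fixed setting, can be illustrated with a minimal pool-based loop. The sketch below is not ALPBench's API: every name in it (`NearestCentroid`, `margin_sampling`, `random_sampling`, `run_pipeline`, `make_blobs`) is hypothetical, the learner is a toy stand-in, and the setting parameters (`n_init`, `batch`, `rounds`) are assumed values chosen only for illustration. A fixed seed stands in for the idea of saving exact dataset splits.

```python
import math
import random


class NearestCentroid:
    """Toy learner (assumption): one centroid per class, softmax over negative distances."""

    def fit(self, X, y):
        self.classes_ = sorted(set(y))
        self.centroids_ = {}
        for c in self.classes_:
            rows = [x for x, label in zip(X, y) if label == c]
            self.centroids_[c] = [sum(col) / len(rows) for col in zip(*rows)]
        return self

    def predict_proba(self, X):
        probs = []
        for x in X:
            scores = [-math.dist(x, self.centroids_[c]) for c in self.classes_]
            top = max(scores)
            exps = [math.exp(s - top) for s in scores]
            total = sum(exps)
            probs.append([e / total for e in exps])
        return probs


def margin_sampling(learner, X_pool, k, rng):
    """Query strategy: pick the k pool points with the smallest top-two probability margin."""
    def margin(p):
        ranked = sorted(p, reverse=True)
        # With a single known class there is no second probability to compare.
        return ranked[0] - ranked[1] if len(ranked) > 1 else 1.0

    probs = learner.predict_proba(X_pool)
    return sorted(range(len(X_pool)), key=lambda i: margin(probs[i]))[:k]


def random_sampling(learner, X_pool, k, rng):
    """Baseline query strategy: pick k pool points uniformly at random."""
    return rng.sample(range(len(X_pool)), k)


def run_pipeline(X, y, learner, query, n_init=4, batch=2, rounds=5, seed=0):
    """Run one pipeline (learner + query strategy); return the labeled indices."""
    rng = random.Random(seed)  # fixed seed makes the initial split reproducible
    order = list(range(len(X)))
    rng.shuffle(order)
    labeled, pool = order[:n_init], order[n_init:]
    for _ in range(rounds):
        learner.fit([X[i] for i in labeled], [y[i] for i in labeled])
        picks = query(learner, [X[i] for i in pool], min(batch, len(pool)), rng)
        chosen = {pool[j] for j in picks}
        # "Label" the chosen points by moving them from the pool to the labeled set.
        labeled = labeled + [i for i in pool if i in chosen]
        pool = [i for i in pool if i not in chosen]
    learner.fit([X[i] for i in labeled], [y[i] for i in labeled])
    return labeled


def make_blobs(n_per_class=30, seed=0):
    """Synthetic two-class data (assumption; not one of the benchmark datasets)."""
    rng = random.Random(seed)
    X, y = [], []
    for label, (cx, cy) in enumerate([(-2.0, 0.0), (2.0, 0.0)]):
        for _ in range(n_per_class):
            X.append([rng.gauss(cx, 1.0), rng.gauss(cy, 1.0)])
            y.append(label)
    return X, y
```

Swapping `margin_sampling` for `random_sampling`, or `NearestCentroid` for another object with `fit` and `predict_proba`, yields a different pipeline under the same setting, which is the kind of comparison the abstract describes.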
Jun-25-2024
- Country:
- Oceania > Australia
- New South Wales > Sydney (0.14)
- North America
- United States
- Maryland > Baltimore (0.04)
- California > San Francisco County
- San Francisco (0.14)
- Hawaii > Honolulu County
- Honolulu (0.04)
- Rhode Island > Providence County
- Providence (0.04)
- Louisiana > Orleans Parish
- New Orleans (0.04)
- Utah > Salt Lake County
- Salt Lake City (0.04)
- Oregon > Multnomah County
- Portland (0.04)
- Massachusetts > Middlesex County
- Cambridge (0.04)
- Wisconsin > Dane County
- Madison (0.04)
- Colorado
- Denver County > Denver (0.04)
- Boulder County > Boulder (0.04)
- Alaska > Anchorage Municipality
- Anchorage (0.04)
- Pennsylvania > Allegheny County
- Pittsburgh (0.04)
- Canada
- Quebec > Montreal (0.04)
- British Columbia > Metro Vancouver Regional District
- Vancouver (0.14)
- Alberta > Census Division No. 15
- Improvement District No. 9 > Banff (0.04)
- Europe
- Portugal (0.04)
- Ireland > Leinster
- County Dublin > Dublin (0.04)
- Germany > Bavaria
- Upper Bavaria > Munich (0.04)
- Finland > Uusimaa
- Helsinki (0.04)
- Croatia > Split-Dalmatia County
- Split (0.04)
- Asia
- Africa > Rwanda
- Genre:
- Research Report
- New Finding (1.00)
- Experimental Study (1.00)
- Industry:
- Education (1.00)
- Technology: