Provably Neural Active Learning Succeeds via Prioritizing Perplexing Samples

Bu, Dake; Huang, Wei; Suzuki, Taiji; Cheng, Ji; Zhang, Qingfu; Xu, Zhiqiang; Wong, Hau-San

Jun-6-2024–arXiv.org Artificial Intelligence

Neural network-based active learning (NAL) is a cost-effective data selection technique that uses neural networks to select and train on a small subset of samples. While existing work has developed various effective or theoretically justified NAL algorithms, the understanding of the two commonly used query criteria of NAL, uncertainty-based and diversity-based, remains in its infancy. In this work, we move one step forward by offering a unified explanation, from a feature learning view, for the success of NAL under both query criteria. Specifically, we consider a feature-noise data model comprising easy-to-learn or hard-to-learn features disrupted by noise, and analyze 2-layer NN-based NAL in the pool-based scenario. We prove that both uncertainty-based and diversity-based NAL inherently follow one and the same principle: prioritizing samples that contain yet-to-be-learned features. We further prove that this shared principle is the key to their success, namely achieving a small test error with a small labeled set. In contrast, strategy-free passive learning exhibits a large test error because it learns the yet-to-be-learned features inadequately, and therefore requires a significantly larger label complexity to reduce the test error sufficiently. Experimental results validate our findings.
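As a rough illustration of the two query criteria named in the abstract, the sketch below selects samples from an unlabeled pool in two ways. It is not the paper's algorithm or data model: the function names `uncertainty_query` and `diversity_query`, the binary least-confidence score, and the greedy farthest-point rule are all assumptions chosen as common textbook instances of uncertainty-based and diversity-based selection.

```python
import numpy as np

def uncertainty_query(probs, k):
    """Hypothetical uncertainty criterion: return indices of the k pool
    samples whose predicted positive-class probability is closest to 0.5."""
    probs = np.asarray(probs, dtype=float)
    margin = np.abs(probs - 0.5)  # small margin means high uncertainty
    return np.argsort(margin, kind="stable")[:k]

def diversity_query(pool, labeled, k):
    """Hypothetical diversity criterion: greedy farthest-point selection.
    Repeatedly pick the pool sample farthest from every labeled or
    already-picked sample (Euclidean distance in input space)."""
    pool = np.asarray(pool, dtype=float)
    labeled = np.asarray(labeled, dtype=float)
    # Distance from each pool point to its nearest labeled point.
    diffs = pool[:, None, :] - labeled[None, :, :]
    dists = np.linalg.norm(diffs, axis=2).min(axis=1)
    chosen = []
    for _ in range(k):
        idx = int(np.argmax(dists))
        chosen.append(idx)
        # The newly picked point now counts as covered territory.
        dists = np.minimum(dists, np.linalg.norm(pool - pool[idx], axis=1))
    return np.array(chosen)

# Toy pool: model confidences for five samples, and four 2-D inputs.
probs = [0.95, 0.52, 0.10, 0.45, 0.80]
pool = [[0.0, 0.0], [0.1, 0.0], [5.0, 5.0], [10.0, 0.0]]
labeled = [[0.0, 0.0]]

uncertain_idx = uncertainty_query(probs, k=2)
diverse_idx = diversity_query(pool, labeled, k=2)
```

In this toy setup the uncertainty rule favors the samples the model is least sure about, while the diversity rule favors samples far from what is already labeled; the abstract's claim is that, under the paper's assumptions, both kinds of rule end up prioritizing samples with yet-to-be-learned features.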

algorithm, probability, provably neural active learning succeed


Country:
- North America > United States
  - Wisconsin > Dane County
    - Madison (0.04)
  - Illinois > Champaign County
    - Urbana (0.04)
- Europe
  - Austria > Vienna (0.14)
  - United Kingdom > England
    - Cambridgeshire > Cambridge (0.04)
- Asia
  - China > Hong Kong (0.04)
  - Middle East
    - UAE (0.04)
    - Jordan (0.04)
  - Japan > Honshū
    - Kantō > Tokyo Metropolis Prefecture > Tokyo (0.14)

Genre:
- Research Report > New Finding (1.00)

Technology:
- Information Technology > Artificial Intelligence > Machine Learning
  - Statistical Learning (1.00)
  - Neural Networks > Deep Learning (0.67)
  - Learning Graphical Models > Directed Networks
    - Bayesian Learning (0.45)
