Gradient Based Data Subset Selection for Compute-Efficient Hyper-parameter Tuning

Aug-18-2025, 04:33:59 GMT – Neural Information Processing Systems

If we naively perform 1000 training runs with grid search for hyper-parameter tuning (a number that is not uncommon today), the search takes 4000 GPU hours, which works out to about 4 GPU hours per run.
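The arithmetic behind the figure above can be sketched in a few lines. The per-run cost of 4 GPU hours is inferred from the stated totals (4000 / 1000). The `data_fraction` parameter, and the assumption that training cost scales linearly with the fraction of data used, are illustrative and not taken from the paper. The 10% subset is likewise a hypothetical value, not a reported result.

```python
def grid_search_gpu_hours(num_runs: int, hours_per_run: float,
                          data_fraction: float = 1.0) -> float:
    """Estimate total GPU hours for a hyper-parameter search.

    Assumption (illustrative only): the cost of one training run scales
    linearly with the fraction of the training data it uses.
    """
    if not 0.0 < data_fraction <= 1.0:
        raise ValueError("data_fraction must be in (0, 1]")
    return num_runs * hours_per_run * data_fraction


# Figures from the excerpt: 1000 runs at an implied 4 GPU hours each.
full = grid_search_gpu_hours(1000, 4.0)

# Hypothetical: the same search when each run trains on a 10% subset.
subset = grid_search_gpu_hours(1000, 4.0, data_fraction=0.1)

print(f"full data: {full:.0f} GPU hours")
print(f"10% subset (hypothetical): {subset:.0f} GPU hours")
```

Under the linear-cost assumption the saving is simply proportional to the subset size. Real savings would also depend on the overhead of selecting the subset, which this sketch ignores.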

artificial intelligence, machine learning, natural language

Country:
- North America
  - United States
    - Texas (0.04)
    - Washington > King County
      - Seattle (0.04)
    - Georgia > Fulton County
      - Atlanta (0.04)
    - California > Santa Clara County
      - Palo Alto (0.04)
  - Canada > Ontario
    - Toronto (0.14)
- Europe
  - Spain > Andalusia
    - Cádiz Province > Cádiz (0.04)
  - Italy > Tuscany
    - Florence (0.04)
  - Belgium > Brussels-Capital Region
    - Brussels (0.04)
- Asia
  - India (0.04)
  - Middle East > Qatar
    - Ad-Dawhah > Doha (0.04)

Genre:
- Research Report (0.93)

Technology:
- Information Technology > Artificial Intelligence
  - Natural Language (1.00)
  - Representation & Reasoning > Search (0.71)
  - Machine Learning
    - Statistical Learning (1.00)
    - Neural Networks > Deep Learning (1.00)
