Feature Selection and Junta Testing are Statistically Equivalent
Beretta, Lorenzo, Harms, Nathaniel, Koch, Caleb
For a function $f \colon \{0,1\}^n \to \{0,1\}$, the junta testing problem asks whether $f$ depends on only $k$ variables. If $f$ depends on only $k$ variables, the feature selection problem asks to find those variables. We prove that these two tasks are statistically equivalent. Specifically, we show that the ``brute-force'' algorithm, which checks for any set of $k$ variables consistent with the sample, is simultaneously sample-optimal for both problems, and the optimal sample size is \[ Θ\left(\frac 1 \varepsilon \left( \sqrt{2^k \log {n \choose k}} + \log {n \choose k}\right)\right). \]
Jul-23-2025
- Country:
- North America > United States
- Pennsylvania > Allegheny County > Pittsburgh (0.04)
- Europe > United Kingdom
- England > Cambridgeshire > Cambridge (0.04)
- North America > United States
- Genre:
- Research Report (0.40)
- Technology: