Supplementary Material
–Neural Information Processing Systems
For a vector x 2 Rd and H [d], we denote vH to denote the vector that is equal to v on i 2 H, and zero otherwise. For a real-valued random variable X and m 2 N, we use kXkLm to denote (E|X|m)1/m. For a set S Rd and a function f, we also define the set function notation f(S) as {f(x)|x 2 S}. A.1 Finding a stable subset from a stable weighted subset For a set S on npoints, we define n, as the set of weights w 2 Rn such that wi 2 [0,1/((1)n]for all i 2 [n]and P i wi =1 . For a fixed vector µ 2 Rd that will be clear from context, a set of npoints S = {x1,...,x n}, and weights w 2 n, over S, we use w to denote P i wi(xi µ)(xi µ)>. The goal of this section is to show Proposition A.1, which states that if we have a weight w over S such that w (with respect to some vector µ) has bounded Xk norm proportional to 2 for some > 0, then there must exists some large subset S0 S that is stable with respect to µ and . Let S be a set of n points in Rd. Suppose that there exists a w 2 n, such that k wkXk B 2 for some vector µ. Then there exists a subset S0 S such that (i)|S0| (1 2)n and (ii) S0 is (,,k)-stable with respect to µ and, where = O( p B +1). Observe that k wkXk B 2 implies k w 2IkXk (B +1) 2 by the triangle inequality. In order to show Proposition A.1, we show Lemma A.2, which is a weakening of Proposition A.1 where we additionally assume that µw = P i wixi is close to µ, where µ is the vector we use to define w as well as the vector that we want to find a large sample subset S0 to be stable with respect to. To use Lemma A.2, we additionally show Proposition A.4, which states that k wkXk B 2 is enough to imply that µw is close to µ.
Neural Information Processing Systems
Apr-25-2026, 01:35:23 GMT
- Technology: