vc-dimension
The Parameterized Complexity of Computing the VC-Dimension
The VC-dimension is a well-studied and fundamental complexity measure of a set system (or hypergraph) that is central to many areas of machine learning. We establish several new results on the complexity of computing the VC-dimension. In particular, given a hypergraph $H = (V,E)$, we prove that the naive $2^{O(|V|)}$-time algorithm is asymptotically tight under the Exponential Time Hypothesis (ETH). We then prove that the problem admits a 1-additive fixed-parameter approximation algorithm when parameterized by the maximum degree of $H$ and a fixed-parameter algorithm when parameterized by its dimension, and that these are essentially the only such exploitable structural parameters.
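To make the notion of a naive algorithm concrete, here is a minimal brute-force sketch, assuming the hypergraph is given as a list of vertices and a list of hyperedges (sets of vertices). The function names `is_shattered` and `vc_dimension` are illustrative and not taken from the paper. A vertex subset $S$ is shattered when every subset of $S$ arises as $e \cap S$ for some hyperedge $e$; the sketch tries subsets in order of increasing size.

```python
from itertools import combinations


def is_shattered(subset, edges):
    """True if every subset of `subset` equals (e & subset) for some edge e."""
    s = frozenset(subset)
    traces = {s & e for e in edges}  # distinct intersections with the edges
    return len(traces) == 2 ** len(s)


def vc_dimension(vertices, edges):
    """Brute-force VC-dimension; assumes at least one hyperedge is given."""
    edges = {frozenset(e) for e in edges}
    best = 0  # the empty set is shattered by any non-empty edge family
    for k in range(1, len(vertices) + 1):
        # Shattering a k-set needs 2^k distinct traces, hence 2^k distinct edges.
        if 2 ** k > len(edges):
            break
        if any(is_shattered(c, edges) for c in combinations(vertices, k)):
            best = k
        else:
            # Subsets of shattered sets are shattered, so no larger set can work.
            break
    return best


# Intervals on four points: no interval contains the two outer points of a
# triple while missing the middle one, so no 3-set is shattered.
points = [1, 2, 3, 4]
intervals = [set(range(i, j + 1)) for i in points for j in points if i <= j]
intervals.append(set())
print(vc_dimension(points, intervals))  # prints 2
```

The early exit on the number of hyperedges is an added shortcut, not something the abstract describes; dropping it leaves plain enumeration of vertex subsets.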
The Structural Complexity of Matrix-Vector Multiplication
We consider the problem of preprocessing an $n\times n$ matrix $\mathbf{M}$, and supporting queries that, for any vector $v$, return the matrix-vector product $\mathbf{M} v$. This problem has been extensively studied in both theory and practice: on one side, practitioners have developed algorithms that are highly efficient in practice, whereas on the other side, theoreticians have proven that the problem cannot be solved faster than naive multiplication in the worst case. This lower bound holds even in the average case, implying that existing average-case analyses cannot explain this gap between theory and practice. Hence, we study the problem for \emph{structured} matrices. We show that for $n\times n$ Boolean matrices of VC-dimension $d$, the matrix-vector multiplication problem can be solved with $\smash{\tilde{O}(n^2)}$ preprocessing and $\smash{\tilde O(n^{2-1/d})}$ query time.
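As an illustration of what low VC-dimension can mean for a Boolean matrix, the sketch below treats each row as a set over the column indices (this row-versus-column convention is an assumption; the abstract does not fix it) and computes the VC-dimension by brute force. It then checks a toy case: the upper-triangular all-ones matrix has VC-dimension 1, and its product with a vector reduces to suffix sums in linear time, which agrees with the stated $\tilde O(n^{2-1/d})$ bound at $d = 1$. This is not the paper's algorithm, only a hand-picked example; all function names are hypothetical.

```python
from itertools import combinations


def matrix_vc_dimension(M):
    """Largest d such that some d columns show all 2^d 0/1 patterns over the rows."""
    n_rows, n_cols = len(M), len(M[0])
    d = 0
    for k in range(1, n_cols + 1):
        if 2 ** k > n_rows:  # not enough rows to realize 2^k patterns
            break
        shattered = any(
            len({tuple(row[j] for j in cols) for row in M}) == 2 ** k
            for cols in combinations(range(n_cols), k)
        )
        if not shattered:
            break
        d = k
    return d


def naive_matvec(M, v):
    """Baseline O(n^2) matrix-vector product."""
    return [sum(m * x for m, x in zip(row, v)) for row in M]


def suffix_sum_matvec(v):
    """M v for the upper-triangular all-ones matrix M, computed in O(n)."""
    out, running = [0] * len(v), 0
    for i in range(len(v) - 1, -1, -1):
        running += v[i]
        out[i] = running
    return out


n = 5
M = [[1 if j >= i else 0 for j in range(n)] for i in range(n)]
v = [3, -1, 4, 1, -5]
print(matrix_vc_dimension(M))                       # prints 1
print(naive_matvec(M, v) == suffix_sum_matvec(v))   # prints True
```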
PAC-learning in the presence of adversaries
The existence of evasion attacks during the test phase of machine learning algorithms represents a significant challenge to both their deployment and understanding. These attacks can be carried out by adding imperceptible perturbations to inputs to generate adversarial examples, and finding effective defenses and detectors has proven to be difficult. In this paper, we step away from the attack-defense arms race and seek to understand the limits of what can be learned in the presence of an evasion adversary. In particular, we extend the Probably Approximately Correct (PAC)-learning framework to account for the presence of an adversary. We first define corrupted hypothesis classes, which arise from standard binary hypothesis classes in the presence of an evasion adversary, and derive the Vapnik-Chervonenkis (VC)-dimension for these, denoted as the adversarial VC-dimension. We then show that sample complexity upper bounds from the Fundamental Theorem of Statistical Learning can be extended to the case of evasion adversaries, where the sample complexity is controlled by the adversarial VC-dimension. We then explicitly derive the adversarial VC-dimension for halfspace classifiers in the presence of a sample-wise norm-constrained adversary of the type commonly studied for evasion attacks and show that it is the same as the standard VC-dimension, closing an open question. Finally, we prove that the adversarial VC-dimension can be either larger or smaller than the standard VC-dimension depending on the hypothesis class and adversary, making it an interesting object of study in its own right.
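For intuition about a norm-constrained evasion adversary acting on a halfspace classifier, the sketch below evaluates the worst-case 0-1 error of $x \mapsto \operatorname{sign}(w \cdot x + b)$ when each test point may be shifted by at most `eps`. The choice of the $\ell_2$ norm, the convention that a zero margin counts as an error, and the names `robust_margin` and `adversarial_error` are assumptions made for this example; the abstract does not specify them. Under an $\ell_2$ budget, the adversary's best move is to push along $-y\,w$, which lowers the signed margin $y(w \cdot x + b)$ by exactly $\text{eps} \cdot \lVert w \rVert_2$.

```python
import math


def robust_margin(w, b, x, y, eps):
    """Worst-case signed margin of (x, y) over perturbations with l2 norm <= eps."""
    clean = y * (sum(wi * xi for wi, xi in zip(w, x)) + b)
    return clean - eps * math.sqrt(sum(wi * wi for wi in w))


def adversarial_error(w, b, data, eps):
    """Fraction of labeled points the adversary can force to be misclassified."""
    errors = sum(1 for x, y in data if robust_margin(w, b, x, y, eps) <= 0)
    return errors / len(data)


# Labels are +1 / -1; the classifier thresholds the first coordinate at 0.
w, b = (1.0, 0.0), 0.0
data = [((2.0, 0.0), 1), ((0.5, 3.0), 1), ((-1.0, 1.0), -1), ((1.0, 1.0), -1)]
print(adversarial_error(w, b, data, eps=0.0))   # prints 0.25
print(adversarial_error(w, b, data, eps=0.75))  # prints 0.5
```

With `eps=0.0` this is the ordinary error rate; increasing the budget can only add errors, here by flipping the point that lies within distance 0.75 of the decision boundary.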