compression scheme
Collaborative Compressors in Distributed Mean Estimation with Limited Communication Budget
Harsh Vardhan, Arya Mazumdar
Distributed high-dimensional mean estimation is a common aggregation routine in distributed optimization methods. Most of these applications call for a communication-constrained setting, where the vectors whose mean is to be estimated must be compressed before sharing. One could independently encode and decode these vectors to achieve compression, but that overlooks the fact that they are often close to each other. To exploit these similarities, Suresh et al. (2022), Jhunjhunwala et al. (2021), and Jiang et al. (2023) recently proposed multiple correlation-aware compression schemes. However, in most cases the correlations must be known for these schemes to work. Moreover, theoretical analysis of how these correlation-aware schemes degrade gracefully with increasing dissimilarity is limited to the $\ell_2$-error in the literature. In this paper, we propose four different collaborative compression schemes that agnostically exploit the similarities among vectors in a distributed setting. Our schemes are all simple to implement and computationally efficient, while yielding substantial savings in communication. The analysis of our proposed schemes shows how the $\ell_2$, $\ell_\infty$, and cosine estimation errors vary with the degree of similarity among the vectors.
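To ground the setting, the sketch below (our own illustration, not one of the paper's four schemes) simulates the correlation-agnostic baseline: each client independently compresses its vector with an unbiased 1-bit stochastic quantizer, and the server averages the decoded vectors. The perturbation scale and quantizer are assumptions for the demo.

```python
import numpy as np

def one_bit_quantize(v, rng):
    """Unbiased 1-bit stochastic quantizer: each coordinate is encoded as
    v.min() or v.max(), chosen at random so the expectation equals v."""
    lo, hi = v.min(), v.max()
    if hi == lo:                       # constant vector: nothing to randomize
        return np.full_like(v, lo)
    p = (v - lo) / (hi - lo)           # probability of sending the high level
    bits = rng.random(v.shape) < p
    return np.where(bits, hi, lo)      # decoder needs only (lo, hi) + 1 bit/coord

rng = np.random.default_rng(0)
d, n_clients = 1000, 20
base = rng.normal(size=d)
# Clients hold similar vectors: a shared component plus a small perturbation.
vectors = [base + 0.1 * rng.normal(size=d) for _ in range(n_clients)]

true_mean = np.mean(vectors, axis=0)
est_mean = np.mean([one_bit_quantize(v, rng) for v in vectors], axis=0)
print("l2 error:", np.linalg.norm(est_mean - true_mean))
```

The independent quantizer's error is oblivious to how close the clients' vectors are; collaborative schemes of the kind the paper proposes aim to shrink the error as that similarity grows.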
Unlabelled Sample Compression Schemes for Intersection-Closed Classes and Extremal Classes
The sample compressibility of concept classes plays an important role in learning theory, as a sufficient condition for PAC learnability and, more recently, as an avenue for robust generalisation in adaptive data analysis. Whether compression schemes of size $O(d)$ must necessarily exist for all classes of VC dimension $d$ is unknown, but conjectured to be true by Warmuth. Recently, Chalopin, Chepoi, Moran, and Warmuth (2018) gave a beautiful unlabelled sample compression scheme of size equal to the VC dimension for all maximum classes: classes that meet the Sauer-Shelah-Perles Lemma with equality. They also offered a counterexample to compression schemes based on a promising approach known as corner peeling. In this paper we simplify and extend their proof technique to deal with so-called extremal classes of VC dimension $d$, which contain maximum classes of VC dimension $d-1$. We give a criterion which would imply that all extremal classes admit unlabelled compression schemes of size $d$. We also prove that all intersection-closed classes with VC dimension $d$ admit unlabelled compression schemes of size at most $11d$.
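For intuition about what an unlabelled compression scheme is, here is a toy example of our own (not from the paper): one-dimensional thresholds have VC dimension 1 and admit an unlabelled scheme of size 1. Compression keeps just the smallest positive example, without its label; reconstruction decodes it back to a consistent threshold.

```python
def compress(sample):
    """Unlabelled compression: keep at most one *unlabelled* point.
    `sample` is a list of (x, y) pairs realizable by some threshold
    concept h_t(x) = 1 iff x >= t."""
    positives = [x for x, y in sample if y == 1]
    return [min(positives)] if positives else []

def reconstruct(kept):
    """Rebuild a hypothesis consistent with the full sample from the
    unlabelled compressed set alone."""
    if not kept:
        return lambda x: 0             # empty set decodes to all-negative
    t = kept[0]
    return lambda x: int(x >= t)

# A sample realizable by any threshold in (2.0, 3.0]
sample = [(1.0, 0), (2.0, 0), (3.0, 1), (4.0, 1)]
h = reconstruct(compress(sample))
assert all(h(x) == y for x, y in sample)   # consistent on every sample point
```

The schemes for extremal and intersection-closed classes in the paper are far more involved, but they follow the same contract: compress to a few unlabelled points, then reconstruct a hypothesis consistent with the entire sample.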
Impossibility Results for Grammar-Compressed Linear Algebra
To handle vast amounts of data, it is natural and popular to compress vectors and matrices. When we compress a vector from size $N$ down to size $n \ll N$, it certainly becomes easier to store and transmit, but does it also become easier to process? In this paper we consider lossless compression schemes, and ask whether we can run our computations on the compressed data as efficiently as if the original data were that small. That is, if an operation has time complexity $T(\text{input size})$, can we perform it on the compressed representation in time $T(n)$ rather than $T(N)$? We consider the most basic linear algebra operations: inner product, matrix-vector multiplication, and matrix multiplication.
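To see what a positive answer looks like in a simple special case (our illustration, not the paper's constructions): run-length encoding, which a grammar/SLP can express with one rule per run, supports an inner product computed directly on the compressed pairs in time proportional to the compressed sizes rather than $N$.

```python
def rle_inner_product(a, b):
    """Inner product of two run-length-encoded vectors of equal
    uncompressed length N, in O(len(a) + len(b)) time.
    Each vector is a list of (value, run_length) pairs."""
    i = j = 0
    ra = rb = 0                        # remaining lengths of the current runs
    total = 0
    while i < len(a) and j < len(b):
        if ra == 0:
            va, ra = a[i]
        if rb == 0:
            vb, rb = b[j]
        step = min(ra, rb)             # overlap of the two current runs
        total += va * vb * step        # each overlapping position adds va*vb
        ra -= step
        rb -= step
        if ra == 0:
            i += 1
        if rb == 0:
            j += 1
    return total

# [1,1,1,2,2] . [3,3,4,4,4] = 3 + 3 + 4 + 8 + 8 = 26
assert rle_inner_product([(1, 3), (2, 2)], [(3, 2), (4, 3)]) == 26
```

The paper's impossibility results concern general grammar compression, where runs can nest and repeat; the point of the toy case is only to make the question concrete.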
Eigenvalue Decay Implies Polynomial-Time Learnability for Neural Networks
We consider the problem of learning function classes computed by neural networks with various activations (e.g. ReLU or Sigmoid), a task believed to be computationally intractable in the worst case. A major open problem is to understand the minimal assumptions under which these classes admit provably efficient algorithms. In this work we show that a natural distributional assumption, corresponding to eigenvalue decay of the Gram matrix, yields polynomial-time algorithms in the non-realizable setting for expressive classes of networks.
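As a rough illustration of the kind of spectral assumption involved (our sketch, not the paper's algorithm; the RBF kernel and all parameter choices below are ours), one can form a kernel Gram matrix on sampled data and observe how quickly its eigenvalues decay.

```python
import numpy as np

# Draw samples and form an RBF Gram matrix K_ij = exp(-||x_i - x_j||^2 / 2).
rng = np.random.default_rng(0)
m, d = 200, 5
X = rng.normal(size=(m, d))
sq_dists = ((X[:, None, :] - X[None, :, :]) ** 2).sum(-1)
K = np.exp(-sq_dists / 2.0)

# Eigenvalue decay of the (PSD) Gram matrix: for smooth kernels such as
# the RBF kernel, the spectrum typically drops off rapidly, so a few top
# eigenvalues capture most of the trace.
eigs = np.linalg.eigvalsh(K)[::-1]     # sorted in descending order
print("top 10 eigenvalues:", np.round(eigs[:10], 3))
print("fraction of trace in top 10:", eigs[:10].sum() / eigs.sum())
```

Assumptions of this flavor, where the Gram matrix's spectrum decays at a prescribed rate, are what the paper converts into polynomial-time learning guarantees.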