provably strict generalisation benefit
Provably Strict Generalisation Benefit for Invariance in Kernel Methods
It is a commonly held belief that enforcing invariance improves generalisation. Although this approach enjoys widespread popularity, it is only very recently that a rigorous theoretical demonstration of this benefit has been established. In this work we build on the function space perspective of Elesedy and Zaidi [8] to derive a strictly non-zero generalisation benefit of incorporating invariance in kernel ridge regression when the target is invariant to the action of a compact group. We study invariance enforced by feature averaging and find that generalisation is governed by a notion of effective dimension that arises from the interplay between the kernel and the group. In building towards this result, we find that the action of the group induces an orthogonal decomposition of both the reproducing kernel Hilbert space and its kernel, which may be of interest in its own right.
Provably Strict Generalisation Benefit for Invariance in Kernel Methods
It is a commonly held belief that enforcing invariance improves generalisation. Although this approach enjoys widespread popularity, it is only very recently that a rigorous theoretical demonstration of this benefit has been established. In this work we build on the function space perspective of Elesedy and Zaidi [8] to derive a strictly non-zero generalisation benefit of incorporating invariance in kernel ridge regression when the target is invariant to the action of a compact group. We study invariance enforced by feature averaging and find that generalisation is governed by a notion of effective dimension that arises from the interplay between the kernel and the group. In building towards this result, we find that the action of the group induces an orthogonal decomposition of both the reproducing kernel Hilbert space and its kernel, which may be of interest in its own right.
Trustworthy Machine Learning under Social and Adversarial Data Sources
Machine learning has witnessed remarkable breakthroughs in recent years. As machine learning permeates various aspects of daily life, individuals and organizations increasingly interact with these systems, exhibiting a wide range of social and adversarial behaviors. These behaviors may have a notable impact on the behavior and performance of machine learning systems. Specifically, during these interactions, data may be generated by strategic individuals, collected by self-interested data collectors, possibly poisoned by adversarial attackers, and used to create predictors, models, and policies satisfying multiple objectives. As a result, the machine learning systems' outputs might degrade, such as the susceptibility of deep neural networks to adversarial examples (Shafahi et al., 2018; Szegedy et al., 2013) and the diminished performance of classic algorithms in the presence of strategic individuals (Ahmadi et al., 2021). Addressing these challenges is imperative for the success of machine learning in societal settings.
- Europe > Kosovo > District of Gjilan > Kamenica (0.04)
- Asia > Middle East > Jordan (0.04)
- North America > United States > Illinois > Cook County > Chicago (0.04)
- (15 more...)
- Workflow (0.92)
- Research Report > New Finding (0.92)
- Health & Medicine > Pharmaceuticals & Biotechnology (1.00)
- Government (1.00)
- Education > Educational Setting (1.00)
- (4 more...)
Provably Strict Generalisation Benefit for Equivariant Models
Elesedy, Bryn, Zaidi, Sheheryar
It is widely believed that engineering a model to be invariant/equivariant improves generalisation. Despite the growing popularity of this approach, a precise characterisation of the generalisation benefit is lacking. By considering the simplest case of linear models, this paper provides the first provably non-zero improvement in generalisation for invariant/equivariant models when the target distribution is invariant/equivariant with respect to a compact group. Moreover, our work reveals an interesting relationship between generalisation, the number of training examples and properties of the group action. Our results rest on an observation of the structure of function spaces under averaging operators which, along with its consequences for feature averaging, may be of independent interest.
- Europe > United Kingdom > England > Oxfordshire > Oxford (0.14)
- Europe > United Kingdom > England > Cambridgeshire > Cambridge (0.04)
- North America > United States > New York (0.04)