data sample
Breaking Data Symmetry is Needed For Generalization in Feature Learning Kernels
Bernal, Marcel Tomàs, Mallinar, Neil Rohit, Belkin, Mikhail
Grokking occurs when a model achieves high training accuracy but generalizes to unseen test points only long afterward. This phenomenon was initially observed on a class of algebraic problems, such as learning modular arithmetic (Power et al., 2022). We study grokking on algebraic tasks in a class of feature learning kernels via the Recursive Feature Machine (RFM) algorithm (Radhakrishnan et al., 2024), which iteratively updates feature matrices through the Average Gradient Outer Product (AGOP) of an estimator in order to learn task-relevant features. Our main experimental finding is that generalization occurs only when a certain symmetry in the training set is broken. Furthermore, we empirically show that RFM generalizes by recovering the underlying invariance group action inherent in the data. We find that the learned feature matrices encode specific elements of the invariance group, which explains why generalization depends on symmetry.
- North America > United States (0.28)
- Africa > Middle East > Morocco > Tanger-Tetouan-Al Hoceima Region > Tangier (0.04)
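As a rough illustration of the Average Gradient Outer Product mentioned in the abstract above, the following sketch averages the outer products of an estimator's input gradients over a set of points. It is not the authors' implementation: it assumes a scalar-valued estimator, approximates gradients by central finite differences, and the names `numerical_gradient`, `agop`, and `toy_estimator` are hypothetical.

```python
def numerical_gradient(f, x, eps=1e-6):
    """Central finite-difference gradient of a scalar function f at point x."""
    grad = []
    for i in range(len(x)):
        up = list(x)
        down = list(x)
        up[i] += eps
        down[i] -= eps
        grad.append((f(up) - f(down)) / (2 * eps))
    return grad


def agop(f, points):
    """Average over the points of the outer product grad f(x) grad f(x)^T."""
    d = len(points[0])
    n = len(points)
    G = [[0.0] * d for _ in range(d)]
    for x in points:
        g = numerical_gradient(f, x)
        for i in range(d):
            for j in range(d):
                G[i][j] += g[i] * g[j] / n
    return G


# Hypothetical toy estimator that depends only on the first coordinate,
# so the resulting matrix should put all its weight on entry (0, 0).
def toy_estimator(x):
    return 3.0 * x[0]


sample_points = [[0.0, 1.0], [2.0, -1.0]]
feature_matrix = agop(toy_estimator, sample_points)
```

In this toy case the matrix concentrates on the single coordinate the estimator uses, which is the sense in which such a matrix can highlight task-relevant input directions.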
Bilevel Distance Metric Learning for Robust Image Recognition
Metric learning, which aims to learn a discriminative Mahalanobis distance matrix M that effectively reflects the similarity between data samples, has been widely studied in various image recognition problems. Most existing metric learning methods take as input features extracted directly from the original data in a preprocessing phase. Worse, these features usually take no account of the local geometrical structure of the data or of the noise present in the data, so they may not be optimal for the subsequent metric learning task. In this paper, we integrate both feature extraction and metric learning into one joint optimization framework and propose a new bilevel distance metric learning model. Specifically, the lower level characterizes the intrinsic data structure using graph regularized sparse coefficients, while the upper level forces data samples from the same class to be close to each other and pushes those from different classes far apart. In addition, leveraging the KKT conditions and the alternating direction method (ADM), we derive an efficient algorithm to solve the proposed model. Extensive experiments on various occluded datasets demonstrate the effectiveness and robustness of our method.
- North America > United States > Minnesota > Hennepin County > Minneapolis (0.14)
- North America > United States > Pennsylvania (0.04)
- North America > Canada > Ontario > Toronto (0.04)
- (6 more...)
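For readers unfamiliar with the Mahalanobis distance matrix M referred to in the abstract above, here is a minimal sketch of the distance it induces, d_M(x, y) = sqrt((x - y)^T M (x - y)). It only evaluates the distance for a given M and does not reproduce the paper's bilevel learning procedure; the function name and the example matrices are hypothetical.

```python
import math


def mahalanobis_distance(x, y, M):
    """Distance sqrt((x - y)^T M (x - y)) for a positive semidefinite matrix M."""
    diff = [a - b for a, b in zip(x, y)]
    # Matrix-vector product M @ diff, computed row by row.
    m_diff = [sum(m_ij * d_j for m_ij, d_j in zip(row, diff)) for row in M]
    quadratic_form = sum(d_i * md_i for d_i, md_i in zip(diff, m_diff))
    return math.sqrt(quadratic_form)


x = [1.0, 2.0]
y = [3.0, 5.0]

# With M equal to the identity, the distance reduces to the Euclidean distance.
identity = [[1.0, 0.0], [0.0, 1.0]]
d_euclidean = mahalanobis_distance(x, y, identity)

# A diagonal M reweights coordinates: here the second coordinate is ignored.
weighted = [[4.0, 0.0], [0.0, 0.0]]
d_weighted = mahalanobis_distance(x, y, weighted)
```

Learning M amounts to choosing which directions in feature space should count as "close" or "far", which is what the upper level of the model described above is said to optimize.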
FedLLM-Bench: Realistic Benchmarks for Federated Learning of Large Language Models (Supplementary Materials: 1 Dataset, 1.1 Links and Preservation)
The Croissant metadata record is available at croissant. We store our code on GitHub and our dataset on Google Drive. Both are widely recognized as reliable data storage platforms, ensuring long-term preservation. We highly recommend downloading the raw data directly and following the provided instructions to simplify the data processing steps. Our dataset is structured as follows: the local directory contains client-specific data for local training, while the all clients directory aggregates data from all clients for federated learning.
- North America > United States > Virginia (0.04)
- Asia > China > Shanghai > Shanghai (0.04)
- Europe > Middle East > Malta > Eastern Region > Northern Harbour District > St. Julian's (0.04)
- Asia > Myanmar > Tanintharyi Region > Dawei (0.04)
- North America > United States (0.04)
- Asia > China > Tianjin Province > Tianjin (0.04)
- Contests & Prizes (0.71)
- Research Report > New Finding (0.48)
- Asia > Middle East > Republic of Türkiye (0.14)
- Europe > Portugal (0.04)
- Europe > Germany (0.04)
- (35 more...)
- Research Report > New Finding (1.00)
- Research Report > Experimental Study (1.00)
- Questionnaire & Opinion Survey (1.00)
- Media > News (1.00)
- Law (1.00)
- Information Technology > Security & Privacy (1.00)
- (2 more...)
- Health & Medicine > Therapeutic Area > Neurology (1.00)
- Health & Medicine > Health Care Technology (0.93)
- Information Technology (0.67)
- Europe > Germany (0.04)
- Europe > France > Provence-Alpes-Côte d'Azur > Bouches-du-Rhône > Marseille (0.04)
- North America > Central America (0.04)
- (15 more...)
- Research Report > Experimental Study (0.93)
- Research Report > New Finding (0.67)
- Media > News (0.67)
- Education (0.67)
- Information Technology > Security & Privacy (0.46)