AITopics | alignment coefficient

alignment coefficient

Quantifying the Importance of Data Alignment in Downstream Model Performance

Chawla, Krrish, Sahai, Aryan, DePavia, Mario, Sundar, Sudharsan, Miranda, Brando

arXiv.org Artificial Intelligence, Jan-14-2025

Contrary to the conventional emphasis on dataset size, we explore the role of data alignment -- an often overlooked aspect of data quality -- in training capable Large Language Models (LLMs). To do so, we use the Task2Vec-based alignment coefficient, a quantitative measure of the similarity between two datasets, to quantify the impact of alignment between training data and evaluation data on downstream performance. In particular, we conduct controlled interventional experiments for two settings: (1) the impact of increased alignment coefficients between various pre-training (pt) datasets and evaluation datasets, and (2) the impact of increased alignment coefficients between domain-specific fine-tuning (ft) data and domain-specific evaluation data. The domain-specific task we explore is Autoformalization -- the machine translation task between natural language and code for formal verification. In both settings, we find a strong, predictable negative correlation between the alignment coefficient of a model's training and evaluation data and the model's loss/perplexity on the respective downstream task. These findings suggest a re-evaluation of LLM training approaches, demonstrating the relevance of data alignment compared to data quantity, especially in specialized downstream tasks such as Autoformalization.
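The abstract describes the alignment coefficient only as a Task2Vec-based similarity measure between two datasets. As a rough illustration, the sketch below assumes that each dataset has already been reduced to a list of batch-level embedding vectors, and that alignment is taken as one minus the mean pairwise cosine distance between those embeddings. Both assumptions, and the function names `cosine_distance` and `alignment_coefficient`, are hypothetical and not taken from the listing; the Task2Vec embedding step itself is not shown.

```python
import math


def cosine_distance(u, v):
    """Cosine distance between two non-zero vectors of equal length."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return 1.0 - dot / (norm_u * norm_v)


def alignment_coefficient(embeddings_a, embeddings_b):
    """Assumed form: 1 minus the mean pairwise cosine distance.

    embeddings_a, embeddings_b: lists of batch embeddings, one list per
    dataset. In practice these would come from a Task2Vec-style probe
    network; here they are plain lists of floats.
    """
    distances = [
        cosine_distance(u, v) for u in embeddings_a for v in embeddings_b
    ]
    return 1.0 - sum(distances) / len(distances)


if __name__ == "__main__":
    # Placeholder vectors for illustration only, not real embeddings.
    train_batches = [[1.0, 0.2, 0.0], [0.9, 0.1, 0.1]]
    eval_batches = [[1.0, 0.0, 0.0], [0.8, 0.3, 0.0]]
    print(alignment_coefficient(train_batches, eval_batches))
```

Under this assumed definition, datasets whose batch embeddings point in the same direction score close to 1, and datasets with orthogonal embeddings score close to 0.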

large language model, machine learning, natural language, (16 more...)

2501.08496

Country: North America > United States > California > Santa Clara County (0.15)

Genre: Research Report > New Finding (1.00)

Technology:

Information Technology > Artificial Intelligence > Natural Language > Large Language Model (1.00)
Information Technology > Artificial Intelligence > Machine Learning (1.00)

Over-the-Air Federated Averaging with Limited Power and Privacy Budgets

Yan, Na, Wang, Kezhi, Pan, Cunhua, Chai, Kok Keong, Shu, Feng, Wang, Jiangzhou

arXiv.org Artificial Intelligence, May-5-2023

To jointly overcome the communication bottleneck and privacy leakage of wireless federated learning (FL), this paper studies a differentially private over-the-air federated averaging (DP-OTA-FedAvg) system with a limited sum power budget. With DP-OTA-FedAvg, the gradients are aligned by an alignment coefficient and aggregated over the air, and channel noise is employed to protect privacy. We aim to improve the learning performance by jointly designing the device scheduling, alignment coefficient, and the number of aggregation rounds of federated averaging (FedAvg) subject to sum power and privacy constraints. We first present the privacy analysis based on differential privacy (DP) to quantify the impact of the alignment coefficient on privacy preservation in each communication round. Furthermore, to study how the device scheduling, alignment coefficient, and the number of global aggregation rounds affect the learning process, we conduct the convergence analysis of DP-OTA-FedAvg in the cases of convex and non-convex loss functions. Based on these analytical results, we formulate an optimization problem to minimize the optimality gap of DP-OTA-FedAvg subject to limited sum power and privacy budgets. The problem is solved by decoupling it into two sub-problems. Given the number of communication rounds, we derive the relationship between the number of scheduled devices and the alignment coefficient, which offers a set of potentially optimal solution pairs of device scheduling and the alignment coefficient. Thanks to the reduced search space, the optimal solution can be efficiently obtained. The effectiveness of the proposed policy is validated through simulations.
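The role of the alignment coefficient in over-the-air aggregation can be illustrated with a minimal sketch. It assumes a common textbook model that the abstract does not spell out: real-valued channel gains, each scheduled device pre-scaling its gradient by `alpha / h_k` so that every gradient arrives with the same amplitude `alpha`, and additive Gaussian noise at the receiver. The function name `ota_fedavg_round` and all parameters are hypothetical; this is not the paper's algorithm, and it omits gradient clipping, device scheduling, and the power constraint.

```python
import random


def ota_fedavg_round(gradients, channel_gains, alpha, noise_std, rng):
    """One noisy over-the-air averaging step under the assumed model.

    gradients:     list of per-device gradient vectors (lists of floats)
    channel_gains: list of non-zero real channel gains h_k, one per device
    alpha:         alignment coefficient (common received amplitude)
    noise_std:     standard deviation of the receiver noise
    rng:           random.Random instance, passed in for reproducibility
    """
    num_devices = len(gradients)
    dim = len(gradients[0])

    # Each device inverts its own channel so that h_k * x_k == alpha * g_k.
    transmitted = [
        [(alpha / h) * g for g in grad]
        for grad, h in zip(gradients, channel_gains)
    ]

    # The channel superposes all signals and adds noise per coordinate.
    received = []
    for j in range(dim):
        superposed = sum(h * x[j] for h, x in zip(channel_gains, transmitted))
        received.append(superposed + rng.gauss(0.0, noise_std))

    # The server rescales to obtain a noisy estimate of the average gradient.
    return [r / (num_devices * alpha) for r in received]


if __name__ == "__main__":
    rng = random.Random(0)
    grads = [[1.0, 2.0], [3.0, 4.0]]
    gains = [0.5, 2.0]
    print(ota_fedavg_round(grads, gains, alpha=0.1, noise_std=0.05, rng=rng))
```

In this sketch the effective noise on the averaged gradient has standard deviation `noise_std / (num_devices * alpha)`, so a larger `alpha` gives a cleaner estimate but requires more transmit power and leaves less noise to mask individual gradients. That is one way to read the abstract's coupling between the alignment coefficient, the power budget, and the privacy budget.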

alignment coefficient, artificial intelligence, machine learning, (13 more...)

2305.03547

Country:

Asia > China > Jiangsu Province > Nanjing (0.04)
Europe > United Kingdom (0.04)
Europe > Italy > Lazio > Rome (0.04)

Genre: Research Report (0.84)

Industry: Information Technology > Security & Privacy (1.00)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning > Optimization (0.90)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks (0.68)
