Scaling Law Analysis in Federated Learning: How to Select the Optimal Model Size?

Chen, Xuanyu, Yang, Nan, Wang, Shuai, Yuan, Dong

arXiv.org Artificial Intelligence

The recent success of large language models (LLMs) has sparked a growing interest in training large-scale models. As model sizes continue to scale, concerns are growing about the depletion of high-quality, well-curated training data. This has led practitioners to explore training approaches like Federated Learning (FL), which can leverage the abundant data on edge devices while maintaining privacy. However, the decentralization of training datasets in FL introduces challenges to scaling large models, a topic that remains under-explored. This paper fills this gap and provides qualitative insights on generalizing previous model scaling experience to federated learning scenarios. Specifically, we derive a PAC-Bayes (Probably Approximately Correct Bayesian) upper bound for the generalization error of models trained with stochastic algorithms in federated settings and quantify the impact of distributed training data on the optimal model size by finding the analytic solution of the model size that minimizes this bound. Our theoretical results demonstrate that the optimal model size has a negative power law relationship with the number of clients if the total training compute is unchanged. In addition, we find that switching to FL with the same training compute will inevitably reduce the upper bound of generalization performance that the model can achieve through training, and that estimating the optimal model size in federated scenarios should depend on the average training compute across clients. Finally, we empirically validate the correctness of our results with extensive training runs on different models, network settings, and datasets.
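The abstract's headline result, a negative power law between optimal model size and the number of clients at fixed total compute, can be sketched numerically. The functional form below and the constants A, a, b are hypothetical illustrations of such a power law, not the paper's derived expression:

```python
# Illustrative sketch (not the paper's derivation): suppose the optimal
# model size at fixed total compute C scales as N_opt(K) = A * C^a * K^(-b)
# in the number of clients K. A, a, b are hypothetical constants.
def optimal_model_size(total_compute, num_clients, A=0.1, a=0.5, b=0.3):
    return A * total_compute**a * num_clients**(-b)

C = 1e21  # fixed total training FLOPs (hypothetical)
sizes = [optimal_model_size(C, K) for K in (1, 10, 100)]
# More clients -> smaller optimal model at the same total compute.
assert sizes[0] > sizes[1] > sizes[2]
```

The qualitative takeaway matches the abstract: holding compute fixed, distributing training across more clients pushes the compute-optimal model size down.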


Federal Reserve Communication and the COVID-19 Pandemic

Benchimol, Jonathan, Kazinnik, Sophia, Saadon, Yossi

arXiv.org Machine Learning

In this study, we examine the Federal Reserve's communication strategies during the COVID-19 pandemic, comparing them with communication during previous periods of economic stress. Using specialized dictionaries tailored to COVID-19, unconventional monetary policy (UMP), and financial stability, combined with sentiment analysis and topic modeling techniques, we identify a distinct focus in Fed communication during the pandemic on financial stability, market volatility, social welfare, and UMP, characterized by notable contextual uncertainty. Through comparative analysis, we juxtapose the Fed's communication during the COVID-19 crisis with its responses during the dot-com and global financial crises, examining content, sentiment, and timing dimensions. Our findings reveal that Fed communication and policy actions were more reactive to the COVID-19 crisis than to previous crises. Additionally, declining sentiment related to financial stability in interest rate announcements and minutes anticipated subsequent accommodative monetary policy decisions. We further document that communicating about UMP has become the "new normal" for the Fed's Federal Open Market Committee meeting minutes and Chairman's speeches since the Global Financial Crisis, reflecting an institutional adaptation in communication strategy following periods of economic distress. These findings contribute to our understanding of how central bank communication evolves during crises and how communication strategies adapt to exceptional economic circumstances.


Parameter-free entropy-regularized multi-view clustering with hierarchical feature selection

Sinaga, Kristina P., Colantonio, Sara, Yang, Miin-Shen

arXiv.org Artificial Intelligence

Multi-view clustering faces critical challenges in automatically discovering patterns across heterogeneous data while managing high-dimensional features and eliminating irrelevant information. Traditional approaches suffer from manual parameter tuning and lack principled cross-view integration mechanisms. This work introduces two complementary algorithms, AMVFCM-U and AAMVFCM-U, providing a unified parameter-free framework. Our approach replaces fuzzification parameters with entropy regularization terms that enforce adaptive cross-view consensus. The core innovation employs signal-to-noise-ratio-based regularization for principled feature weighting with convergence guarantees, coupled with dual-level entropy terms that automatically balance view and feature contributions. AAMVFCM-U extends this with hierarchical dimensionality reduction operating at the feature and view levels through adaptive thresholding. Evaluation across five diverse benchmarks demonstrates superiority over 15 state-of-the-art methods. AAMVFCM-U achieves up to 97% computational efficiency gains, reduces dimensionality to 0.45% of the original size, and automatically identifies critical view combinations for optimal pattern discovery. Keywords: Multi-view clustering, Dimensionality reduction, Feature selection, Parameter-free, Signal-to-noise ratio, Fuzzy c-means.
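The key mechanism named in the abstract, replacing a fuzzifier exponent with an entropy regularization term, has a well-known closed-form consequence: the weights become a softmax of the negated costs. The sketch below shows that general idea only; the cost values and temperature `gamma` are hypothetical, and the paper's actual update rules may differ:

```python
import numpy as np

def entropy_regularized_weights(costs, gamma=1.0):
    """Minimize sum_v w_v*costs_v + gamma*sum_v w_v*log(w_v)
    subject to sum_v w_v = 1, w_v >= 0.
    The solution is the softmax of -costs/gamma (no fuzzifier needed)."""
    z = np.exp(-np.asarray(costs, dtype=float) / gamma)
    return z / z.sum()

# Three hypothetical views with different within-cluster costs.
w = entropy_regularized_weights([2.0, 1.0, 4.0])
assert abs(w.sum() - 1.0) < 1e-12
assert w[1] == w.max()  # the lowest-cost view gets the largest weight
```

This is why such schemes are "parameter-free" in the fuzzifier sense: the entropy term yields adaptive, closed-form weights rather than weights tuned through an exponent.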


Targeted Data Fusion for Causal Survival Analysis Under Distribution Shift

Liu, Yi, Levis, Alexander W., Zhu, Ke, Yang, Shu, Gilbert, Peter B., Han, Larry

arXiv.org Machine Learning

Causal inference across multiple data sources has the potential to improve the generalizability, transportability, and replicability of scientific findings. However, data integration methods for time-to-event outcomes -- common in medical contexts such as clinical trials -- remain underdeveloped. Existing data fusion methods focus on binary or continuous outcomes, neglecting the distinct challenges of survival analysis, including right-censoring and the unification of discrete and continuous time frameworks. To address these gaps, we propose two novel approaches for multi-source causal survival analysis. First, considering a target site-specific causal effect, we introduce a semiparametric efficient estimator for scenarios where data-sharing is feasible. Second, we develop a federated learning framework tailored to privacy-constrained environments. This framework dynamically adjusts source site-specific contributions, downweighting biased sources and upweighting less biased ones relative to the target population. Both approaches incorporate nonparametric machine learning models to enhance robustness and efficiency, with theoretical guarantees applicable to both continuous and discrete time-to-event outcomes. We demonstrate the practical utility of our methods through extensive simulations and an application to two randomized trials of a monoclonal neutralizing antibody for HIV-1 prevention: HVTN 704/HPTN 085 (cisgender men and transgender persons in the Americas and Switzerland) and HVTN 703/HPTN 081 (women in sub-Saharan Africa). The results highlight the potential of our approaches to efficiently estimate causal effects while addressing heterogeneity across data sources and adhering to privacy and robustness constraints.
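The federated framework's behavior of "downweighting biased sources and upweighting less biased ones relative to the target population" can be illustrated with a toy weighting rule. The exponential-quadratic penalty and `lam` below are assumptions for illustration, not the paper's estimator:

```python
import numpy as np

def adaptive_source_weights(source_estimates, target_estimate, lam=1.0):
    """Toy aggregation rule: sources whose effect estimates deviate more
    from the target-site estimate receive smaller normalized weights."""
    bias2 = (np.asarray(source_estimates, dtype=float) - target_estimate) ** 2
    raw = np.exp(-lam * bias2)
    return raw / raw.sum()

# Two sources near the target estimate, one far from it (hypothetical values).
w = adaptive_source_weights([0.10, 0.12, 0.50], target_estimate=0.11)
assert w[2] == w.min()  # the most biased source is downweighted
```

Any rule with this shape preserves the target-population interpretation while still borrowing strength from approximately unbiased external sources.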


Game Developers Are Getting Fed Up With Their Bosses' AI Initiatives

WIRED

The video game industry has been in a troubled place for the past year, with studio closures and job security at the forefront of developer concerns. Increasing layoffs with seemingly no end paint a bleak picture for devs, while companies are busy pumping money into AI initiatives. According to a new report from the organizers of the Game Developers Conference, 52 percent of devs surveyed said they worked at companies that were using generative AI on their games. Of the 3,000 people surveyed, roughly half said they were concerned about the technology's impact on the industry and an increasing number reported they felt negatively about AI overall. The "State of the Game Industry" report, released Tuesday, is one of a series of surveys conducted each year by GDC organizers prior to their annual conference.


FED: Fast and Efficient Dataset Deduplication Framework with GPU Acceleration

Son, Youngjun, Kim, Chaewon, Lee, Jaejin

arXiv.org Artificial Intelligence

Dataset deduplication plays a crucial role in enhancing data quality, ultimately improving training performance and efficiency of LLMs. A commonly used method for data deduplication is the MinHash LSH algorithm. Recently, NVIDIA introduced a GPU-based MinHash LSH deduplication method, but it remains suboptimal, leaving room for further improvement in processing efficiency. This paper proposes a GPU-accelerated deduplication framework, FED, that optimizes MinHash LSH for GPU clusters and leverages computationally efficient and partially reusable non-cryptographic hash functions. FED significantly outperforms the CPU-based deduplication tool included in SlimPajama by up to 58.3 times and the GPU-based deduplication tool included in NVIDIA NeMo Curator by up to 8.6 times when processing 1 million documents on a single node with four GPUs. Deduplication of 1.2 trillion tokens is completed in just 5.1 hours in a four-node, 16-GPU environment. The related code is publicly available on GitHub (https://github.com/mcrl/FED).
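For readers unfamiliar with the MinHash LSH pipeline that FED accelerates, here is a compact CPU-only sketch of the standard technique: per-document MinHash signatures, split into bands, with documents sharing any band bucket flagged as duplicate candidates. This is the textbook algorithm, not FED's GPU implementation, and md5 stands in for the fast non-cryptographic hashes the paper uses:

```python
import hashlib

def minhash_signature(shingles, num_perm=8):
    """One min-hash per seeded hash function over the document's shingles."""
    return [min(int(hashlib.md5(f"{seed}:{s}".encode()).hexdigest(), 16)
                for s in shingles)
            for seed in range(num_perm)]

def lsh_candidates(docs, num_perm=8, bands=4):
    """Bucket signatures by band; documents sharing a bucket are candidates."""
    rows = num_perm // bands
    buckets, pairs = {}, set()
    for name, shingles in docs.items():
        sig = minhash_signature(shingles, num_perm)
        for b in range(bands):
            key = (b, tuple(sig[b * rows:(b + 1) * rows]))
            for other in buckets.setdefault(key, []):
                pairs.add(tuple(sorted((name, other))))
            buckets[key].append(name)
    return pairs

docs = {
    "a": {"the", "cat", "sat", "mat"},
    "b": {"the", "cat", "sat", "mat"},   # exact duplicate of "a"
    "c": {"completely", "different", "words"},
}
assert ("a", "b") in lsh_candidates(docs)
```

FED's reported speedups come from restructuring exactly these two stages (signature computation and band bucketing) for GPU clusters and from reusing partial hash computations across permutations.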


FMPAF: How Do Fed Chairs Affect the Financial Market? A Fine-grained Monetary Policy Analysis Framework on Their Language

Deng, Yayue, Xu, Mohan, Tang, Yao

arXiv.org Artificial Intelligence

The effectiveness of central bank communication is a crucial aspect of monetary policy transmission. While recent research has examined the influence of policy communication by the chairs of the Federal Reserve on various financial variables, much of the literature relies on rule-based or dictionary-based methods in parsing the language of the chairs, leaving nuanced information about policy stance contained in nonverbal emotion out of the analysis. In the current study, we propose the Fine-Grained Monetary Policy Analysis Framework (FMPAF), a novel approach that integrates large language models (LLMs) with regression analysis to provide a comprehensive analysis of the impact of the press-conference communications of chairs of the Federal Reserve on financial markets. We conduct extensive comparisons of model performance under different levels of granularity, modalities, and communication scenarios. Based on our preferred specification, a one-unit increase in the sentiment score is associated with an increase of approximately 500 basis points in the price of the S&P 500 Exchange-Traded Fund and a 15-basis-point decrease in the policy interest rate, while not leading to a significant response in exchange rates.
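The regression step the framework pairs with LLM-extracted sentiment can be illustrated with a minimal OLS example. The data below are synthetic with a coefficient of 5.0 planted by construction; this shows only the generic estimation step, not FMPAF's preferred specification:

```python
import numpy as np

# Synthetic example: regress a market outcome on an LLM-derived
# sentiment score. The true slope is 5.0 by construction.
rng = np.random.default_rng(0)
sentiment = rng.normal(size=200)
outcome = 5.0 * sentiment + rng.normal(scale=0.1, size=200)

X = np.column_stack([np.ones_like(sentiment), sentiment])
beta, *_ = np.linalg.lstsq(X, outcome, rcond=None)
assert abs(beta[1] - 5.0) < 0.1  # OLS recovers the sentiment effect
```

In the paper's setting, `outcome` would be a market variable such as the ETF price change around the press conference and `sentiment` the model's score for the chair's communication.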


Analysis of the Fed's communication by using textual entailment model of Zero-Shot classification

Nakayama, Yasuhiro, Sawaki, Tomochika

arXiv.org Artificial Intelligence

As the Fed's communications have a broad and significant impact on financial market trends, the pricing of risky assets, and spillovers to the real economy, market participants try to better understand changes in the future monetary policy outlook of central banks. In particular, the monetary policy of the central bank of the United States (the Federal Reserve System, hereinafter the Fed) is positioned as the most important because it influences the movement of the dollar, the key currency. One of the means by which central banks engage in dialogue with the market and conduct smooth policy management is the publication of various documents, including the statements and minutes issued after policy meetings and transcripts of speeches and congressional testimony given by senior officials. The Federal Open Market Committee (FOMC), the meeting at which U.S. monetary policy is formulated, convenes eight times a year with the members of the Federal Reserve Board (FRB) and the presidents of the regional Fed banks. The statement is a relatively short document of about two pages that summarizes current economic perceptions, the monetary policy determined, and the names of the voters. The transcript of the press conference consists of remarks read by the chairperson at the beginning of the conference as well as questions and answers with reporters, and is approximately 20-30 pages in volume; in some cases it records information that is not included in the statement but is of interest to market participants (specific details and future prospects). The minutes are a document that confirms the content of the economic analysis reported by the Fed economists, the process of discussion that led to the policy decision, and the variation of opinion among the members, and the volume is around 10-20 pages. Outside of the FOMC meetings, transcripts of speeches and interviews by FOMC participants (Fed officials) and statements in congressional testimony are also released.


Should Bank Stress Tests Be Fair?

Glasserman, Paul, Li, Mike

arXiv.org Artificial Intelligence

Regulatory stress tests have become one of the main tools for setting capital requirements at the largest U.S. banks. The Federal Reserve uses confidential models to evaluate bank-specific outcomes for bank-specific portfolios in shared stress scenarios. As a matter of policy, the same models are used for all banks, despite considerable heterogeneity across institutions; individual banks have contended that some models are not suited to their businesses. Motivated by this debate, we ask, what is a fair aggregation of individually tailored models into a common model? We argue that simply pooling data across banks treats banks equally but is subject to two deficiencies: it may distort the impact of legitimate portfolio features, and it is vulnerable to implicit misdirection of legitimate information to infer bank identity. We compare various notions of regression fairness to address these deficiencies, considering both forecast accuracy and equal treatment. In the setting of linear models, we argue for estimating and then discarding centered bank fixed effects as preferable to simply ignoring differences across banks. We present evidence that the overall impact can be material. We also discuss extensions to nonlinear models.
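The linear-model recommendation in the abstract, estimating and then discarding centered bank fixed effects rather than ignoring bank differences, can be sketched concretely. The data, dimensions, and coefficient values below are hypothetical; only the estimate-center-discard pattern reflects the approach described:

```python
import numpy as np

# Synthetic panel: 3 banks, one portfolio feature x, bank-level intercepts.
rng = np.random.default_rng(1)
n_banks, n_obs = 3, 60
bank = rng.integers(0, n_banks, size=n_obs)
x = rng.normal(size=n_obs)
y = 2.0 * x + np.array([0.5, -0.2, 1.0])[bank] + rng.normal(scale=0.05, size=n_obs)

# Design: one dummy per bank (absorbing the intercept) plus the feature.
D = np.zeros((n_obs, n_banks))
D[np.arange(n_obs), bank] = 1.0
X = np.column_stack([D, x])
coef, *_ = np.linalg.lstsq(X, y, rcond=None)

fe = coef[:n_banks]
common_intercept = fe.mean()          # keep the average level across banks
centered_fe = fe - common_intercept   # estimated, then discarded
y_common = common_intercept + coef[-1] * x  # common-model prediction

assert abs(centered_fe.sum()) < 1e-8  # centered effects sum to zero
assert abs(coef[-1] - 2.0) < 0.1      # feature slope is estimated cleanly
```

The point of estimating the fixed effects before discarding them is that the feature coefficient is purged of bank-identity confounding, whereas pooling the data without dummies would let bank-level differences distort the impact of legitimate portfolio features.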


This $6 trillion problem threatens to push inflation even higher

FOX News

Stanford Graduate School of Business lecturer Dave Dodson claims the Biden admin's handling of the economy is to 'tinker' with it 'like it's a video game' on 'Your World with Neil Cavuto.' Following the 2008 global financial crisis, the Federal Reserve created trillions of dollars to ease financial conditions and keep banks afloat. Many economists predicted record inflation would result. But Fed Chairman Ben Bernanke pulled an ace out of his sleeve. He paid banks to park much of that money at the Fed and limit its inflationary effects.