zaidi
Generalization Guarantees for Multi-View Representation Learning and Application to Regularization via Gaussian Product Mixture Prior
Sefidgaran, Milad, Zaidi, Abdellatif, Krasnowski, Piotr
We study the problem of distributed multi-view representation learning. In this problem, each of $K$ agents observes a distinct, possibly statistically correlated, view and independently extracts from it a suitable representation, such that a decoder that receives all $K$ representations correctly estimates the hidden label. In the absence of any explicit coordination between the agents, a central question is: what should each agent extract from its view that is necessary and sufficient for a correct estimation at the decoder? In this paper, we investigate this question from a generalization error perspective. First, we establish several generalization bounds in terms of the relative entropy between the distribution of the representations extracted from training and "test" datasets and a data-dependent symmetric prior, i.e., the Minimum Description Length (MDL) of the latent variables for all views and for both training and test datasets. Then, we use the obtained bounds to devise a regularizer, and we investigate in depth the question of selecting a suitable prior. In particular, we show, and illustrate through experiments, that our data-dependent Gaussian mixture priors with judiciously chosen weights lead to good performance. For single-view settings (i.e., $K=1$), our approach is shown experimentally to outperform the existing Variational Information Bottleneck (VIB) and Category-Dependent VIB (CDVIB) approaches. Interestingly, we show that a weighted attention mechanism emerges naturally in this setting. Finally, for the multi-view setting, we show that selecting the joint prior as a Gaussian product mixture induces a Gaussian mixture marginal prior for each view and implicitly encourages the agents to extract and output redundant features, a finding which is somewhat counter-intuitive.
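The abstract above describes a regularizer built from the relative entropy between the representation distribution and a Gaussian mixture prior. The relative entropy from a single Gaussian to a mixture has no closed form, so the sketch below uses a standard variational upper bound, $\mathrm{KL}(q \,\|\, \sum_m w_m p_m) \le -\log \sum_m w_m e^{-\mathrm{KL}(q \| p_m)}$, as a stand-in. This is an assumption made for illustration and is not taken from the paper. The function names, the diagonal-covariance restriction, and the fixed weights are all hypothetical.

```python
import math


def kl_diag_gauss(mu_q, var_q, mu_p, var_p):
    """KL( N(mu_q, diag(var_q)) || N(mu_p, diag(var_p)) ) in nats."""
    return 0.5 * sum(
        math.log(vp / vq) + (vq + (mq - mp) ** 2) / vp - 1.0
        for mq, vq, mp, vp in zip(mu_q, var_q, mu_p, var_p)
    )


def kl_to_mixture_upper_bound(mu_q, var_q, weights, means, variances):
    """Variational upper bound on KL(q || sum_m w_m p_m).

    Returns the bound and the softmax-style component weights that
    achieve it (hypothetical helper, not the paper's regularizer).
    """
    kls = [kl_diag_gauss(mu_q, var_q, m, v) for m, v in zip(means, variances)]
    # Numerically stable log-sum-exp of (log w_m - KL_m).
    scores = [math.log(w) - k for w, k in zip(weights, kls)]
    top = max(scores)
    lse = top + math.log(sum(math.exp(s - top) for s in scores))
    responsibilities = [math.exp(s - lse) for s in scores]
    return -lse, responsibilities


if __name__ == "__main__":
    bound, resp = kl_to_mixture_upper_bound(
        mu_q=[0.0], var_q=[1.0],
        weights=[0.5, 0.5],
        means=[[0.0], [10.0]],
        variances=[[1.0], [1.0]],
    )
    print(bound, resp)
```

The returned component weights are a softmax over `log w_m - KL_m`, so components close to the encoder output dominate the bound. Whether this coincides with the weighted attention mechanism mentioned in the abstract is not claimed here.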
Large Language Model Driven Agents for Simulating Echo Chamber Formation
Gu, Chenhao, Luo, Ling, Zaidi, Zainab Razia, Karunasekera, Shanika
The rise of echo chambers on social media platforms has heightened concerns about polarization and the reinforcement of existing beliefs. Traditional approaches for simulating echo chamber formation have often relied on predefined rules and numerical simulations, which, while insightful, may lack the nuance needed to capture complex, real-world interactions. In this paper, we present a novel framework that leverages large language models (LLMs) as generative agents to simulate echo chamber dynamics within social networks. The novelty of our approach is that it incorporates both opinion updates and network rewiring behaviors driven by LLMs, allowing for a context-aware and semantically rich simulation of social interactions. Additionally, we utilize real-world Twitter (now X) data to benchmark the LLM-based simulation against actual social media behaviors, providing insights into the accuracy and realism of the generated opinion trends. Our results demonstrate the efficacy of LLMs in modeling echo chamber formation, capturing both structural and semantic dimensions of opinion clustering.
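For contrast with the LLM-driven framework, the sketch below shows the kind of rule-based numerical simulation the abstract describes as the traditional approach: a bounded-confidence opinion update combined with edge rewiring. It is not the paper's method. The function name, the threshold `eps`, the step size `mu`, and the network sizes are hypothetical choices made only for illustration.

```python
import random


def simulate_echo_chambers(num_agents=30, num_edges=60, steps=2000,
                           eps=0.4, mu=0.3, seed=0):
    """Rule-based baseline: bounded-confidence updates plus rewiring.

    Opinions live in [-1, 1]. Edges are undirected and stored as
    (smaller_index, larger_index) pairs.
    """
    rng = random.Random(seed)
    opinions = [rng.uniform(-1.0, 1.0) for _ in range(num_agents)]

    # Build a random undirected graph with exactly num_edges edges.
    edges = set()
    while len(edges) < num_edges:
        a, b = rng.sample(range(num_agents), 2)
        edges.add((min(a, b), max(a, b)))

    for _ in range(steps):
        a, b = rng.choice(sorted(edges))
        if abs(opinions[a] - opinions[b]) <= eps:
            # Opinion update: both endpoints move toward each other.
            shift = mu * (opinions[b] - opinions[a])
            opinions[a] += shift
            opinions[b] -= shift
        else:
            # Rewiring: agent a drops b and links to a like-minded agent.
            candidates = [
                c for c in range(num_agents)
                if c != a
                and abs(opinions[c] - opinions[a]) <= eps
                and (min(a, c), max(a, c)) not in edges
            ]
            if candidates:
                c = rng.choice(candidates)
                edges.remove((a, b))
                edges.add((min(a, c), max(a, c)))
    return opinions, edges


if __name__ == "__main__":
    final_opinions, final_edges = simulate_echo_chambers()
    print(sorted(round(o, 2) for o in final_opinions))
```

In an LLM-driven variant of the kind the abstract describes, the two hard-coded branches (update and rewire) would be replaced by decisions generated by a language model from the content of the agents' posts. That part is not sketched here.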
Federated Learning: You May Communicate Less Often!
Sefidgaran, Milad, Chor, Romain, Zaidi, Abdellatif, Wan, Yijun
We investigate the generalization error of statistical learning models in a Federated Learning (FL) setting. Specifically, we study the evolution of the generalization error with the number of communication rounds between the clients and the parameter server, i.e., the effect on the generalization error of how often the local models computed by the clients are aggregated at the parameter server. We establish PAC-Bayes and rate-distortion theoretic bounds on the generalization error that account explicitly for the effect of the number of rounds, say $R \in \mathbb{N}$, in addition to the number of participating devices $K$ and the individual dataset size $n$. The bounds, which apply in full generality to a large class of loss functions and learning algorithms, appear to be the first of their kind for the FL setting. Furthermore, we apply our bounds to FL-type Support Vector Machines (FSVM), and we derive more explicit bounds on the generalization error in this case. In particular, we show that the generalization error of FSVM increases with $R$, suggesting that more frequent communication with the parameter server diminishes the generalization power of such learning algorithms. Combined with the fact that the empirical risk generally decreases for larger values of $R$, this indicates that $R$ might be a parameter to optimize in order to minimize the population risk of FL algorithms. Moreover, specialized to the case $R=1$ (sometimes referred to as "one-shot" FL or distributed learning), our bounds suggest that the generalization error of the FL setting decreases faster than that of centralized learning by a factor of $\mathcal{O}(\sqrt{\log(K)/K})$, thereby generalizing recent findings in this direction to arbitrary loss functions and algorithms. The results of this paper are also validated by experiments.
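To make the roles of $K$, $n$, and $R$ in the abstract above concrete, here is a minimal sketch of a generic local-update-then-average loop on a one-dimensional least-squares problem. It is not the paper's algorithm or experimental setup. The function names, the synthetic data, the learning rate, and the choice to split a fixed budget of local steps evenly across rounds are all assumptions made for illustration.

```python
import random


def make_clients(K, n, true_w=2.0, noise=0.1, seed=0):
    """Synthetic data: K clients, each with n pairs (x, y = true_w*x + noise)."""
    rng = random.Random(seed)
    clients = []
    for _ in range(K):
        data = []
        for _ in range(n):
            x = rng.uniform(-1.0, 1.0)
            data.append((x, true_w * x + rng.gauss(0.0, noise)))
        clients.append(data)
    return clients


def local_steps(w, data, steps, lr):
    """Run full-batch gradient steps on one client's squared loss."""
    for _ in range(steps):
        grad = sum(2.0 * (w * x - y) * x for x, y in data) / len(data)
        w -= lr * grad
    return w


def federated_average(client_data, rounds, total_local_steps, lr=0.5):
    """Average the clients' local models once per round, for `rounds` rounds.

    The total number of local steps is held fixed, so a larger `rounds`
    means more frequent aggregation, not more computation.
    """
    w_global = 0.0
    steps_per_round = total_local_steps // rounds
    for _ in range(rounds):
        local_models = [
            local_steps(w_global, data, steps_per_round, lr)
            for data in client_data
        ]
        w_global = sum(local_models) / len(local_models)
    return w_global


if __name__ == "__main__":
    clients = make_clients(K=5, n=20)
    for R in (1, 10):
        print(R, federated_average(clients, rounds=R, total_local_steps=200))
```

Setting `rounds=1` corresponds to the "one-shot" case mentioned in the abstract, where each client trains on its own data and the models are averaged once. Comparing training and held-out error across values of `rounds` would be the natural next step, and it is left out here.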
Massive "coffeetech" investments are giving startups a jolt
On the side of a dusty Jakarta road, nestled in the corner of a gas station, might be the future of coffee. Driving home, you might preorder a cup as you plan to refill the car, even before pulling up to the gas station. Open the Kopi Kenangan app, click on "preorder," and choose one -- perhaps a mellow coffee with just a hint of acidity, creamy with milk. By the time you arrive at the station, the iced coffee, made by a human, will be sitting on the countertop, sweating in a plastic cup in the Jakarta humidity. Indonesian coffee chain Kopi Kenangan -- "coffee memories" in Bahasa Indonesia -- did not set out to be a tech-powered coffee chain.
In-house training lets Accelirate grow
At Accelirate, an automation startup, few newcomers to the IT staff claim to be experts in critical areas like robotic process automation (RPA) or machine learning, but everyone has the chance to become one. The Edison, N.J.-based company, which was launched last year to assist companies on the automation track, is now up to 120 employees, 90% residing in IT, and it has debuted on Computerworld's annual Best Places to Work in IT list as the No. 11 small organization. Since RPA and related technologies are treading new ground, Accelirate found itself facing a dearth of expert talent, which could put a damper on its plan for fast-paced growth. The solution: building an in-house, three-month training program that gets all new IT hires, both first-time job holders and seasoned veterans, quickly up to speed. "Not too many people have prior experience with the platforms or technologies we were working with -- finding someone who'd done RPA before was few and far between," says Ahmed Zaidi, Accelirate's chief automation officer.