Ceritli, Taha
Unlocking the Value of Decentralized Data: A Federated Dual Learning Approach for Model Aggregation
Zhu, Junyi, Yao, Ruicong, Ceritli, Taha, Ozkan, Savas, Blaschko, Matthew B., Noh, Eunchung, Min, Jeongwon, Min, Cho Jung, Ozay, Mete
Artificial Intelligence (AI) technologies have revolutionized numerous fields, yet their applications often rely on costly and time-consuming data collection processes. Federated Learning (FL) offers a promising alternative by enabling AI models to be trained on decentralized data, where data is scattered across clients (distributed nodes). However, existing FL approaches struggle to match the performance of centralized training due to challenges such as heterogeneous data distribution and communication delays, limiting their potential for breakthroughs. We observe that many real-world use cases involve hybrid data regimes, in which a server (center node) has access to some data while a large amount of data is distributed across associated clients. To improve the utilization of decentralized data under this regime, address the data heterogeneity issue, and facilitate asynchronous communication between the server and clients, we propose a dual learning approach that leverages centralized data at the server to guide the merging of model updates from clients. Our method accommodates scenarios where server data is out-of-domain relative to decentralized client data, making it applicable to a wide range of use cases. We provide theoretical analysis demonstrating the faster convergence of our method compared to existing methods. Furthermore, experimental results across various scenarios show that our approach significantly outperforms existing technologies, highlighting its potential to unlock the value of large amounts of decentralized data.
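The core idea of server-guided merging can be sketched with a toy linear model: the server scores each client's update on its own data and weights the merge accordingly. The function names (server_loss, local_update, merge_updates) and the softmax weighting scheme are assumptions made for illustration; this is not the paper's exact algorithm.

```python
import numpy as np

rng = np.random.default_rng(0)

def server_loss(w, X, y):
    """Mean squared error of a linear model on the server's data."""
    return float(np.mean((X @ w - y) ** 2))

def local_update(w, X, y, lr=0.1, steps=5):
    """A few steps of gradient descent on one client's local data."""
    w = w.copy()
    for _ in range(steps):
        w -= lr * 2 * X.T @ (X @ w - y) / len(y)
    return w

def merge_updates(w_global, client_ws, X_s, y_s):
    """Weight each client's model by the loss reduction it achieves on server data."""
    gains = np.array([server_loss(w_global, X_s, y_s) - server_loss(w, X_s, y_s)
                      for w in client_ws])
    weights = np.exp(gains - gains.max())  # softmax over loss reductions
    weights /= weights.sum()
    return sum(wt * w for wt, w in zip(weights, client_ws))

# Synthetic hybrid regime: one shared linear target, server data plus clients.
w_true = np.array([1.0, -2.0])
def make_data(n):
    X = rng.normal(size=(n, 2))
    return X, X @ w_true + 0.1 * rng.normal(size=n)

X_server, y_server = make_data(50)
clients = [make_data(30) for _ in range(4)]

w = np.zeros(2)
for _ in range(10):
    updates = [local_update(w, Xc, yc) for Xc, yc in clients]
    w = merge_updates(w, updates, X_server, y_server)
```

Because the merge weights depend only on server-side evaluations, the server can in principle fold in client updates as they arrive, which is what makes this style of merging compatible with asynchronous communication.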
Randomized Asymmetric Chain of LoRA: The First Meaningful Theoretical Framework for Low-Rank Adaptation
Malinovsky, Grigory, Michieli, Umberto, Hammoud, Hasan Abed Al Kader, Ceritli, Taha, Elesedy, Hayder, Ozay, Mete, Richtárik, Peter
Fine-tuning has become a popular approach to adapting large foundational models to specific tasks. As the size of models and datasets grows, parameter-efficient fine-tuning techniques are increasingly important. One of the most widely used methods is Low-Rank Adaptation (LoRA), with the adaptation update expressed as the product of two low-rank matrices. While LoRA was shown to possess strong performance in fine-tuning, it often underperforms when compared to full-parameter fine-tuning (FPFT). Although many variants of LoRA have been extensively studied empirically, their theoretical optimization analysis is heavily under-explored. The starting point of our work is a demonstration that LoRA and its two extensions, Asymmetric LoRA and Chain of LoRA, indeed encounter convergence issues. To address these issues, we propose Randomized Asymmetric Chain of LoRA (RAC-LoRA) -- a general optimization framework that rigorously analyzes the convergence rates of LoRA-based methods. Our approach inherits the empirical benefits of LoRA-style heuristics, but introduces several small but important algorithmic modifications which turn it into a provably convergent method. Our framework serves as a bridge between FPFT and low-rank adaptation. We provide provable guarantees of convergence to the same solution as FPFT, along with the rate of convergence. Additionally, we present a convergence analysis for smooth, non-convex loss functions, covering gradient descent, stochastic gradient descent, and federated learning settings. Our theoretical findings are supported by experimental results.
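The chaining idea can be illustrated on a linear least-squares objective: at each step, sample a fresh random frozen factor B, solve for the trainable factor A (here available in closed form), and fold the rank-r product B @ A back into the weights before resampling. This is a sketch of the randomized chain of low-rank updates in the spirit of RAC-LoRA, not the paper's exact algorithm, and all variable names are illustrative.

```python
import numpy as np

rng = np.random.default_rng(1)
n, d, r = 100, 8, 2

W_star = rng.normal(size=(d, d))   # target weights
X = rng.normal(size=(n, d))
Y = X @ W_star.T                   # noiseless regression targets

def loss(W):
    return float(np.mean((X @ W.T - Y) ** 2))

W = np.zeros((d, d))
for _ in range(60):
    B = rng.normal(size=(d, r))    # freshly sampled, frozen low-rank factor
    R = Y - X @ W.T                # current residual
    # Closed-form minimizer of ||X (W + B @ A).T - Y||_F^2 over A (Z = A.T):
    Z = np.linalg.solve(X.T @ X, X.T @ R) @ B @ np.linalg.inv(B.T @ B)
    W = W + B @ Z.T                # merge the rank-r update into the weights
```

Each merge projects the remaining weight error onto the random column span of B, so no single step can overshoot, yet chaining many random rank-r steps drives the loss toward the full-parameter solution, which is the bridge between low-rank adaptation and FPFT that the framework formalizes.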
Synthesizing Mixed-type Electronic Health Records using Diffusion Models
Ceritli, Taha, Ghosheh, Ghadeer O., Chauhan, Vinod Kumar, Zhu, Tingting, Creagh, Andrew P., Clifton, David A.
Electronic Health Records (EHRs) contain sensitive patient information, which presents privacy concerns when sharing such data. Synthetic data generation is a promising solution to mitigate these risks, often relying on deep generative models such as Generative Adversarial Networks (GANs). However, recent studies have shown that diffusion models offer several advantages over GANs, such as the generation of more realistic synthetic data and more stable training across data modalities including images, text, and sound. In this work, we investigate the potential of diffusion models for generating realistic mixed-type tabular EHRs, comparing the TabDDPM model with existing methods on four datasets in terms of data quality, utility, privacy, and augmentation. Our experiments demonstrate that TabDDPM outperforms the state-of-the-art models across all evaluation metrics except privacy, confirming the trade-off between privacy and utility.
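For numerical columns, DDPM-style models such as TabDDPM rest on a Gaussian forward (noising) process; categorical columns are handled by a multinomial diffusion not shown here. The sketch below illustrates only this forward process on a toy standardized column, with an illustrative noise schedule; the variable names and schedule values are assumptions, not TabDDPM's configuration.

```python
import numpy as np

rng = np.random.default_rng(2)

T = 1000
betas = np.linspace(1e-4, 0.02, T)       # linear variance schedule
alphas_bar = np.cumprod(1.0 - betas)     # abar_t = prod_s (1 - beta_s)

def q_sample(x0, t):
    """Draw x_t ~ q(x_t | x_0) = N(sqrt(abar_t) * x0, (1 - abar_t) * I)."""
    noise = rng.normal(size=x0.shape)
    return np.sqrt(alphas_bar[t]) * x0 + np.sqrt(1.0 - alphas_bar[t]) * noise

# A toy numerical EHR column (e.g., a standardized lab measurement).
x0 = rng.normal(loc=2.0, scale=0.5, size=10_000)

x_mid = q_sample(x0, 100)    # early step: most of the signal survives
x_end = q_sample(x0, T - 1)  # final step: close to a standard Gaussian
```

A denoising network is then trained to predict the added noise at each step, and synthetic records are produced by running the learned reverse chain starting from pure Gaussian noise.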
Mixture of Input-Output Hidden Markov Models for Heterogeneous Disease Progression Modeling
Ceritli, Taha, Creagh, Andrew P., Clifton, David A.
A particular challenge for disease progression modeling is the heterogeneity of a disease and its manifestations in the patients. Existing approaches often assume the presence of a single set of disease progression characteristics, which is unlikely for neurodegenerative disorders such as Parkinson's disease. In this paper, we propose a hierarchical time-series model that can discover multiple disease progression dynamics.
A practical solution to these problems has been using hidden Markov models (HMMs), which (i) can be trained using small datasets, (ii) can handle missing data in a principled approach and (iii) are interpretable models, e.g., it is possible to relate inferred latent states to particular symptoms. Most existing HMMs (Jackson et al., 2003; Sukkar et al., 2012; Guihenneuc-Jouyaux et al., 2000; Wang et al., 2014; Sun et al., 2019; Severson et al., 2020; 2021), however, assume that each patient follows the same latent state transition dynamics, ignoring the heterogeneity in the disease progression dynamics.
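The value of component-specific transition dynamics can be sketched with a toy example: the same symptom sequence receives different likelihoods under a "slow" and a "fast" progression HMM, scored with the standard scaled forward algorithm. All parameter values are illustrative rather than fitted, and this is a plain discrete HMM, not the paper's input-output variant.

```python
import numpy as np

def hmm_loglik(obs, pi, A, B):
    """Log-likelihood of a discrete observation sequence (scaled forward algorithm)."""
    alpha = pi * B[:, obs[0]]
    ll = np.log(alpha.sum())
    alpha /= alpha.sum()
    for o in obs[1:]:
        alpha = (alpha @ A) * B[:, o]
        ll += np.log(alpha.sum())
        alpha /= alpha.sum()
    return ll

pi = np.array([0.9, 0.1])                      # start mostly in state 0 ("mild")
B = np.array([[0.8, 0.2],                      # P(symptom | mild state)
              [0.2, 0.8]])                     # P(symptom | severe state)
A_slow = np.array([[0.95, 0.05], [0.0, 1.0]])  # slow progression dynamics
A_fast = np.array([[0.50, 0.50], [0.0, 1.0]])  # fast progression dynamics

slow_patient = [0, 0, 0, 0, 0, 0]              # mild symptoms throughout
fast_patient = [0, 1, 1, 1, 1, 1]              # severe symptoms almost at once

ll_slow_sp = hmm_loglik(slow_patient, pi, A_slow, B)
ll_fast_sp = hmm_loglik(slow_patient, pi, A_fast, B)
ll_slow_fp = hmm_loglik(fast_patient, pi, A_slow, B)
ll_fast_fp = hmm_loglik(fast_patient, pi, A_fast, B)
```

In a mixture of HMMs, these per-component likelihoods would be combined with component priors via Bayes' rule to assign each patient a responsibility over progression dynamics, with all parameters updated by EM.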