timegan
- Asia > China > Beijing > Beijing (0.04)
- Asia > China > Shanghai > Shanghai (0.04)
- North America > United States > Illinois > Cook County > Chicago (0.04)
Privacy-Preserving Generative Modeling and Clinical Validation of Longitudinal Health Records for Chronic Disease
Ballyk, Benjamin D., Gupta, Ankit, Konda, Sujay, Subramanian, Kavitha, Landon, Chris, Naseer, Ahmed Ammar, Maierhofer, Georg, Swaminathan, Sumanth, Venkateshwaran, Vasudevan
Data privacy is a critical challenge in modern medical workflows as the adoption of electronic patient records has grown rapidly. Stringent data protection regulations limit access to clinical records for training and integrating machine learning models that have shown promise in improving diagnostic accuracy and personalized care outcomes. Synthetic data offers a promising alternative; however, current generative models either struggle with time-series data or lack formal privacy guarantees. In this paper, we enhance a state-of-the-art time-series generative model to better handle longitudinal clinical data while incorporating quantifiable privacy safeguards. Using real data from chronic kidney disease and ICU patients, we evaluate our method through statistical tests, a Train-on-Synthetic-Test-on-Real (TSTR) setup, and expert clinical review. Our non-private model (Augmented TimeGAN) outperforms transformer- and flow-based models on statistical metrics in several datasets, while our private model (DP-TimeGAN) maintains a mean authenticity of 0.778 on the CKD dataset, outperforming existing state-of-the-art models on the privacy-utility frontier. Both models achieve performance comparable to real data in clinician evaluations, providing robust input data necessary for developing models for complex chronic conditions without compromising data privacy.
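The Train-on-Synthetic-Test-on-Real (TSTR) setup mentioned in the abstract can be illustrated with a minimal sketch. Everything below is an assumption made for illustration and is not the authors' pipeline: the nearest-centroid classifier, the `toy_sequences` data, and the slightly shifted "synthetic" set that stands in for generator output are all hypothetical. The idea is only that a model is fitted on synthetic data, scored on held-out real data, and compared with the same model fitted on real data.

```python
import random


def nearest_centroid_fit(X, y):
    """Return one mean vector per class (a deliberately tiny stand-in classifier)."""
    sums, counts = {}, {}
    for row, label in zip(X, y):
        acc = sums.setdefault(label, [0.0] * len(row))
        for i, value in enumerate(row):
            acc[i] += value
        counts[label] = counts.get(label, 0) + 1
    return {c: [v / counts[c] for v in sums[c]] for c in sums}


def nearest_centroid_predict(centroids, X):
    """Assign each row to the class whose centroid is closest."""
    def dist2(a, b):
        return sum((p - q) ** 2 for p, q in zip(a, b))
    return [min(centroids, key=lambda c: dist2(centroids[c], row)) for row in X]


def accuracy(y_true, y_pred):
    return sum(t == p for t, p in zip(y_true, y_pred)) / len(y_true)


def train_and_score(X_train, y_train, X_test, y_test):
    """Fit on the training set, report accuracy on the test set."""
    model = nearest_centroid_fit(X_train, y_train)
    return accuracy(y_test, nearest_centroid_predict(model, X_test))


def toy_sequences(n, shift, rng, length=8):
    """Hypothetical toy data: class 1 sequences have mean `shift`, class 0 mean 0."""
    X, y = [], []
    for k in range(n):
        label = k % 2
        X.append([rng.gauss(label * shift, 1.0) for _ in range(length)])
        y.append(label)
    return X, y


rng = random.Random(0)
X_real_train, y_real_train = toy_sequences(200, 2.0, rng)
X_real_test, y_real_test = toy_sequences(200, 2.0, rng)
# Stand-in for generator output: same task, slightly different distribution.
X_syn, y_syn = toy_sequences(200, 1.8, rng)

tstr = train_and_score(X_syn, y_syn, X_real_test, y_real_test)            # train on synthetic
trtr = train_and_score(X_real_train, y_real_train, X_real_test, y_real_test)  # train on real
print(f"TSTR accuracy: {tstr:.3f}  TRTR accuracy: {trtr:.3f}")
```

A small gap between the two scores suggests the synthetic data preserves the signal the downstream task needs; it says nothing about privacy, which has to be assessed separately.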
- North America > United States > California > Los Angeles County > Los Angeles (0.28)
- Europe > United Kingdom > England > Oxfordshire > Oxford (0.14)
- Europe > United Kingdom > England > Cambridgeshire > Cambridge (0.14)
- Research Report > Experimental Study (0.66)
- Research Report > New Finding (0.46)
- Information Technology > Security & Privacy (1.00)
- Information Technology > Artificial Intelligence > Representation & Reasoning (1.00)
- Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (1.00)
- Information Technology > Data Science > Data Mining > Big Data (0.64)
- North America > United States > California > Los Angeles County > Los Angeles (0.14)
- Europe > United Kingdom > England > Cambridgeshire > Cambridge (0.14)
- North America > Canada (0.04)
- Asia > China > Beijing > Beijing (0.04)
- Asia > China > Shanghai > Shanghai (0.04)
- North America > United States > Illinois > Cook County > Chicago (0.04)
Fiaingen: A financial time series generative method matching real-world data quality
Rožanec, Jože M., Žezlin, Tina, Vasiliu, Laurentiu, Mladenić, Dunja, Prodan, Radu, Roman, Dumitru
Data is vital in enabling machine learning models to advance research and practical applications in finance, where accurate and robust models are essential for investment and trading decision-making. However, real-world data is limited despite its quantity, quality, and variety. The data shortage of various financial assets directly hinders the performance of machine learning models designed to trade and invest in these assets. Generative methods can mitigate this shortage. In this paper, we introduce a set of novel techniques for time series data generation (we name them Fiaingen) and assess their performance across three criteria: (a) overlap of real-world and synthetic data on a reduced dimensionality space, (b) performance on downstream machine learning tasks, and (c) runtime performance. Our experiments demonstrate that the methods achieve state-of-the-art performance across the three criteria listed above. Synthetic data generated with Fiaingen methods more closely mirrors the original time series data while keeping data generation time on the order of seconds, ensuring the scalability of the proposed approach. Furthermore, models trained on it achieve performance close to that of models trained with real-world data.
- Europe > Slovenia > Central Slovenia > Municipality of Ljubljana > Ljubljana (0.04)
- Europe > Austria > Tyrol > Innsbruck (0.04)
- North America > United States > New York > New York County > New York City (0.04)
- Banking & Finance > Trading (1.00)
- Information Technology (0.93)
- North America > United States > California > Los Angeles County > Los Angeles (0.14)
- Europe > United Kingdom > England > Cambridgeshire > Cambridge (0.14)
- North America > Canada > British Columbia > Metro Vancouver Regional District > Vancouver (0.04)
Generative Models for Long Time Series: Approximately Equivariant Recurrent Network Structures for an Adjusted Training Scheme
Fulek, Ruwen, Lange-Hegermann, Markus
We present a simple yet effective generative model for time series data based on a Variational Autoencoder (VAE) with recurrent layers, referred to as the Recurrent Variational Autoencoder with Subsequent Training (RVAE-ST). Our method introduces an adapted training scheme that progressively increases the sequence length, addressing the challenge recurrent layers typically face when modeling long sequences. By leveraging the recurrent architecture, the model maintains a constant number of parameters regardless of sequence length. This design encourages approximate time-shift equivariance and enables efficient modeling of long-range temporal dependencies. Rather than introducing a fundamentally new architecture, we show that a carefully composed combination of known components can match or outperform state-of-the-art generative models on several benchmark datasets. Our model performs particularly well on time series that exhibit quasi-periodic structure, while remaining competitive on datasets with more irregular or partially non-stationary behavior. We evaluate its performance using ELBO, Fréchet Distance, discriminative scores, and visualizations of the learned embeddings.
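The training scheme described above, in which the sequence length grows progressively, can be sketched as a simple curriculum loop. This is only a sketch under stated assumptions: the abstract does not give the schedule, so the geometric growth factor, the non-overlapping windows, and the `train_stage` callback (a placeholder for one training phase of the recurrent model) are all hypothetical. Because a recurrent model has the same parameters at every sequence length, the same model object could be passed through each stage.

```python
def length_schedule(start, final, factor=2):
    """Yield training sequence lengths growing from `start` up to `final`.

    The geometric growth is an assumption; the source only states that the
    length increases progressively.
    """
    if start <= 0 or final < start or factor <= 1:
        raise ValueError("need 0 < start <= final and factor > 1")
    length = start
    while length < final:
        yield length
        length = min(final, length * factor)
    yield final


def windows(series, length, stride=None):
    """Cut a long series into windows of `length` (non-overlapping by default)."""
    stride = stride or length
    return [series[i:i + length]
            for i in range(0, len(series) - length + 1, stride)]


def train_progressively(series, start, final, train_stage):
    """Run one training stage per sequence length, shortest first.

    `train_stage` is a hypothetical callback that would update the model
    on the given batch of windows.
    """
    history = []
    for length in length_schedule(start, final):
        batch = windows(series, length)
        train_stage(batch)
        history.append((length, len(batch)))
    return history


series = list(range(1000))  # stand-in for a long time series
history = train_progressively(series, start=50, final=400,
                              train_stage=lambda batch: None)
print(history)
```

Each stage sees fewer but longer windows, so the model first learns short-range structure before being asked to carry state over long horizons.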
- Research Report > New Finding (1.00)
- Research Report > Promising Solution (0.67)
- Research Report > Experimental Study > Negative Result (0.46)
A synthetic dataset of French electric load curves with temperature conditioning
Nabil, Tahar, Agoua, Ghislain, Cauchois, Pierre, De Moliner, Anne, Grossin, Benoît
The ongoing energy transition is causing behavioral changes in electricity use, e.g. with self-consumption of local generation, or flexibility services for demand control. To better understand these changes and the challenges they induce, accessing individual smart meter data is crucial. Yet this is personal data under the European GDPR. Widespread use of such data thus requires creating realistic, privacy-preserving synthetic samples. This paper introduces a new synthetic load curve dataset generated by conditional latent diffusion. We also provide the contracted power, time-of-use plan and local temperature used for generation. Fidelity, utility and privacy of the dataset are thoroughly evaluated, demonstrating its good quality and thereby supporting its interest for energy modeling applications.
- North America > United States (0.14)
- Europe > France (0.05)
- Oceania > Australia (0.04)
- Europe > Portugal (0.04)
- Information Technology > Security & Privacy (1.00)
- Energy > Power Industry (1.00)
Generating Realistic Synthetic Head Rotation Data for Extended Reality using Deep Learning
Struye, Jakob, Lemic, Filip, Famaey, Jeroen
Extended Reality is a revolutionary method of delivering multimedia content to users. A large contributor to its popularity is the sense of immersion and interactivity enabled by having real-world motion reflected in the virtual experience accurately and immediately. This user motion, mainly caused by head rotations, induces several technical challenges. For instance, which content is generated and transmitted depends heavily on where the user is looking. Seamless systems, taking user motion into account proactively, will therefore require accurate predictions of upcoming rotations. Training and evaluating such predictors requires vast amounts of orientational input data, which is expensive to gather, as it requires human test subjects. A more feasible approach is to gather a modest dataset through test subjects, and then extend it to a more sizeable set using synthetic data generation methods. In this work, we present a head rotation time series generator based on TimeGAN, an extension of the well-known Generative Adversarial Network, designed specifically for generating time series. This approach is able to extend a dataset of head rotations with new samples closely matching the distribution of the measured time series.
- Europe > Portugal > Lisbon > Lisbon (0.05)
- North America > United States > New York > New York County > New York City (0.05)
- Europe > Belgium > Flanders > Antwerp Province > Antwerp (0.04)
Generative Modeling and Data Augmentation for Power System Production Simulation
As a key component of power system production simulation, load forecasting is critical for the stable operation of power systems. Machine learning methods prevail in this field. However, the limited training data can be a challenge. This paper proposes a generative model-assisted approach for load forecasting under small sample scenarios, consisting of two steps: expanding the dataset using a diffusion-based generative model and then training various machine learning regressors on the augmented dataset to identify the best performer. The expanded dataset significantly reduces forecasting errors compared to the original dataset, and the diffusion model outperforms the generative adversarial model by achieving about 200 times smaller errors and better alignment in latent data distributions.
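The two-step workflow in this abstract (expand a small dataset, then train several regressors and keep the best) can be sketched as follows. The sketch makes strong simplifying assumptions: `jitter_augment` is a hypothetical placeholder that adds Gaussian noise to real samples and is not a diffusion model, the two candidate regressors are toy models, and the data is a made-up linear relation rather than load data. It only illustrates the shape of the augment-then-select loop.

```python
import random


def jitter_augment(X, y, copies, noise, rng):
    """Placeholder generator: noisy copies of real samples (NOT a diffusion model)."""
    X_aug, y_aug = list(X), list(y)
    for _ in range(copies):
        for xi, yi in zip(X, y):
            X_aug.append(xi + rng.gauss(0, noise))
            y_aug.append(yi + rng.gauss(0, noise))
    return X_aug, y_aug


def fit_mean(X, y):
    """Baseline regressor: always predict the training mean."""
    mean_y = sum(y) / len(y)
    return lambda x: mean_y


def fit_linear(X, y):
    """One-dimensional ordinary least squares."""
    n = len(X)
    mean_x, mean_y = sum(X) / n, sum(y) / n
    sxx = sum((x - mean_x) ** 2 for x in X)
    slope = sum((x - mean_x) * (t - mean_y) for x, t in zip(X, y)) / sxx
    return lambda x: mean_y + slope * (x - mean_x)


def mae(model, X, y):
    """Mean absolute error of `model` on (X, y)."""
    return sum(abs(model(x) - t) for x, t in zip(X, y)) / len(y)


def select_best(fitters, X_train, y_train, X_val, y_val):
    """Fit every candidate on the training set and rank by validation MAE."""
    scores = {name: mae(fit(X_train, y_train), X_val, y_val)
              for name, fit in fitters.items()}
    return min(scores, key=scores.get), scores


rng = random.Random(1)
X_small = [float(i) for i in range(8)]                      # small "real" sample
y_small = [3.0 * x + 5.0 + rng.gauss(0, 0.5) for x in X_small]
X_val = [i + 0.5 for i in range(8)]                         # held-out real points
y_val = [3.0 * x + 5.0 for x in X_val]

X_aug, y_aug = jitter_augment(X_small, y_small, copies=5, noise=0.1, rng=rng)
best, scores = select_best({"mean": fit_mean, "linear": fit_linear},
                           X_aug, y_aug, X_val, y_val)
print(best, scores)
```

Validation uses real, non-augmented points so that the selected regressor is judged on data the generator never produced.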
- Asia > China > Guangdong Province > Guangzhou (0.04)
- Europe > Poland > Masovia Province > Warsaw (0.04)
- Energy > Power Industry (1.00)
- Machinery > Industrial Machinery (0.82)