Pseudo-Non-Linear Data Augmentation via Energy Minimization

Oct-1-2024–arXiv.org Artificial Intelligence

We propose a novel and interpretable data augmentation method based on energy-based modeling and principles from information geometry. Unlike black-box generative models, which rely on deep neural networks, our approach replaces these non-interpretable transformations with explicit, theoretically grounded ones, ensuring interpretability and strong guarantees such as energy minimization. Central to our method is the introduction of the backward projection algorithm, which reverses dimension reduction to generate new data. Empirical results demonstrate that our method performs competitively with black-box generative models while offering greater transparency and interpretability.

Data augmentation has advanced significantly in recent years, primarily due to the increasing use of generative models to meet the growing demand for large datasets (Feng et al., 2021; Wong et al., 2016). Despite their success, these generative models often rely on modern deep neural networks, which are typically treated as black boxes, raising concerns about their interpretability (Guidotti et al., 2018). For instance, the popular autoencoder model encodes original data into a compact latent representation and then decodes it back, with both processes usually handled by black-box neural networks (Kingma & Welling, 2022). Consequently, even when these models perform well, the lack of understanding of the underlying transformations makes it difficult to control the generated outputs, forcing researchers to depend heavily on empirical heuristics. A natural approach to developing a more interpretable data augmentation method is to replace black-box transformations with more explicit ones (Rudin, 2019).
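To make the idea of an explicit encode-then-decode transformation concrete, the following is a minimal sketch that generates new samples by reducing data to one dimension, perturbing the low-dimensional code, and mapping it back. It is not the backward projection algorithm described above: it swaps in ordinary linear PCA (computed by power iteration) as a stand-in, and every name in it (`top_direction`, `encode`, `decode`, `augment`, the `noise` parameter) is hypothetical and chosen only for illustration.

```python
import random


def mean_vector(rows):
    """Column-wise mean of a list of equal-length rows."""
    n = len(rows)
    return [sum(col) / n for col in zip(*rows)]


def top_direction(rows, mu, iters=200):
    """Leading principal direction via power iteration on the covariance matrix."""
    d = len(mu)
    centered = [[x - m for x, m in zip(r, mu)] for r in rows]
    cov = [[sum(c[i] * c[j] for c in centered) / len(rows) for j in range(d)]
           for i in range(d)]
    v = [1.0] * d
    for _ in range(iters):
        w = [sum(cov[i][j] * v[j] for j in range(d)) for i in range(d)]
        norm = sum(x * x for x in w) ** 0.5
        if norm == 0.0:  # degenerate case: keep the current direction
            break
        v = [x / norm for x in w]
    return v


def encode(row, mu, v):
    """Explicit 'dimension reduction': project a centered row onto v."""
    return sum((x - m) * vi for x, m, vi in zip(row, mu, v))


def decode(z, mu, v):
    """Explicit reverse map: lift a 1-D code back into the data space."""
    return [m + z * vi for m, vi in zip(mu, v)]


def augment(rows, n_new, noise=0.1, seed=0):
    """Sample existing codes, add Gaussian noise, and decode them (assumed scheme)."""
    rng = random.Random(seed)
    mu = mean_vector(rows)
    v = top_direction(rows, mu)
    codes = [encode(r, mu, v) for r in rows]
    new_rows = [decode(rng.choice(codes) + rng.gauss(0.0, noise), mu, v)
                for _ in range(n_new)]
    return new_rows, mu, v
```

Because both maps are written out explicitly, each generated point can be traced back to a specific code and direction, which is the kind of transparency the passage contrasts with neural encoders and decoders. A linear projection is, of course, far simpler than the energy-based construction the paper proposes.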
