Representing Data as Atoms: Unifying Intra- and Inter-Sample Relationship to Discretize Data Representation
Yi-Lin Tuan, Zih-Yun Chiu, William Yang Wang
The quality of data representation is paramount for the performance of a model. Recent research has focused on enhancing representation learning by incorporating more information about the intra-sample structures of individual data points, such as local and global attention. Additionally, researchers have explored methods to model inter-sample relationships, including manifold, contrastive, and discrete representation learning. In this study, we introduce a new training loss that considers both intra-sample structure and inter-sample relationships, leveraging the concept of *atoms* to represent data points. This new approach, *Atom Modeling*, offers a fresh perspective on discretizing data representations within a continuous space. Through experiments, we demonstrate that Atom Modeling enhances the performance of existing models on classification and generation tasks across diverse domains, including vision and language. These findings underscore the potential of Atom Modeling to enhance data representation and improve model learning, suggesting a promising direction for future research.
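The abstract does not spell out the form of the Atom Modeling loss, so the sketch below is only an illustration of the general idea it describes: combining an intra-sample term over a data point's internal structure with an inter-sample term that spreads sample representations apart in continuous space, loosely like atoms keeping their distance. The function name `atom_like_loss`, the particular intra-/inter-sample terms, and the `min_distance` margin are all assumptions for illustration, not the paper's actual objective.

```python
# Illustrative sketch only -- not the Atom Modeling loss from the paper.
# (a) Intra-sample term: ties a sample's internal structure (token embeddings)
#     to its pooled representation.
# (b) Inter-sample term: hinge-style repulsion that keeps different samples'
#     pooled representations at least a margin apart.
import torch
import torch.nn.functional as F


def atom_like_loss(token_emb: torch.Tensor,    # (batch, seq_len, dim)
                   sample_emb: torch.Tensor,   # (batch, dim), e.g. mean-pooled
                   min_distance: float = 1.0) -> torch.Tensor:
    # Intra-sample term: pull each token toward its own sample representation.
    intra = ((token_emb - sample_emb.unsqueeze(1)) ** 2).mean()

    # Inter-sample term: repel distinct samples so their representations spread
    # out ("discretize") within the continuous embedding space.
    dists = torch.cdist(sample_emb, sample_emb)               # (batch, batch)
    off_diag = ~torch.eye(len(sample_emb), dtype=torch.bool,
                          device=sample_emb.device)
    inter = F.relu(min_distance - dists[off_diag]).mean()

    return intra + inter


if __name__ == "__main__":
    tokens = torch.randn(8, 16, 32)     # 8 samples, 16 tokens, 32-dim embeddings
    pooled = tokens.mean(dim=1)         # simple mean pooling as the sample vector
    print(atom_like_loss(tokens, pooled).item())
```

In practice, a term of this flavor would presumably be added to the usual task loss (e.g., cross-entropy for classification, negative log-likelihood for generation) rather than trained on its own; the margin plays the role of a minimal inter-atom distance.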
Dec-3-2023
- Country:
- North America > United States > California (0.28)
- Genre:
- Research Report > Experimental Study (0.68)
- Research Report > New Finding (0.86)
- Industry:
- Health & Medicine > Diagnostic Medicine (0.46)
- Technology:
- Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (1.00)
- Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (0.93)
- Information Technology > Artificial Intelligence > Natural Language (1.00)
- Information Technology > Artificial Intelligence > Representation & Reasoning (1.00)
- Information Technology > Artificial Intelligence > Vision (1.00)
- Information Technology > Data Science (0.88)
- Information Technology > Sensing and Signal Processing > Image Processing (1.00)