Appendix

Neural Information Processing Systems

This is the Appendix for "Self-Supervised Learning Disentangled Group Representation as Feature". Table .1 summarizes the abbreviations and the symbols used in the main paper.

Table .1: List of abbreviations and symbols used in the paper.

Abbreviation/Symbol    Meaning

Abbreviation
SSL                    Self-supervised Learning
SL                     Supervised Learning
DCI                    Disentangle Metric for Informativeness
IRS                    Interventional Robustness Score
EXP                    Explicitness Score
MOD                    Modularity Score
LR                     Logistic Regression
GBT                    Gradient Boosted Trees
OOD                    Out-Of-Distributed

Symbol in Theory
U                      Semantic space
X                      Vector space
I                      Image space
G                      Group
G(x)                   Group orbit w.r.t. G containing the sample x
ϕ                      Image generation process U → I
φ                      Visual representation I → X
f                      Semantic representation U → X
m                      The number of decomposed subgroups

Symbol in Algorithm
P                      Partition of dataset
P                      Learned partition through Eq. (3)
P                      Set of partitions used in Eq. (2)
N                      Number of training images
θ                      "Dummy" parameter used by IRM
I                      Image
X

Section A provides the preliminary knowledge about group theory. Section D presents the additional experimental results.

A Preliminaries

Groups often arise as transformations of some space, such as a set, vector space, or topological space. The set of clockwise rotations w.r.t. its centroid to retain [...] We say this group of rotations acts on the triangle, which is formally defined below.
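As a concrete illustration of a group acting on a triangle, the following minimal sketch assumes an equilateral triangle whose symmetry-preserving clockwise rotations are multiples of 120 degrees (the source text does not state the angles, so this is an assumption). The names `compose`, `act`, `orbit`, and `is_group_action` are hypothetical helpers introduced only for this example. The check covers the two standard axioms of a group action: the identity leaves every point fixed, and acting by a composed element equals acting twice in sequence.

```python
# Assumption: an equilateral triangle, so there are three rotations
# (0, 120, 240 degrees) that map the triangle onto itself.
N = 3


def compose(g, h):
    """Group operation: rotate by h steps, then by g steps (one step = 120 degrees)."""
    return (g + h) % N


def act(g, vertex):
    """Action of rotation g on a vertex label in {0, 1, 2}."""
    return (vertex + g) % N


def orbit(vertex):
    """All vertices reachable from `vertex` under the group action."""
    return {act(g, vertex) for g in range(N)}


def is_group_action():
    """Check the identity and compatibility axioms exhaustively."""
    identity_ok = all(act(0, v) == v for v in range(N))
    compat_ok = all(
        act(compose(g, h), v) == act(g, act(h, v))
        for g in range(N)
        for h in range(N)
        for v in range(N)
    )
    return identity_ok and compat_ok
```

Here the orbit of any vertex is the full vertex set, which mirrors the notion of a group orbit G(x) listed in the symbol table: the set of all points reachable from x by elements of G.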



Self-Supervised Learning Disentangled Group Representation as Feature

Neural Information Processing Systems

A good visual representation is an inference map from observations (images) to features (vectors) that faithfully reflects the hidden modularized generative factors (semantics). In this paper, we formulate the notion of "good" representation from a group-theoretic view using Higgins' definition of disentangled representation, and show that existing Self-Supervised Learning (SSL) only disentangles simple augmentation features such as rotation and colorization, and is thus unable to modularize the remaining semantics. To break this limitation, we propose an iterative SSL algorithm: Iterative Partition-based Invariant Risk Minimization (IP-IRM), which successfully grounds the abstract semantics and the group acting on them into concrete contrastive learning. At each iteration, IP-IRM first partitions the training samples into two subsets that correspond to an entangled group element. We prove that IP-IRM converges to a fully disentangled representation and show its effectiveness on various benchmarks.
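To make the combination of a two-subset partition with Invariant Risk Minimization more tangible, here is a rough sketch of one way a partition-based IRM objective over a contrastive loss could be scored. It is not the paper's implementation. It assumes an InfoNCE-style loss, an IRMv1-style penalty (the squared gradient of the loss with respect to a scalar "dummy" parameter θ fixed at 1), and a hypothetical weight `lam`. All function names, and the representation of each sample as a list of similarities with the positive pair first, are assumptions made for illustration.

```python
import math


def info_nce(sims, theta=1.0):
    """Contrastive loss for one sample; sims[0] is the positive-pair similarity."""
    scaled = [theta * s for s in sims]
    m = max(scaled)  # subtract the max for numerical stability
    log_z = m + math.log(sum(math.exp(s - m) for s in scaled))
    return log_z - scaled[0]


def grad_theta(sims, theta=1.0):
    """Analytic derivative of info_nce w.r.t. theta: E_softmax[s] - s_positive."""
    scaled = [theta * s for s in sims]
    m = max(scaled)
    weights = [math.exp(s - m) for s in scaled]
    expected = sum(w * s for w, s in zip(weights, sims)) / sum(weights)
    return expected - sims[0]


def subset_risk(subset, lam):
    """Mean loss on one subset plus the IRMv1-style penalty at theta = 1."""
    loss = sum(info_nce(s) for s in subset) / len(subset)
    grad = sum(grad_theta(s) for s in subset) / len(subset)
    return loss + lam * grad ** 2


def irm_objective(samples, partition, lam=1.0):
    """Sum the penalized risk over the two subsets of a binary partition.

    `partition[i]` is 0 or 1 and assigns sample i to a subset (assumed encoding).
    """
    subsets = [
        [s for s, side in zip(samples, partition) if side == k] for k in (0, 1)
    ]
    return sum(subset_risk(sub, lam) for sub in subsets if sub)
```

The penalty is zero only when θ = 1 is a stationary point of the loss on a subset, so a representation that is simultaneously optimal across both subsets incurs no extra cost. How the partition itself is learned is not described in this section and is left out of the sketch.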