Augmentation-Aware Self-Supervision for Data-Efficient GAN Training

Neural Information Processing Systems

We further encourage the generator to learn adversarially from the self-supervised discriminator by generating data whose augmentations are predictable, as they are for real but not fake data.





Self-Supervised Few-Shot Learning on Point Clouds

Neural Information Processing Systems

Furthermore, our self-supervised learning network is restricted to pre-training on the support set (comprising the scarce training examples) used to train the downstream network in a few-shot learning (FSL) setting. Finally, the fully trained self-supervised network's point embeddings are fed as input to the downstream task's network.
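The two-stage flow described above can be sketched as follows. This is a minimal illustration only: the encoder, the embedding dimensions, and the random-projection stand-in for the self-supervised pretraining are all assumptions, not the paper's actual architecture.

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical stand-ins: a real implementation would train a point-cloud
# encoder by optimizing a self-supervised objective on the support set.
def pretrain_self_supervised(support_set):
    """Fit encoder weights using ONLY the support set (the scarce FSL examples)."""
    dim = support_set.shape[2]
    # Placeholder "learned" projection; in practice these weights come from
    # self-supervised training restricted to the support set.
    return rng.standard_normal((dim, 16))

def embed(encoder_weights, clouds):
    """Produce per-cloud embeddings by projecting points, then max-pooling."""
    return (clouds @ encoder_weights).max(axis=1)

# Toy N-way K-shot support set: 6 clouds, 128 points each, 3 coordinates.
support = rng.standard_normal((6, 128, 3))
weights = pretrain_self_supervised(support)  # stage 1: SSL on support only
features = embed(weights, support)           # stage 2: embeddings go downstream
print(features.shape)  # (6, 16)
```

The point of the sketch is the restriction in stage 1: no data outside the support set is touched before the downstream network consumes the embeddings.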


Self-Supervised GANs with Label Augmentation

Neural Information Processing Systems

Recently, transformation-based self-supervised learning has been applied to generative adversarial networks (GANs) to mitigate catastrophic forgetting in the discriminator by introducing a stationary learning environment. However, the separate self-supervised tasks in existing self-supervised GANs pursue a goal inconsistent with generative modeling, because their self-supervised classifiers are agnostic to the generator distribution. To address this problem, we propose a novel self-supervised GAN that unifies the GAN task with the self-supervised task by augmenting the GAN labels (real or fake) via self-supervision of data transformation. Specifically, the original discriminator and self-supervised classifier are unified into a label-augmented discriminator that predicts the augmented labels, making it aware of both the generator distribution and the data distribution under every transformation; the discrepancy between the two is then used to optimize the generator. Theoretically, we prove that the optimal generator converges to replicate the real data distribution. Empirically, we show that the proposed method significantly outperforms previous self-supervised and data-augmentation GANs on both generative modeling and representation learning across benchmark datasets.
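The label-augmentation idea can be sketched concretely. Assuming K rotations as the transformation set (a common choice; the specific scheme below is illustrative), the 2-way real/fake label and the K-way transformation label are fused into a single 2K-way label, so one classifier observes both distributions under every transformation:

```python
# Minimal sketch of GAN label augmentation (K and the label layout assumed).
K = 4  # number of data transformations, e.g. 0/90/180/270-degree rotations

def augmented_label(rotation_idx, is_real):
    """Unified 2K-way label: real samples map to 0..K-1, fake to K..2K-1."""
    return rotation_idx if is_real else K + rotation_idx

# Enumerate all augmented labels: 4 real classes, then 4 fake classes.
labels = [augmented_label(k, r) for r in (True, False) for k in range(K)]
print(labels)  # [0, 1, 2, 3, 4, 5, 6, 7]
```

A discriminator trained as a 2K-way classifier over these labels exposes, for each transformation, the discrepancy between real and generated data, which is what the generator is optimized against.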


Structurally Refined Graph Transformer for Multimodal Recommendation

Shi, Ke, Zhang, Yan, Zhang, Miao, Chen, Lifan, Yi, Jiali, Xiao, Kui, Hou, Xiaoju, Li, Zhifei

arXiv.org Artificial Intelligence

Abstract--Multimodal recommendation systems utilize various types of information, including images and text, to enhance the effectiveness of recommendations. The key challenge is predicting user purchasing behavior from the available data. Existing approaches, however, rely heavily on a single semantic framework (e.g., local or global semantics), resulting in an incomplete or biased representation of user preferences, particularly those less expressed in prior interactions. Furthermore, these approaches fail to capture the complex interactions between users and items, limiting the model's ability to serve diverse users. To address these challenges, we present SRGFormer, a structurally optimized multimodal recommendation model. By modifying the transformer for better integration into our model, we capture the overall behavior patterns of users. Then, we enhance structural information by embedding multimodal information into a hypergraph structure to aid in learning the local structures between users and items. Meanwhile, applying self-supervised tasks to user-item collaborative signals enhances the integration of multimodal information, thereby revealing the representational features inherent to each data modality. Extensive experiments on three public datasets show that SRGFormer surpasses previous benchmark models, achieving an average performance improvement of 4.47% on the Sports dataset. The swift growth of online data has led platforms to implement multimodal recommendation systems, initially using collaborative filtering (CF) to analyze user preferences from historical interactions [1], [2]. However, CF struggles to handle sparse or non-existent interaction records, leading to less accurate predictions.