Self-Supervised Few-Shot Learning on Point Clouds
Furthermore, our self-supervised learning network is restricted to pre-training on the support set (comprising the scarce training examples) used to train the downstream network in a few-shot learning (FSL) setting. Finally, the fully trained self-supervised network's point embeddings are fed as input to the downstream task's network.
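The pipeline described above (self-supervised pre-training on the support set alone, then feeding the frozen encoder's embeddings to the downstream few-shot classifier) can be sketched minimally in numpy. This is an illustrative toy, not the paper's architecture: the rotation-prediction pretext task, the linear+ReLU+mean-pool encoder, and the nearest-centroid downstream classifier are all assumptions chosen for brevity.

```python
import numpy as np

rng = np.random.default_rng(0)

def rotate_z(points, angle):
    """Rotate an (N, 3) point cloud about the z-axis."""
    c, s = np.cos(angle), np.sin(angle)
    R = np.array([[c, -s, 0.0], [s, c, 0.0], [0.0, 0.0, 1.0]])
    return points @ R.T

def embed(points, W):
    """Toy point-cloud encoder: per-point linear map, ReLU, mean-pool."""
    return np.maximum(points @ W, 0.0).mean(axis=0)

# Support set: the scarce few-shot examples (one cloud per class here).
support = [rng.normal(size=(64, 3)) for _ in range(5)]

# Pretext task (an illustrative choice, not the paper's): predict which
# of four z-rotations was applied, training only on the support set.
angles = np.array([0.0, np.pi / 2, np.pi, 3 * np.pi / 2])
W = rng.normal(scale=0.1, size=(3, 16))   # encoder weights
V = rng.normal(scale=0.1, size=(16, 4))   # pretext classifier head
lr = 0.1
for _ in range(200):
    cloud = support[rng.integers(len(support))]
    k = rng.integers(4)
    P = rotate_z(cloud, angles[k])
    H = np.maximum(P @ W, 0.0)            # (N, 16) per-point features
    z = H.mean(axis=0)                    # pooled embedding
    logits = z @ V
    p = np.exp(logits - logits.max()); p /= p.sum()
    dlogits = p.copy(); dlogits[k] -= 1.0  # cross-entropy gradient
    dz = V @ dlogits
    V -= lr * np.outer(z, dlogits)
    dH = (H > 0) * dz / len(P)
    W -= lr * P.T @ dH

# Downstream FSL step: the pre-trained encoder's embeddings feed a
# nearest-centroid classifier over the support classes.
protos = np.stack([embed(c, W) for c in support])
query = support[2] + rng.normal(scale=0.01, size=(64, 3))
pred = int(np.argmin(np.linalg.norm(protos - embed(query, W), axis=1)))
print(pred)
```

Because the query is a small perturbation of the third support cloud, the nearest-centroid step recovers its class from the encoder's embeddings alone.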
Self-Supervised GANs with Label Augmentation
Recently, transformation-based self-supervised learning has been applied to generative adversarial networks (GANs) to mitigate catastrophic forgetting in the discriminator by introducing a stationary learning environment. However, the separate self-supervised tasks in existing self-supervised GANs introduce an objective inconsistent with generative modeling, because their self-supervised classifiers are agnostic to the generator distribution. To address this problem, we propose a novel self-supervised GAN that unifies the GAN task with the self-supervised task by augmenting the GAN labels (real or fake) via self-supervision of data transformation. Specifically, the original discriminator and self-supervised classifier are unified into a label-augmented discriminator that predicts the augmented labels, making it aware of both the generator distribution and the data distribution under every transformation; the discrepancy between them is then used to optimize the generator. Theoretically, we prove that the optimal generator converges to replicate the real data distribution. Empirically, we show that the proposed method significantly outperforms previous self-supervised and data-augmentation GANs on both generative modeling and representation learning across benchmark datasets.
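The core construction in this abstract is the label augmentation itself: with K transformations, the discriminator's binary real/fake output becomes a 2K-way classification over (real/fake, transformation) pairs. The sketch below illustrates that label scheme and the resulting losses in numpy; the feature vectors, the linear head, and the exact form of the generator loss are illustrative assumptions, not the paper's implementation.

```python
import numpy as np

K = 4  # number of data transformations (e.g., 0/90/180/270-degree rotations)

def augment_label(is_real: bool, k: int) -> int:
    """Map (real/fake, transformation index k) to one of 2K classes.

    Classes 0..K-1 mean 'real under transformation k';
    classes K..2K-1 mean 'fake under transformation k'.
    """
    return k if is_real else K + k

def softmax(logits):
    e = np.exp(logits - logits.max(axis=-1, keepdims=True))
    return e / e.sum(axis=-1, keepdims=True)

# Hypothetical unified discriminator head: logits over 2K augmented labels.
rng = np.random.default_rng(0)
features = rng.normal(size=(8, 32))            # batch of sample features
head = rng.normal(scale=0.1, size=(32, 2 * K))
probs = softmax(features @ head)               # shape (8, 2K)

# Discriminator objective: cross-entropy against the augmented label.
labels = np.array([augment_label(i % 2 == 0, k=i % K) for i in range(8)])
d_loss = -np.log(probs[np.arange(8), labels]).mean()

# Generator objective (sketch): for a fake sample shown under
# transformation k, push probability mass from the 'fake, k' class toward
# the 'real, k' class -- the discrepancy between generator and data
# distributions that the unified discriminator exposes.
k = 1
g_loss = (np.log(probs[:, K + k]) - np.log(probs[:, k])).mean()
print(d_loss, g_loss)
```

Because every class encodes both a real/fake decision and a transformation, a single classifier carries the signal that previously required two separate networks.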
Structurally Refined Graph Transformer for Multimodal Recommendation
Shi, Ke, Zhang, Yan, Zhang, Miao, Chen, Lifan, Yi, Jiali, Xiao, Kui, Hou, Xiaoju, Li, Zhifei
Abstract--Multimodal recommendation systems utilize various types of information, including images and text, to enhance the effectiveness of recommendations. The key challenge is predicting user purchasing behavior from the available data. However, existing methods rely heavily on a single semantic framework (e.g., local or global semantics), resulting in an incomplete or biased representation of user preferences, particularly those less expressed in prior interactions. Furthermore, these approaches fail to capture the complex interactions between users and items, limiting the model's ability to serve diverse users. To address these challenges, we present SRGFormer, a structurally optimized multimodal recommendation model. By modifying the transformer for better integration into our model, we capture the overall behavior patterns of users. Then, we enhance structural information by embedding multimodal information into a hypergraph structure to aid in learning the local structures between users and items. Meanwhile, applying self-supervised tasks to user-item collaborative signals enhances the integration of multimodal information, thereby revealing the representational features inherent to each data modality. Extensive experiments on three public datasets reveal that SRGFormer surpasses previous benchmark models, achieving an average performance improvement of 4.47% on the Sports dataset.

The swift growth of online data has led platforms to implement multimodal recommendation systems, initially using collaborative filtering (CF) to analyze user preferences from historical interactions [1], [2]. However, CF struggles with sparse or non-existent interaction records, leading to less accurate predictions.
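The abstract's "self-supervised tasks on user-item collaborative signals" for integrating multimodal information is commonly realized as a contrastive alignment between two views of the same item: its collaborative (interaction-derived) embedding and its fused multimodal embedding. The numpy sketch below shows one such InfoNCE-style alignment loss; the embeddings, correlation structure, and temperature are illustrative assumptions, since the abstract does not specify the task's form.

```python
import numpy as np

rng = np.random.default_rng(0)
n_items, dim = 6, 8

# Hypothetical item embeddings: one view from user-item collaborative
# signals, one fused from multimodal (image + text) features. The two
# views are made correlated here to stand in for a trained model.
collab = rng.normal(size=(n_items, dim))
multimodal = collab + 0.1 * rng.normal(size=(n_items, dim))

def l2norm(x):
    return x / np.linalg.norm(x, axis=-1, keepdims=True)

def info_nce(a, b, tau=0.2):
    """Contrastive self-supervised loss aligning matching item views."""
    sims = l2norm(a) @ l2norm(b).T / tau  # (n, n) similarity matrix
    logp = sims - np.log(np.exp(sims).sum(axis=1, keepdims=True))
    return -np.diag(logp).mean()          # matching pairs on the diagonal

loss = info_nce(collab, multimodal)
chance = np.log(n_items)  # expected loss for uninformative, random views
print(loss < chance)
```

When the two views of each item agree, the loss falls well below the log(n) chance level; minimizing it pulls the modality representations toward the collaborative signal, which is the integration effect the abstract describes.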