Comprehensive Attention Self-Distillation for Weakly-Supervised Object Detection Supplementary Material

Neural Information Processing Systems

Recall that WSOD conducts classification on object proposals (e.g., bounding boxes generated by Selective Search). Figure 1 shows both the success and the failure cases of CASD; the failure cases could be improved by hard-sample mining during CASD training. The localization advantage of CASD comes from its learning of comprehensive attention (see the bottom row of Figure 1, which overlays the CASD attentions). Note that CorLoc evaluates only the localization accuracy of detectors.
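Since the passage leans on CorLoc as the localization metric, a brief sketch may help: CorLoc counts an image as correctly localized when the top-scoring predicted box for a class overlaps some ground-truth box of that class with IoU above 0.5. The function names below are illustrative, not from the paper:

```python
def iou(a, b):
    """Intersection-over-union of two boxes given as (x1, y1, x2, y2)."""
    ix1, iy1 = max(a[0], b[0]), max(a[1], b[1])
    ix2, iy2 = min(a[2], b[2]), min(a[3], b[3])
    inter = max(0.0, ix2 - ix1) * max(0.0, iy2 - iy1)
    area = lambda r: (r[2] - r[0]) * (r[3] - r[1])
    union = area(a) + area(b) - inter
    return inter / union if union > 0 else 0.0

def corloc(top_boxes, gt_boxes, thresh=0.5):
    """Fraction of images whose top-scoring box hits a ground-truth box.

    top_boxes: one top-scoring predicted box per image.
    gt_boxes:  list of ground-truth boxes per image (same class).
    """
    hits = sum(
        any(iou(pred, gt) > thresh for gt in gts)
        for pred, gts in zip(top_boxes, gt_boxes)
    )
    return hits / len(top_boxes)
```

Unlike mAP, this metric is computed only over positive images and ignores classification scores beyond picking the single top box, which is why it isolates localization quality.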



Confidence-Aware Self-Distillation for Multimodal Sentiment Analysis with Incomplete Modalities

Luo, Yanxi, Wang, Shijin, Xu, Zhongxing, Li, Yulong, Tang, Feilong, Su, Jionglong

arXiv.org Artificial Intelligence

Multimodal sentiment analysis (MSA) aims to understand human sentiment through multimodal data. In real-world scenarios, practical factors often lead to uncertain modality missingness. Existing methods for handling modality missingness are based on data reconstruction or common subspace projections. However, these methods neglect the confidence in multimodal combinations and impose constraints on intra-class representation, hindering the capture of modality-specific information and resulting in suboptimal performance. To address these challenges, we propose a Confidence-Aware Self-Distillation (CASD) strategy that effectively incorporates multimodal probabilistic embeddings via a mixture of Student's $t$-distributions, enhancing its robustness by incorporating confidence and accommodating heavy-tailed properties. This strategy estimates joint distributions with uncertainty scores and reduces uncertainty in the student network by consistency distillation. Furthermore, we introduce a reparameterization representation module that facilitates CASD in robust multimodal learning by sampling embeddings from the joint distribution for the prediction module to calculate the task loss. As a result, the directional constraint from the loss minimization is alleviated by the sampled representation. Experimental results on three benchmark datasets demonstrate that our method achieves state-of-the-art performance.
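The abstract describes sampling embeddings from a confidence-weighted mixture of Student's t-distributions via reparameterization. A minimal numerical sketch of that idea follows; it assumes diagonal scale parameters and uses the standard construction of a t-variate from a Gaussian and a chi-squared draw (all names here are hypothetical, not from the paper):

```python
import numpy as np

def sample_student_t(loc, scale, df, rng):
    """Reparameterized draw from a diagonal Student's t-distribution:
    t = loc + scale * z / sqrt(g / df), with z ~ N(0, I), g ~ Chi2(df).
    The heavy tails (relative to a Gaussian) come from dividing by the
    chi-squared factor."""
    z = rng.standard_normal(np.shape(loc))
    g = rng.chisquare(df, size=np.shape(loc))
    return loc + scale * z / np.sqrt(g / df)

def sample_mixture(locs, scales, dfs, weights, rng):
    """Pick a mixture component by its confidence weight, then sample
    from that component -- a simple stand-in for drawing from a
    confidence-weighted mixture of t-distributions."""
    k = rng.choice(len(weights), p=weights)
    return sample_student_t(locs[k], scales[k], dfs[k], rng)
```

Because the prediction loss is computed on such sampled embeddings rather than on a single deterministic point, the gradient no longer pushes every intra-class representation toward one fixed direction, which is the "alleviated directional constraint" the abstract refers to.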