Denoising Bottleneck with Mutual Information Maximization for Video Multimodal Fusion