MITFAS: Mutual Information based Temporal Feature Alignment and Sampling for Aerial Video Action Recognition