Labeling the Features Not the Samples: Efficient Video Classification with Minimal Supervision