Using Closed Captions as Supervision for Video Activity Recognition