Learning to Maximize Mutual Information for Dynamic Feature Selection