Efficiently Learning Small Policies for Locomotion and Manipulation