Less is More: A Closer Look at Multi-Modal Few-Shot Learning