Realistic evaluation of transductive few-shot learning