Towards flexible perception with visual memory