Efficient In-Context Learning in Vision-Language Models for Egocentric Videos