Cognitive resilience: Unraveling the proficiency of image-captioning models to interpret masked visual content