Benchmarking Pretrained Attention-based Models for Real-Time Recognition in Robot-Assisted Esophagectomy