Beam Selection in ISAC using Contextual Bandit with Multi-modal Transformer and Transfer Learning