Unified Lexical Representation for Interpretable Visual-Language Alignment

Yifan Li¹, Yikai Wang¹, Dongyu Ru