GLUE: Global-Local Unified Encoding for Imitation Learning via Key-Patch Tracking