Vision-Language Fusion for Object Recognition