Bridging the Gap between Object and Image-level Representations for Open-Vocabulary Detection