Bridging the Gap Between Object Detection and User Intent via Query-Modulation