Revisiting Deep Audio-Text Retrieval Through the Lens of Transportation