TVLT: Textless Vision-Language Transformer