Vectorizing string entries for data processing on tables: when are larger language models better?