Unsupervised Cross-Modal Alignment of Speech and Text Embedding Spaces

Yu-An Chung, Wei-Hung Weng, Schrasing Tong, James Glass

Feb-12-2026, 09:31:31 GMT–Neural Information Processing Systems

Recently, there has been increasing interest in learning the semantics of a language directly, and solely, from raw speech [24,27,28].

machine learning, natural language, translation


Country:
- North America
  - United States > Massachusetts
    - Middlesex County > Cambridge (0.04)
  - Canada > Quebec
    - Montreal (0.04)
- Europe > Italy
  - Calabria > Catanzaro Province > Catanzaro (0.04)

Technology:
- Information Technology > Artificial Intelligence
  - Natural Language (1.00)
  - Machine Learning > Neural Networks (0.95)
