
TokenVerse++: Towards Flexible Multitask Learning with Dynamic Task Activation

Kumar, Shashi, Madikeri, Srikanth, Villatoro-Tello, Esaú, Burdisso, Sergio, Rangappa, Pradeep, Carofilis, Andrés, Motlicek, Petr, Pandia, Karthik, Venkatesan, Shankar, Hacioğlu, Kadri, Stolcke, Andreas

arXiv.org Artificial Intelligence

Token-based multitasking frameworks like TokenVerse require all training utterances to have labels for all tasks, hindering their ability to leverage partially annotated datasets and scale effectively. We propose TokenVerse++, which introduces learnable vectors in the acoustic embedding space of the XLSR-Transducer ASR model for dynamic task activation. This core mechanism enables training with utterances labeled for only a subset of tasks, a key advantage over TokenVerse. We demonstrate this by successfully integrating a dataset with partial labels, specifically for ASR and an additional task, language identification, improving overall performance. TokenVerse++ achieves results on par with or exceeding TokenVerse across multiple tasks, establishing it as a more practical multitask alternative without sacrificing ASR performance.

Index Terms: multitask training, speech recognition, speaker change detection, named entity recognition, language identification, XLSR-Transducer.

Multitask learning enhances automatic speech recognition (ASR) by enabling multiple tasks in a single inference step, improving efficiency and functionality.
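The dynamic task activation described above can be pictured as follows. This is a minimal, hypothetical sketch (all names and the per-task vector setup are illustrative, not taken from the paper): each task gets a learnable vector, and for an utterance labeled for only a subset of tasks, only those vectors are prepended to the acoustic embedding sequence before the encoder.

```python
import random

# Hypothetical sketch of dynamic task activation; names and dimensions
# are illustrative, not the paper's actual implementation.
EMB_DIM = 4
TASKS = ["asr", "scd", "ner", "lid"]

# One learnable vector per task (randomly initialized here; in training
# these would be model parameters updated by backprop).
task_vectors = {t: [random.random() for _ in range(EMB_DIM)] for t in TASKS}

def activate_tasks(acoustic_embeddings, labeled_tasks):
    """Prepend activation vectors only for the tasks this utterance
    is labeled for, enabling training on partially annotated data."""
    prefix = [task_vectors[t] for t in TASKS if t in labeled_tasks]
    return prefix + acoustic_embeddings

# Utterance with 3 acoustic frames, labeled only for ASR and language
# identification (a partial-label example from the abstract).
frames = [[0.0] * EMB_DIM for _ in range(3)]
out = activate_tasks(frames, {"asr", "lid"})
print(len(out))  # 2 task vectors + 3 frames = 5
```

Utterances with different label coverage thus share one model while each activates only its own task heads.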


TokenVerse: Unifying Speech and NLP Tasks via Transducer-based ASR

Kumar, Shashi, Madikeri, Srikanth, Zuluaga-Gomez, Juan, Nigmatulina, Iuliia, Villatoro-Tello, Esaú, Burdisso, Sergio, Motlicek, Petr, Pandia, Karthik, Ganapathiraju, Aravind

arXiv.org Artificial Intelligence

In traditional conversational intelligence from speech, a cascaded pipeline is used, involving tasks such as voice activity detection, diarization, transcription, and subsequent processing with different NLP models for tasks like semantic endpointing and named entity recognition (NER). Our paper introduces TokenVerse, a single Transducer-based model designed to handle multiple tasks. This is achieved by integrating task-specific tokens into the reference text during ASR model training, streamlining inference and eliminating the need for separate NLP models. In addition to ASR, we conduct experiments on three tasks: speaker change detection, endpointing, and NER. Our experiments on a public and a private dataset show that the proposed method improves ASR by up to 7.7% in relative WER while outperforming the cascaded pipeline approach in individual task performance. Additionally, we present task transfer learning to a new task within an existing TokenVerse.
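The token-integration idea above can be sketched in a few lines. This is a hypothetical illustration, assuming tag-style task tokens and word-index spans; the paper's actual token inventory and placement rules may differ.

```python
# Hypothetical sketch: interleave task-specific tokens with the
# reference transcript, as in token-based multitask ASR training.
def tag_reference(words, spans):
    """words: reference transcript as a word list.
    spans: list of (start, end, token) marking e.g. an NER entity
    or a speaker-change region over word indices (inclusive)."""
    out = []
    for i, w in enumerate(words):
        for start, _end, tok in spans:
            if i == start:
                out.append(f"<{tok}>")   # opening task token
        out.append(w)
        for _start, end, tok in spans:
            if i == end:
                out.append(f"</{tok}>")  # closing task token
    return " ".join(out)

# Mark "john smith" (words 1-2) as a named entity in the reference.
ref = tag_reference(["call", "john", "smith", "today"], [(1, 2, "ner")])
print(ref)  # call <ner> john smith </ner> today
```

The ASR model is then trained on the tagged text, so a single decoding pass emits both the transcript and the task annotations.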