Build a custom speech-to-text model with speaker diarization capabilities
In this code pattern, learn how to train custom language and acoustic Speech to Text models that transcribe audio files and produce speaker-diarized output, given a corpus file and audio recordings of a meeting or classroom. One feature of the IBM Watson Speech to Text service is the ability to detect different speakers in an audio file, known as speaker diarization. This code pattern demonstrates that capability by training a custom language model with a corpus text file, which teaches the model 'Out of Vocabulary' words, and a custom acoustic model with the audio files, which teaches the model 'Accent' detection, all in a Python Flask runtime. Get detailed instructions in the README file.

This code pattern is part of the Extracting insights from videos with IBM Watson use case series, which showcases a solution for extracting meaningful insights from videos using the Watson Speech to Text, Watson Natural Language Processing, and Watson Tone Analyzer services.
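To illustrate what speaker-diarized output looks like, here is a minimal sketch of turning a Speech to Text JSON result into a per-speaker transcript. The `response` dict below is a hand-made sample that only mimics the shape of a result returned with `speaker_labels=True` (words with timestamps in `results`, and `speaker_labels` entries keyed by start time); it is not real service output, and the grouping logic is an illustrative assumption rather than the code pattern's actual implementation.

```python
# Hand-made sample mimicking the shape of a Watson Speech to Text
# response requested with speaker_labels=True (illustrative data only).
response = {
    "results": [
        {"alternatives": [{
            "transcript": "hello welcome to the meeting ",
            "timestamps": [["hello", 0.1, 0.5], ["welcome", 0.6, 1.0],
                           ["to", 1.0, 1.1], ["the", 1.1, 1.2],
                           ["meeting", 1.2, 1.7]]}]},
        {"alternatives": [{
            "transcript": "thanks glad to be here ",
            "timestamps": [["thanks", 2.0, 2.4], ["glad", 2.5, 2.8],
                           ["to", 2.8, 2.9], ["be", 2.9, 3.0],
                           ["here", 3.0, 3.4]]}]},
    ],
    "speaker_labels": [
        {"from": 0.1, "to": 0.5, "speaker": 0},
        {"from": 0.6, "to": 1.0, "speaker": 0},
        {"from": 1.0, "to": 1.1, "speaker": 0},
        {"from": 1.1, "to": 1.2, "speaker": 0},
        {"from": 1.2, "to": 1.7, "speaker": 0},
        {"from": 2.0, "to": 2.4, "speaker": 1},
        {"from": 2.5, "to": 2.8, "speaker": 1},
        {"from": 2.8, "to": 2.9, "speaker": 1},
        {"from": 2.9, "to": 3.0, "speaker": 1},
        {"from": 3.0, "to": 3.4, "speaker": 1},
    ],
}

def diarize(response):
    """Group transcribed words into segments by speaker label."""
    # Map each word's start time to the speaker assigned to it.
    speaker_at = {lbl["from"]: lbl["speaker"]
                  for lbl in response["speaker_labels"]}
    segments = []          # list of (speaker, [words])
    current = None
    for result in response["results"]:
        for word, start, _end in result["alternatives"][0]["timestamps"]:
            spk = speaker_at.get(start)
            if current and current[0] == spk:
                current[1].append(word)      # same speaker: extend segment
            else:
                current = (spk, [word])      # speaker changed: new segment
                segments.append(current)
    return [f"Speaker {spk}: {' '.join(words)}" for spk, words in segments]

for line in diarize(response):
    print(line)
# → Speaker 0: hello welcome to the meeting
# → Speaker 1: thanks glad to be here
```

In a real run, `response` would come from the service's `recognize` call against the trained custom models; the grouping step afterward would be the same.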
Nov-3-2021, 05:40:30 GMT