How to get started with Coqui's open-source, on-device speech-to-text tool
I think the transformative power of on-device speech-to-text is criminally underrated (and I'm not alone), so I'm a massive fan of the work Coqui are doing to make the technology more widely accessible. Coqui is a startup working on a complete open-source solution for speech recognition, as well as text-to-speech, and I've been lucky enough to collaborate with their team on datasets like Multilingual Spoken Words. They have great documentation already, but over the holidays I've been playing around with the code, and I always like to leave a trail of breadcrumbs if I can. In this post I'll try to show you how to get speech recognition running locally yourself in just a few minutes. I've tried it on my PopOS 21.04 laptop, but it will hopefully work on most modern Linux distributions, and it should be trivial to modify for the other platforms that Coqui provide binaries for. To accompany this post, I've also published a Colab notebook that demonstrates all these steps, which you can use from your browser on almost any system.
Apr-6-2022, 13:00:00 GMT