Mozilla DeepSpeech
DeepSpeech is a speech-to-text engine written by Mozilla.
Documentation
official github: https://github.com/mozilla/DeepSpeech
official discourse forums (very helpful): https://discourse.mozilla.org/c/deep-speech
intro tutorial: https://progur.com/2018/02/how-to-use-mozilla-deepspeech-tutorial.html
NOTE:
DeepSpeech's Python bindings do not come with documentation. Use the Java binding documentation and the documentation provided by the CLI instead.
https://github.com/mozilla/DeepSpeech/blob/master/native_client/java/libdeepspeech/src/main/java/org/mozilla/deepspeech/libdeepspeech/DeepSpeechModel.java
Install
NOTE:
There are two variations of TensorFlow available: one that uses NVIDIA GPUs, and one that uses the CPU only. I haven't had success installing cuda/deepspeech-gpu (it reports a missing library that exists in /opt/cuda).

    # CPU version
    sudo pip install deepspeech

    # check installed version
    deepspeech --version

    # download/extract models for your version
    # https://github.com/mozilla/DeepSpeech/releases
    # ex: deepspeech-0.5.1-models.tar.gz
    tar -xvf deepspeech-0.5.1-models.tar.gz
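The models archive has to match the installed deepspeech version. A minimal sketch of deriving the archive name from a version string follows. The file name pattern comes from the 0.5.1 example above. The helper names and the direct download URL layout (`/releases/download/v<version>/`) are assumptions, so check the releases page if the link does not resolve.

```python
RELEASES_URL = "https://github.com/mozilla/DeepSpeech/releases"


def models_tarball_name(version):
    """Return the models archive name for a version such as '0.5.1'."""
    # Accept 'v0.5.1' as well as '0.5.1'.
    version = version.lstrip("v")
    return "deepspeech-{}-models.tar.gz".format(version)


def models_download_url(version):
    """Guess the direct download URL for the models archive.

    ASSUMPTION: release assets live under /releases/download/v<version>/.
    This layout is not confirmed by the notes above.
    """
    version = version.lstrip("v")
    return "{}/download/v{}/{}".format(
        RELEASES_URL, version, models_tarball_name(version)
    )


# Example: print the archive name for the version used in these notes.
print(models_tarball_name("0.5.1"))
```

Pass in whatever version `deepspeech --version` reports for your install.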
Usage
Commandline
    deepspeech \
        --model models/output_graph.pb \
        --alphabet models/alphabet.txt \
        --audio /var/tmp/audio.wav \
        > text.txt
Python
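Since the Python bindings are undocumented, one low-risk option is to drive the command-line call above from a script. This is a minimal sketch that wraps the CLI rather than using the bindings. It assumes the `deepspeech` executable is on the PATH and accepts the same `--model`, `--alphabet`, and `--audio` flags shown above (as in 0.5.x). The function names are hypothetical.

```python
import subprocess
from pathlib import Path


def build_deepspeech_command(model_dir, audio_path):
    """Return the argument list for the deepspeech CLI call shown above."""
    model_dir = Path(model_dir)
    return [
        "deepspeech",
        "--model", str(model_dir / "output_graph.pb"),
        "--alphabet", str(model_dir / "alphabet.txt"),
        "--audio", str(audio_path),
    ]


def transcribe(model_dir, audio_path):
    """Run the CLI and return the transcript it prints on stdout."""
    result = subprocess.run(
        build_deepspeech_command(model_dir, audio_path),
        check=True,                # raise CalledProcessError on a non-zero exit
        stdout=subprocess.PIPE,    # capture the transcript instead of redirecting to a file
        universal_newlines=True,   # decode stdout to str
    )
    return result.stdout.strip()


# Usage (requires deepspeech and the extracted models):
# print(transcribe("models", "/var/tmp/audio.wav"))
```

Building the command as a list avoids shell quoting problems with audio paths that contain spaces.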