Python SpeechRecognition
Speech-to-text command processing, supporting multiple back-ends.
Documentation
official github        https://github.com/Uberi/speech_recognition#readme
realpython tutorial    https://realpython.com/python-speech-recognition/
Install
sudo pip install SpeechRecognition

# if using CMUsphinx
pacaur -S sphinxbase             # CMUsphinx
pacaur -S pocketsphinx           # libpocketsphinx (C++)
sudo pip install pocketsphinx    # pocketsphinx (python)
Usage
Overview
Get your device index using SpeechRecognition (pyaudio on the backend).
import speech_recognition

# misleading name, lists all audio devices
# in order of their device-indexes.
speech_recognition.Microphone.list_microphone_names()
#> ['HD Audio Pro', ...]

Listen for speech in a loop, and process commands (in this instance we are using the python pocketsphinx audio-processing backend).
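device_index = 0
mic = speech_recognition.Microphone(device_index=device_index)
recognizer = speech_recognition.Recognizer()

while True:
    with mic as source:
        # calibrate the energy threshold, then record one phrase
        recognizer.adjust_for_ambient_noise(source)
        audio = recognizer.listen(source)
    try:
        # see other recognize_* methods for other backends
        text = recognizer.recognize_sphinx(audio)
    except speech_recognition.UnknownValueError:
        continue  # nothing intelligible was heard, listen again
    print(text)

Backend Notes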
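The loop above calls recognize_sphinx directly. Since each backend is exposed as a recognize_* method on the Recognizer, one way to switch backends is to look the method up by name. This is only a rough sketch: get_backend and FakeRecognizer are hypothetical names made up for these notes (they are not part of the library), and the fake class stands in for speech_recognition.Recognizer so the snippet runs without a microphone.

```python
def get_backend(recognizer, name):
    """Return the recognizer's ``recognize_<name>`` method, e.g. name="sphinx"."""
    method = getattr(recognizer, "recognize_" + name, None)
    if not callable(method):
        raise ValueError("no such backend: " + name)
    return method


class FakeRecognizer:
    """Hypothetical stand-in for speech_recognition.Recognizer (no audio needed)."""

    def recognize_sphinx(self, audio):
        return "sphinx heard: " + audio


recognize = get_backend(FakeRecognizer(), "sphinx")
# with the real library, pass the AudioData returned by recognizer.listen()
print(recognize("hello"))
```

With a real Recognizer the same lookup should work for any backend whose recognize_* method is available in the installed version (for example "sphinx" or "google").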
CMUsphinx
CMUsphinx converts JSGF to FSG wherever a JSGF file is used for grammar.
Unfortunately, the implementation only works for a JSGF file containing a single grammar and a single rule that share the same name.
You can avoid this issue by creating your own FSG file using:
sphinx_jsgf2fsg -fsg out.fsg < grammar.jsgf
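For reference, here is a rough sketch of what a JSGF file within that limitation could look like, plus a wrapper around the same conversion command. The helper names build_jsgf and convert_to_fsg, the grammar name "commands", and the example phrases are all made up for illustration. convert_to_fsg assumes sphinx_jsgf2fsg is on your PATH and is not called here.

```python
import subprocess


def build_jsgf(name, phrases):
    """Return JSGF text with one grammar and one public rule, both called `name`."""
    alternatives = " | ".join(phrases)
    return (
        "#JSGF V1.0;\n"
        "grammar {0};\n"
        "public <{0}> = {1};\n"
    ).format(name, alternatives)


def convert_to_fsg(jsgf_path, fsg_path):
    """Run `sphinx_jsgf2fsg -fsg <fsg_path> < <jsgf_path>` (needs sphinxbase installed)."""
    with open(jsgf_path, "rb") as jsgf_file:
        subprocess.run(
            ["sphinx_jsgf2fsg", "-fsg", fsg_path],
            stdin=jsgf_file,
            check=True,
        )


# hypothetical example grammar: two phrases under a rule named like the grammar
print(build_jsgf("commands", ["lights on", "lights off"]))
```

Writing the printed text to commands.jsgf and calling convert_to_fsg("commands.jsgf", "commands.fsg") should be equivalent to the shell command above. If your installed version of recognize_sphinx accepts a grammar argument (an assumption, check its signature), the grammar file can then be passed there.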