This repo uses Google's Python API to turn speech into text. I put it together fairly quickly in my own time, so it's not the best code, but it works. Feel free to send me a pull request if you want anything changed.

To run it you need:

- A file called `creds.json` placed at the root of the repo. This is the credentials file provided by Google
- `ffmpeg` working on your system
- A Python environment with access to `google.oauth2` and `google.cloud.speech_v1p1beta1` (these should come with `pip install google-cloud-speech google-auth`, though I don't remember exactly what I installed)
- A subdirectory under the root folder called `audios`

What it does:

- Takes the audio files (ending in `mp3`) in the `audios` subdirectory
- Uses `ffmpeg` to split them into 50s chunks (as there is a limit on the Google API)
- Sends each segment of the audio to the Google API for transcription
- Combines the audio sub segments together into an output file
- Removes the temporary 50s audio files
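
The steps above could be sketched roughly as follows. This is an illustration under stated assumptions, not the code in this repo: all function names are hypothetical, the Google calls assume the 2.x release of `google-cloud-speech`, the language code and the 44100 Hz sample rate are guesses that must match your audio, and the output is assumed to be one text file of joined transcripts per mp3.

```python
import subprocess
import tempfile
from pathlib import Path

CHUNK_SECONDS = 50  # chunk length taken from the README


def ffmpeg_split_command(mp3_path, chunk_dir, seconds=CHUNK_SECONDS):
    """Build the ffmpeg call that cuts one mp3 into fixed-length pieces."""
    pattern = Path(chunk_dir) / f"{Path(mp3_path).stem}_%03d.mp3"
    return ["ffmpeg", "-i", str(mp3_path), "-f", "segment",
            "-segment_time", str(seconds), "-c", "copy", str(pattern)]


def make_google_transcriber(creds_path="creds.json", language="en-US",
                            sample_rate=44100):
    """Return a function mapping a chunk path to its transcript.

    Assumes google-cloud-speech 2.x; language and sample_rate are guesses.
    """
    # Imported here so the rest of the sketch works without the library.
    from google.cloud import speech_v1p1beta1 as speech
    from google.oauth2 import service_account

    credentials = service_account.Credentials.from_service_account_file(creds_path)
    client = speech.SpeechClient(credentials=credentials)
    config = speech.RecognitionConfig(
        encoding=speech.RecognitionConfig.AudioEncoding.MP3,
        sample_rate_hertz=sample_rate,
        language_code=language,
    )

    def transcribe(chunk_path):
        audio = speech.RecognitionAudio(content=Path(chunk_path).read_bytes())
        response = client.recognize(config=config, audio=audio)
        return " ".join(r.alternatives[0].transcript
                        for r in response.results if r.alternatives)

    return transcribe


def transcribe_chunks(chunk_dir, transcribe):
    """Transcribe every chunk in name order and join the pieces with newlines."""
    chunks = sorted(Path(chunk_dir).glob("*.mp3"))
    return "\n".join(transcribe(chunk) for chunk in chunks)


def process_file(mp3_path, transcribe, out_dir="."):
    """Split one mp3, transcribe the chunks, and write <name>.txt to out_dir."""
    mp3_path = Path(mp3_path)
    # The temporary directory, and the chunks in it, are removed on exit.
    with tempfile.TemporaryDirectory() as chunk_dir:
        subprocess.run(ffmpeg_split_command(mp3_path, chunk_dir), check=True)
        text = transcribe_chunks(chunk_dir, transcribe)
    out_path = Path(out_dir) / f"{mp3_path.stem}.txt"
    out_path.write_text(text, encoding="utf-8")
    return out_path
```

To run it over the whole folder, create a transcriber once with `make_google_transcriber()` and call `process_file(path, transcribe)` for each path in `sorted(Path("audios").glob("*.mp3"))`. Passing the transcriber in as a function keeps the splitting and joining logic testable without network access.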