Python Colab notebook for speech recognition with wav2vec2. Because wav2vec2 is GPU-intensive, I've come up with a way to run it on Google Colab as well as on local machines with a minimal GPU.
Punctuation and capitalization are also applied to make the transcribed sentences easier to understand.
For Testing:
1. Open the notebook in Colab.
2. Make sure the Colab runtime is connected to a GPU.
3. Paste any YouTube video ID here: YOUTUBE_ID = 'Paste Here'. Alternatively, upload your own file to Colab (make adjustments as necessary).
4. Run the notebook.

You'll find your transcription as well as the Word Error Rate (WER) and Character Error Rate (CER).
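For readers unfamiliar with the two metrics, here is a minimal sketch of how WER and CER are commonly computed: the edit distance between a reference transcript and the model output, divided by the length of the reference. This is not necessarily how the notebook computes them (it may rely on a library instead); the function names `edit_distance`, `wer`, and `cer` are hypothetical, and no text normalization (lowercasing, punctuation stripping) is applied, which is an assumption of this sketch.

```python
def edit_distance(ref, hyp):
    """Levenshtein distance between two sequences (lists of words or strings)."""
    # prev[j] holds the distance between the processed prefix of ref and hyp[:j].
    prev = list(range(len(hyp) + 1))
    for i, r in enumerate(ref, 1):
        curr = [i]
        for j, h in enumerate(hyp, 1):
            cost = 0 if r == h else 1
            curr.append(min(prev[j] + 1,          # deletion
                            curr[j - 1] + 1,      # insertion
                            prev[j - 1] + cost))  # substitution or match
        prev = curr
    return prev[-1]


def wer(reference, hypothesis):
    """Word Error Rate: word-level edit distance / number of reference words."""
    ref_words = reference.split()
    if not ref_words:
        raise ValueError("reference must contain at least one word")
    return edit_distance(ref_words, hypothesis.split()) / len(ref_words)


def cer(reference, hypothesis):
    """Character Error Rate: character-level edit distance / reference length."""
    if not reference:
        raise ValueError("reference must not be empty")
    return edit_distance(reference, hypothesis) / len(reference)


if __name__ == "__main__":
    # Hypothetical example strings, not output from the notebook.
    print(wer("the cat sat on the mat", "the cat sat"))
    print(cer("abc", "abd"))
```

Both values are 0 for a perfect transcription and can exceed 1 when the output contains many extra words or characters. In practice, lowercasing and removing punctuation from both strings before scoring usually gives a fairer comparison.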