PocketSphinx Speech Recognition

Overview

This project implements a basic speech recognition system using PocketSphinx. It currently serves as a template that captures audio input, converts it to text, and demonstrates command execution from recognized speech.

Installation

Follow these steps to install and build the project:

Clone the repository:

cd ~
git clone https://github.com/cmusphinx/pocketsphinx.git

Install dependencies:

sudo apt install \
     ffmpeg \
     libasound2-dev \
     libportaudio2 \
     libportaudiocpp0 \
     libpulse-dev \
     libsox-fmt-all \
     portaudio19-dev \
     sox

Build and install PocketSphinx:

cd ~/pocketsphinx
mkdir build
cd build
cmake ..
cmake --build .
sudo cmake --build . --target install

Build and Run

Build the Program

Run the following commands from the root of this project (the folder that contains the Makefile), not from the PocketSphinx build directory.

make

Build with other languages: RU/DE

make MODEL_DE=1

# or

make MODEL_RU=1
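
The MODEL_DE and MODEL_RU variables presumably reach the compiler as preprocessor definitions, since the model-loading code in the Language models section tests them with #ifdef. The following is a minimal sketch of that mechanism, assuming the Makefile passes -DMODEL_DE or -DMODEL_RU (the source does not show the Makefile). selected_model_dir() is a hypothetical helper, not a function from this project, and unlike the project's independent #ifdef blocks it picks at most one folder.

```c
#include <stddef.h>

/* Hypothetical helper: returns the model folder chosen at compile time,
 * or NULL when no MODEL_* flag was defined (default setup). */
const char *selected_model_dir(void)
{
#if defined(MODEL_DE)
    return "MODEL_DE";
#elif defined(MODEL_RU)
    return "MODEL_RU";
#elif defined(MODEL_UA)
    return "MODEL_UA";
#else
    return NULL;
#endif
}
```

Compiled with -DMODEL_DE the function returns "MODEL_DE"; compiled with no flag it returns NULL, which a caller could treat as "use the default models".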

Run the Program

./live

Clean the Build

make clean

Features

Uses PocketSphinx for speech recognition.
Captures audio input via SoX.
Converts speech to text.
Demonstrates command execution using recognized phrases.
Example commands:
- Saying "browser" opens a web browser.
- Saying "browser exit" closes the browser.
The system is designed as a foundation for further development and testing of voice commands.
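
The phrase-to-command step described above can be sketched as a small lookup function. This is an illustration only: the enum, its values, and match_command() are hypothetical names, and the project's real dispatch logic in live.c may look different.

```c
#include <stddef.h>
#include <string.h>

/* Hypothetical action codes; the names are illustrative only. */
enum voice_action { ACTION_NONE, ACTION_OPEN_BROWSER, ACTION_CLOSE_BROWSER };

/* Map a recognized phrase to an action using substring matching.
 * "browser exit" is checked first because it also contains "browser". */
enum voice_action match_command(const char *hyp)
{
    if (hyp == NULL)
        return ACTION_NONE;
    if (strstr(hyp, "browser exit") != NULL)
        return ACTION_CLOSE_BROWSER;
    if (strstr(hyp, "browser") != NULL)
        return ACTION_OPEN_BROWSER;
    return ACTION_NONE;
}
```

In a real program the argument would typically be the string returned by ps_get_hyp(), and the caller would decide how to launch or close the browser for each action.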

Dependencies

PocketSphinx
SoX (for audio capture)
PortAudio (for real-time audio processing)
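
As a rough illustration of how SoX can serve as the capture front end, the helper below formats a SoX command line that records from the default input device and writes raw 16-bit signed mono samples to standard output. The function name build_sox_command() is hypothetical, and the exact flags are an assumption modeled on common PocketSphinx usage; check them against the SoX manual for your system.

```c
#include <stdio.h>

/* Hypothetical helper: format a SoX capture command for the given sample
 * rate. Returns 0 on success, -1 if the output buffer is too small. */
int build_sox_command(char *out, size_t out_size, int sample_rate)
{
    /* -d: default audio device as input; "-t raw -": raw samples to stdout */
    int n = snprintf(out, out_size,
                     "sox -q -r %d -c 1 -b 16 -e signed-integer -d -t raw -",
                     sample_rate);
    return (n < 0 || (size_t)n >= out_size) ? -1 : 0;
}
```

The resulting string could be passed to popen(cmd, "r"), with the samples read from the pipe and handed to the decoder (for example through ps_process_raw()).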

Additional Resources

GitHub: https://github.com/cmusphinx/pocketsphinx
Official documentation: https://cmusphinx.github.io
C API documentation: https://cmusphinx.github.io/doc/pocketsphinx/index.html
Acoustic and Language Models: https://sourceforge.net/projects/cmusphinx/files/Acoustic%20and%20Language%20Models/

Language models

Go to the Acoustic and Language Models page (linked under Additional Resources) and download three items for your language:

Language model. Examples: cmusphinx-voxforge-de.lm.bin, ru.lm
Dictionary. Examples: cmusphinx-voxforge-de.dic, ru.dic
Hidden Markov model (acoustic model). Examples: voxforge.cd_ptm_5000, zero_ru.cd_cont_4000

Save them in the folder for that language: MODEL_DE, MODEL_RU, MODEL_UA, and so on.

You can also find these models here: https://drive.google.com/drive/folders/1UAlBpDFsMTmH69C1u_6yrGoLEp1GN9ov

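Then rename them to dictionary.dic, hmm (the acoustic model directory), and language_model.lm. If the dictionary or the language model is in binary format, use dictionary.dic.bin or language_model.lm.bin instead.

Then reference them inside the load_modals() function, which contains one block per language flag: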
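
The naming convention above (plain name, or the same name with a .bin suffix for binary files) can be resolved at run time. The sketch below is one possible approach, not the project's check_models_files() implementation; file_exists() and resolve_model_file() are hypothetical helpers.

```c
#include <stdio.h>

/* Return 1 if the file can be opened for reading, 0 otherwise. */
int file_exists(const char *path)
{
    FILE *f = fopen(path, "rb");
    if (f == NULL)
        return 0;
    fclose(f);
    return 1;
}

/* Write "<dir>/<name>.bin" to out if that file exists, otherwise
 * "<dir>/<name>". Returns 0 on success, -1 if out is too small. */
int resolve_model_file(char *out, size_t out_size,
                       const char *dir, const char *name)
{
    int n = snprintf(out, out_size, "%s/%s.bin", dir, name);
    if (n < 0 || (size_t)n >= out_size)
        return -1;
    if (file_exists(out))
        return 0;
    /* Fall back to the plain name; it is shorter, so it always fits. */
    snprintf(out, out_size, "%s/%s", dir, name);
    return 0;
}
```

For example, resolve_model_file(buf, sizeof buf, "MODEL_DE", "language_model.lm") yields MODEL_DE/language_model.lm.bin when the binary file is present and MODEL_DE/language_model.lm otherwise.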

#ifdef MODEL_UA
    printf("\n");
    printf("✅ MODEL_UA flag is defined.\n");
    if (check_models_files(&is_run_default_setup, "MODEL_UA") == 1)
    {
        printf("Loading...\n");
    }
    printf("\n");
#endif

#ifdef MODEL_DE
    printf("\n");
    printf("✅ MODEL_DE flag is defined.\n");
    if (check_models_files(&is_run_default_setup, "MODEL_DE") == 1)
    {
        printf("Loading...\n");
    }
    printf("\n");
#endif

In the block for your language, replace the placeholder line:

  printf("Loading...\n");

with calls that load the real models (MODEL_DE shown here):

  ps_config_set_str(speech_config, "hmm", "MODEL_DE/hmm");
  ps_config_set_str(speech_config, "lm", "MODEL_DE/language_model.lm.bin");
  ps_config_set_str(speech_config, "dict", "MODEL_DE/dictionary.dic");
