AudioWise is a command-line interface (CLI) tool for extracting audio from video files, transcribing the audio using the Whisper model, and correcting the text using the LanguageTool library. It supports individual video files or folders containing multiple files.
Ensure the following prerequisites are met before using AudioWise:
- Python 3.8 or higher, available from python.org.
- FFmpeg - Required for audio extraction.
  - Install on Linux:
    sudo apt install ffmpeg
  - Install on macOS:
    brew install ffmpeg
  - For Windows, download from ffmpeg.org.
- Required Python libraries, installed with:
  pip install -r requirements.txt
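The first two prerequisites can be checked from Python before running the tool. The following is a minimal sketch, not part of AudioWise itself; the function name `check_prerequisites` is hypothetical, and it only confirms that the interpreter is recent enough and that an `ffmpeg` executable is on the PATH.

```python
import shutil
import sys

MIN_PYTHON = (3, 8)  # minimum version stated in the prerequisites


def check_prerequisites():
    """Return a list of human-readable problems; an empty list means all is well."""
    problems = []
    if sys.version_info < MIN_PYTHON:
        problems.append(
            "Python %d.%d or higher is required, found %d.%d"
            % (MIN_PYTHON + tuple(sys.version_info[:2]))
        )
    # shutil.which returns None when the executable is not on PATH
    if shutil.which("ffmpeg") is None:
        problems.append("ffmpeg was not found on PATH")
    return problems


for problem in check_prerequisites():
    print("Missing prerequisite:", problem)
```

If nothing is printed, both checks passed. The Python libraries from requirements.txt are not covered by this sketch.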
Clone the repository or copy the files to your machine:
git clone https://github.com/your-username/audiowise-cli.git
cd audiowise-cli
Ensure the main script is executable:
chmod +x main.py
The general syntax for AudioWise CLI is:
python main.py -i <input> -o <output> [options]
- -i or --input: Path to the input video or a folder containing video files.
- -o or --output: Path to the output file or folder where the transcription will be saved.
- -m or --model: Whisper model to use. Available options: tiny, base, small, medium, large (default: base).
- -l or --language: Language for grammatical correction (default: pt-BR).
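For readers who want to see how these options map onto Python's standard `argparse` module, here is a minimal sketch. It is an illustration of the documented interface, not the actual contents of main.py; the function name `build_parser` is hypothetical, and treating -i and -o as required is an assumption based on the syntax line above.

```python
import argparse

# Model names listed in the options above
WHISPER_MODELS = ["tiny", "base", "small", "medium", "large"]


def build_parser():
    """Build a parser mirroring the documented AudioWise options."""
    parser = argparse.ArgumentParser(
        prog="main.py",
        description="Extract, transcribe, and correct audio from video files.",
    )
    # Assumption: input and output are mandatory
    parser.add_argument("-i", "--input", required=True,
                        help="input video or folder of videos")
    parser.add_argument("-o", "--output", required=True,
                        help="output file or folder for transcriptions")
    parser.add_argument("-m", "--model", choices=WHISPER_MODELS, default="base",
                        help="Whisper model to use (default: base)")
    parser.add_argument("-l", "--language", default="pt-BR",
                        help="language for grammatical correction (default: pt-BR)")
    return parser


# Parse an explicit argument list instead of sys.argv for demonstration
args = build_parser().parse_args(
    ["-i", "video.mp4", "-o", "transcription.txt", "-m", "small"]
)
print(args.model, args.language)  # prints: small pt-BR
```

Using `choices` makes argparse reject unknown model names with a usage error before any processing starts.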
python main.py -i video.mp4 -o transcription.txt -m small -l pt-BR
python main.py -i ./videos -o ./transcriptions -m base -l en-US
python main.py -i https://example.com/video.mp4 -o transcription.txt -m tiny -l es
- Process logs are displayed directly in the terminal so you can monitor the progress of each step.
- Already processed files are skipped, and a warning is logged.
- When you press Ctrl+C or the system terminates the process, the script automatically cleans up all temporary files created during execution.
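The skip and cleanup behaviors described above could be implemented along the following lines. This is a sketch under stated assumptions, not the project's actual code: it assumes "already processed" means the output file already exists, that temporary files live in a single scratch directory, and all names (`TEMP_DIR`, `should_skip`, `cleanup_temp_files`, `install_cleanup_handlers`) are hypothetical.

```python
import atexit
import logging
import os
import shutil
import signal
import sys
import tempfile

logging.basicConfig(level=logging.INFO, format="%(levelname)s: %(message)s")
log = logging.getLogger("audiowise-sketch")

# Hypothetical scratch directory for intermediate audio files
TEMP_DIR = tempfile.mkdtemp(prefix="audiowise_")


def cleanup_temp_files():
    """Delete the scratch directory; safe to call more than once."""
    shutil.rmtree(TEMP_DIR, ignore_errors=True)


def handle_termination(signum, frame):
    """Clean up, then exit with the conventional 128 + signal number."""
    log.warning("Received signal %d, cleaning up", signum)
    cleanup_temp_files()
    sys.exit(128 + signum)


def should_skip(output_path):
    """Return True and log a warning when the output already exists."""
    # Assumption: an existing output file means the input was processed
    if os.path.exists(output_path):
        log.warning("Skipping %s: already processed", output_path)
        return True
    return False


def install_cleanup_handlers():
    """Call once at startup, from the main thread."""
    atexit.register(cleanup_temp_files)                # normal exit
    signal.signal(signal.SIGINT, handle_termination)   # Ctrl+C
    signal.signal(signal.SIGTERM, handle_termination)  # termination request
```

Note that no handler can run when the process is killed with SIGKILL, so a sketch like this covers Ctrl+C and ordinary termination requests only.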
- Audio Extraction: Converts video audio to mono for better performance.
- Transcription with Whisper: Uses machine learning models to transcribe audio.
- Correction with LanguageTool: Corrects grammar and spelling in the specified language.
- Chunk Processing: Processes audio in chunks to keep memory usage manageable.
- Supports Multiple Formats: Compatible with popular extensions such as .mp4, .mkv, .avi, and more.
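The extraction, transcription, and correction steps listed above can be sketched as follows. This is an illustration rather than AudioWise's own implementation: the function names are hypothetical, the 16 kHz sample rate is an assumption, and the use of the `openai-whisper` and `language_tool_python` packages is assumed from the description (check requirements.txt for the libraries actually used).

```python
import subprocess


def build_extract_command(video_path, audio_path, sample_rate=16000):
    """Build an ffmpeg command that writes a mono audio file."""
    return [
        "ffmpeg",
        "-y",                     # overwrite the output file if it exists
        "-i", video_path,         # input video
        "-vn",                    # drop the video stream
        "-ac", "1",               # downmix to a single (mono) channel
        "-ar", str(sample_rate),  # resample; 16 kHz is an assumption here
        audio_path,
    ]


def extract_audio(video_path, audio_path):
    """Run ffmpeg; raises CalledProcessError if ffmpeg exits with an error."""
    subprocess.run(build_extract_command(video_path, audio_path), check=True)


def transcribe_and_correct(audio_path, model_name="base", language="pt-BR"):
    """Transcribe with Whisper, then correct the text with LanguageTool."""
    # Imported lazily so the helpers above work without these packages
    import whisper               # assumed: the openai-whisper package
    import language_tool_python  # assumed: Python wrapper for LanguageTool

    model = whisper.load_model(model_name)
    text = model.transcribe(audio_path)["text"]

    tool = language_tool_python.LanguageTool(language)
    try:
        return tool.correct(text)
    finally:
        tool.close()  # shut down the LanguageTool server process
```

A typical call sequence would be `extract_audio("video.mp4", "audio.wav")` followed by `transcribe_and_correct("audio.wav", "small", "pt-BR")`. Chunked processing is not shown here because the source does not describe how the chunks are formed.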
Ensure FFmpeg is installed and accessible from the terminal.
Verify all Whisper dependencies are installed correctly. Use:
pip install -r requirements.txt
Verify that the language specified with the -l argument is supported.
Feel free to submit issues and pull requests to the official repository.
This project is licensed under the MIT License.