# Pitch Narration Generator

Parses a structured video script, extracts all `Narrator:` blocks, and synthesises them into a single MP3 using Azure OpenAI TTS.

## How it works

1. Split the script on `Time: ...` section headings
2. Extract each `Narrator:` block
3. Synthesise each block via Azure OpenAI TTS → `output/section_NN.mp3`
4. Concatenate with a short pause between sections → `output/narration.mp3`

## Setup

### 1. Install dependencies

```bash
pip install openai python-dotenv pydub
```

> `pydub` requires **ffmpeg** for MP3 export:
> ```bash
> brew install ffmpeg  # macOS
> ```

### 2. Configure credentials

```bash
cp .env.example .env
```

Fill in your Azure OpenAI TTS resource key and endpoint.

## Usage

```bash
python video_gen.py script.txt
```

Your script file should use this format:

```
Time: 0:00-0:30 | Section Title
(Visual: optional stage direction, ignored)
Narrator: The text that will be spoken aloud.

Time: 0:30-1:00 | Next Section
Narrator: More narration here.
```

## Output

```
output/
├── section_00.mp3   ← per-section audio clips
├── section_01.mp3
│   …
└── narration.mp3    ← final stitched narration
```

## Configuration

| Setting | Where |
|---------|-------|
| TTS voice | `voice=` in `synthesise()`; options: `alloy`, `echo`, `fable`, `nova`, `onyx`, `shimmer` |
| TTS speed | `speed=` in `synthesise()` |
| Pause between sections | `PAUSE_MS` constant |
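
## Parsing sketch

Steps 1 and 2 of "How it works" (split on `Time: ...` headings, then pull out each `Narrator:` block) could look roughly like the sketch below. It is not taken from `video_gen.py`: the names `TIME_HEADING`, `NARRATOR`, and `extract_narration` are hypothetical. It also assumes that each heading sits on its own line and that the narration runs from `Narrator:` to the end of its section.

```python
import re

# Hypothetical names; the real video_gen.py may be organised differently.
# A section heading is any line that starts with "Time:".
TIME_HEADING = re.compile(r"^Time:.*$", re.MULTILINE)
# Assumption: narration runs from "Narrator:" to the end of the section.
NARRATOR = re.compile(r"Narrator:\s*(.*)", re.DOTALL)


def extract_narration(script: str) -> list[str]:
    """Return the narration text of each `Time:` section, in script order."""
    blocks = []
    # split() yields the text before the first heading at index 0; skip it.
    for section in TIME_HEADING.split(script)[1:]:
        match = NARRATOR.search(section)
        if match is None:
            continue  # section has no Narrator: block
        # Collapse line breaks and repeated spaces into single spaces.
        text = " ".join(match.group(1).split())
        if text:
            blocks.append(text)
    return blocks
```

Each returned string would then be handed to the synthesis step. Lines that precede `Narrator:` in a section, such as a `(Visual: ...)` stage direction, are skipped because only the text after `Narrator:` is captured.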