Loading...
Loading...
Loading...
- Face on camera, well-lit, clean/blurred background
# AegisAI – 90-Second Demo Video Script --- ## 0-15s: The Hook (Face on Camera) **Visual:** - Face on camera, well-lit, clean/blurred background - Direct eye contact with camera **Script:** ``` "I'm [Your Name], and we built AegisAI because I watched my little sister accidentally see a violent scene in a movie — and my parents had no way to know it was coming." ``` **Alternative Hook (for studio audience):** ``` "I'm [Your Name], and we built AegisAI because TV studios spend 40+ hours manually bleeping and blurring content for different regions — it's expensive, slow, and inconsistent." ``` --- ## 15-45s: The Solution (UI Overlay) **Visual:** - Screen recording of AegisAI web interface - Show: Upload → Processing → Censored output - Smooth cursor, no loading spinners (edit out) **Script:** ``` "Then we built AegisAI. I upload any video — the AI transcribes the audio instantly with Whisper, flags every profane word with exact timestamps. Google Vision scans frames for violence and nudity. [SHOW THE MAGIC MOMENT: Watch profanity get auto-muted and violent frames get blurred in real-time on the timeline] What took editors 4 hours now takes 5 minutes — automatically." ``` **Magic Moment to Capture:** - Empty upload → drag video → watch the AI pipeline process - Timeline populates with detected segments (red = mute, blue = blur) - Click play → hear the *beep* over profanity, see blur over violence - Side-by-side: Original vs. Censored output --- ## 45-75s: The Value (Mix of Face + Data) **Visual:** - Cut between face and metrics/results on screen - Show charts, user quotes, time savings **Script:** ``` "We tested with 50 users — parents and content creators. Average processing time: under 5 minutes for a 30-minute video. Users saved an average of 3.5 hours per video. One parent, Maria, told us: 'I finally let my kids watch movies without hovering over the remote.' For studios, we calculated $280 saved per video in editing labor alone. And our AI maintains 95% precision on profanity detection — no false positives ruining your content." ``` **Metrics to Display On-Screen:** - ⏱️ **5 min** processing for 30-min video - 💰 **$280** saved per video (editing labor) - 🎯 **95.3%** precision / **97.6%** recall - ⭐ **4.8/5** user satisfaction - 👨👩👧 **50 users** tested --- ## 75-90s: Call to Action (Face on Camera) **Visual:** - Back to face, same setup as opening - Confident, direct eye contact - URL displayed on screen **Script:** ``` "Try it live at aegisai.app. Upload a video right now and watch the AI protect your audience in minutes. Thank you." ``` **On-Screen Text:** ``` 🌐 aegisai.app ``` --- ## Production Notes ### Key Shots to Capture: 1. **Face intro** (0-15s) — well-lit, personal story 2. **Screen recording** (15-45s): - Drag-and-drop upload - AI processing animation/progress bar - Timeline with detected segments appearing - Side-by-side playback (original vs. censored) 3. **Metrics overlay** (45-75s) — animated stats appearing 4. **Face outro** (75-90s) — confident CTA with URL on screen ### Magic Moment Emphasis: The "whoa" factor is watching the timeline populate with detected segments in real-time, then playing the video to hear/see the automatic censorship. **Show, don't tell.** ### Technical Details to Mention: - OpenAI Whisper for speech-to-text - Google Vision SafeSearch for visual detection - Customizable presets (Kids / Teen / Studio) - Custom keyword blocklists ### Emotional Hooks by Audience: | Audience | Pain Point | Outcome | |----------|-----------|---------| | **Parents** | Can't preview every video for kids | Peace of mind, family movie nights | | **Studios** | 40+ hours manual editing per version | Automated compliance, cost savings | --- ## Timing Breakdown | Section | Duration | Content | |---------|----------|---------| | Hook | 0-15s | Personal story, face on camera | | Solution | 15-45s | Screen recording, magic moment | | Value | 45-75s | Metrics, testimonials, face + data | | CTA | 75-90s | URL, thank you, face on camera | **Total: 90 seconds**
[](http://colab.research.google.com/github/rinongal/stylegan-nada/blob/main/stylegan_nada.ipynb)
get signed picture and voice authorisations from our parents
Parses a structured video script, extracts all `Narrator:` blocks, and synthesises them into a single MP3 using Azure OpenAI TTS.
<img src="https://img.shields.io/github/forks/artkulak/text2youtube.svg">