# Contentful Architecture Overview
## System Architecture
Contentful is a distributed video generation pipeline built with a microservices architecture. The system transforms text topics into fully-rendered videos through a series of AI-powered stages.
```mermaid
graph TB
CLI[CLI/API Client] --> ORCH[Orchestrator Service]
ORCH --> DB[(MongoDB)]
ORCH --> CACHE[(Redis)]
ORCH --> ING[Ingestion Stage]
ING --> SCRIPT[Scripting Stage]
SCRIPT --> ASSET[Asset Gathering]
ASSET --> VOICE[Voice Generation]
VOICE --> TIME[Timeline Building]
TIME --> REND[Renderer Service]
ING --> WIKI[Wikipedia API]
ING --> WEB[Web Scraper]
SCRIPT --> LLM[OpenAI GPT-4]
ASSET --> PEXELS[Pexels API]
ASSET --> DALLE[DALL-E 3]
VOICE --> TTS[ElevenLabs]
VOICE --> ASR[Whisper]
REND --> MP4[Video Output]
```
## Core Components
### 1. Orchestrator Service
- **Purpose**: Coordinates the entire pipeline, manages job lifecycle
- **Technology**: Python, FastAPI, AsyncIO
- **Port**: 8000
- **Responsibilities**:
  - Job queue management
  - Pipeline stage coordination
  - Provider initialization
  - Error recovery and retries
  - Progress tracking
### 2. Renderer Service
- **Purpose**: Composes final video from timeline
- **Technology**: Python, MoviePy, FastAPI
- **Port**: 8001
- **Responsibilities**:
  - Timeline to video conversion
  - Scene composition
  - Transition effects
  - Audio mixing
  - Subtitle embedding
### 3. MongoDB Database
- **Purpose**: Persistent storage for jobs and data
- **Collections**:
  - `jobs`: Job metadata and status
  - `storyboards`: Generated scripts
  - `timelines`: Video timelines
  - `assets`: Media asset references
- **Indexes**: `status`, `created_at`, `updated_at`
### 4. Redis Cache
- **Purpose**: Caching and temporary data
- **Use Cases**:
  - API response caching
  - Media asset URLs
  - Provider rate limiting
  - Session management
## Pipeline Stages
### Stage 1: Ingestion
Fetches and processes source content.
**Input**: Topic + Source (Wikipedia/Web)
**Output**: Research bundle (2000+ words)
**Process**:
1. Search for relevant content
2. Extract and clean text
3. Gather citations and metadata
4. Validate word count minimum
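The final validation step can be sketched as a simple word count check. This is a minimal illustration, assuming the research bundle is available as plain text; the names `validate_word_count` and `ResearchBundleTooShort` are hypothetical, and the threshold comes from the "2000+ words" output target.

```python
MIN_WORDS = 2000  # taken from the "2000+ words" output target above


class ResearchBundleTooShort(ValueError):
    """Raised when the cleaned text falls below the minimum word count."""


def validate_word_count(text: str, min_words: int = MIN_WORDS) -> int:
    """Return the word count of `text`, or raise if it is below `min_words`."""
    count = len(text.split())  # whitespace-delimited words
    if count < min_words:
        raise ResearchBundleTooShort(
            f"research bundle has {count} words, need at least {min_words}"
        )
    return count
```

Raising a dedicated exception lets the orchestrator distinguish "not enough source material" from network or provider failures.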
### Stage 2: Scripting
Generates video script using LLM.
**Input**: Research bundle + Template
**Output**: Storyboard with beats
**Process**:
1. Create template-specific prompt
2. Generate structured storyboard
3. Validate beat structure
4. Add visual guidance
### Stage 3: Asset Gathering
Collects media assets for visuals.
**Input**: Storyboard beats
**Output**: Downloaded media files
**Process**:
1. Search for relevant images/videos
2. Score relevance with CLIP
3. Download highest scoring assets
4. Track attribution
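Steps 2 and 3 amount to ranking candidates by relevance and keeping the best ones. The sketch below assumes the relevance scores have already been computed (for example by CLIP) and only shows the selection; `Candidate` and `select_top_assets` are illustrative names, not part of the documented codebase.

```python
from dataclasses import dataclass
import heapq


@dataclass(frozen=True)
class Candidate:
    url: str      # where the asset would be downloaded from
    source: str   # e.g. "pexels", kept so attribution can be tracked
    score: float  # relevance score, assumed to be precomputed


def select_top_assets(candidates: list[Candidate], k: int,
                      min_score: float = 0.0) -> list[Candidate]:
    """Return up to `k` candidates with the highest scores, best first."""
    eligible = [c for c in candidates if c.score >= min_score]
    return heapq.nlargest(k, eligible, key=lambda c: c.score)
```

Keeping `source` on each candidate means attribution data travels with the asset through the rest of the pipeline.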
### Stage 4: Voice Generation
Creates narration audio.
**Input**: Beat narration text
**Output**: Audio files + timestamps
**Process**:
1. Synthesize speech with TTS
2. Generate word-level timestamps
3. Create subtitle files
4. Calculate durations
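To illustrate how word-level timestamps become a subtitle file, here is a small sketch that groups words into SRT cues. It assumes timestamps arrive as `(word, start_seconds, end_seconds)` tuples; that shape, the function names, and the default of seven words per cue are assumptions made for the example.

```python
def format_srt_time(seconds: float) -> str:
    """Format seconds as an SRT timestamp: HH:MM:SS,mmm."""
    total_ms = round(seconds * 1000)
    hours, rem = divmod(total_ms, 3_600_000)
    minutes, rem = divmod(rem, 60_000)
    secs, ms = divmod(rem, 1000)
    return f"{hours:02d}:{minutes:02d}:{secs:02d},{ms:03d}"


def words_to_srt(words: list[tuple[str, float, float]], max_words: int = 7) -> str:
    """Group (word, start, end) tuples into numbered cues of up to `max_words` words."""
    cues = []
    for i in range(0, len(words), max_words):
        chunk = words[i:i + max_words]
        start, end = chunk[0][1], chunk[-1][2]  # cue spans first word start to last word end
        text = " ".join(word for word, _, _ in chunk)
        cues.append(
            f"{len(cues) + 1}\n{format_srt_time(start)} --> {format_srt_time(end)}\n{text}\n"
        )
    return "\n".join(cues)  # blank line between cues, as SRT expects
```

The end time of the last word in a beat also gives that beat's narration duration, which is what step 4 needs.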
### Stage 5: Timeline Building
Constructs video timeline.
**Input**: Assets + Audio + Storyboard
**Output**: Timeline JSON
**Process**:
1. Align audio with visuals
2. Apply Ken Burns effects
3. Add text overlays
4. Set transitions
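Aligning audio with visuals can be pictured as laying scenes end to end, with each scene lasting as long as its narration. The following sketch assumes one scene per beat and a single transition type; the `Scene` fields are hypothetical and do not reflect the actual Timeline JSON schema.

```python
from dataclasses import dataclass


@dataclass
class Scene:
    beat_id: str
    start: float      # seconds from the beginning of the video
    duration: float   # taken from the narration audio length
    transition: str   # transition into the next scene, "none" for the last


def build_timeline(beats: list[tuple[str, float]], transition: str = "fade") -> list[Scene]:
    """Lay scenes end to end so each visual spans exactly its narration audio."""
    scenes, cursor = [], 0.0
    for index, (beat_id, audio_duration) in enumerate(beats):
        is_last = index == len(beats) - 1
        scenes.append(Scene(beat_id, cursor, audio_duration, "none" if is_last else transition))
        cursor += audio_duration
    return scenes
```

With six beats this yields six scenes and five transitions, since the last scene has nothing to transition into.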
### Stage 6: Rendering
Produces final video file.
**Input**: Timeline
**Output**: MP4/WebM video
**Process**:
1. Load media assets
2. Apply effects and transitions
3. Mix audio tracks
4. Encode video
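Taken together, the six stages form a chain in which each stage consumes the previous stage's output. A minimal sketch of that coordination follows, assuming every stage is an async callable taking one input and returning one output; `run_pipeline` and the `on_progress` callback are illustrative, not the orchestrator's real API.

```python
import asyncio
from typing import Any, Awaitable, Callable

Stage = Callable[[Any], Awaitable[Any]]


async def run_pipeline(stages: list[tuple[str, Stage]], payload: Any,
                       on_progress: Callable[[str], None] = lambda name: None) -> Any:
    """Run stages in order, passing each stage's output to the next."""
    for name, stage in stages:
        payload = await stage(payload)
        on_progress(name)  # e.g. update the job's status after each stage
    return payload
```

A caller would pass a list such as `[("ingestion", ingest), ("scripting", script), ...]`, where the stage functions are whatever the orchestrator provides.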
## Data Flow
```yaml
1. User Request:
  Topic: "History of AI"
  Template: Documentary
  Duration: 90 seconds
2. Research Bundle:
  Word Count: 3500
  Sources: 5 Wikipedia articles
  Images: 15 references
3. Storyboard:
  Beats: 6 (intro, 4 body, outro)
  Total Narration: 850 words
  Visual Cues: 18 search queries
4. Assets:
  Images: 12 downloaded
  Videos: 3 clips
  Music: 1 background track
5. Voice Files:
  Narration: 6 MP3 files
  Duration: 88 seconds total
  Subtitles: 6 SRT files
6. Timeline:
  Scenes: 6
  Transitions: 5 fade
  Total Duration: 90 seconds
7. Output:
  Format: MP4
  Resolution: 1920x1080
  Size: ~45MB
```
## Provider Architecture
### Provider Interface
All providers implement a common interface:
```python
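class Provider:
    """Common interface implemented by every provider."""

    name: str  # provider identifier

    async def initialize(self) -> None: ...  # set up clients and credentials
    async def process(self): ...  # perform the provider's work
    async def cleanup(self) -> None: ...  # release resources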
```
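As a rough illustration of how a concrete provider might plug into this interface, the sketch below defines a stand-in provider and a name-based registry. `EchoProvider`, `PROVIDERS`, `run_with_provider`, and the `payload` argument to `process` are all hypothetical; the interface above does not specify what `process` receives.

```python
import asyncio


class Provider:
    """Base interface with the three lifecycle methods: initialize, process, cleanup."""
    name: str = "base"

    async def initialize(self) -> None: ...
    async def process(self, payload): raise NotImplementedError
    async def cleanup(self) -> None: ...


class EchoProvider(Provider):
    """Hypothetical stand-in for a real provider; returns its input unchanged."""
    name = "echo"

    async def initialize(self) -> None:
        self.ready = True

    async def process(self, payload):
        return payload

    async def cleanup(self) -> None:
        self.ready = False


# Registry mapping provider names to classes (a simple factory)
PROVIDERS: dict[str, type[Provider]] = {EchoProvider.name: EchoProvider}


async def run_with_provider(name: str, payload):
    """Look up a provider by name, then initialize, process, and always clean up."""
    provider = PROVIDERS[name]()
    await provider.initialize()
    try:
        return await provider.process(payload)
    finally:
        await provider.cleanup()
```

Because callers only depend on the base interface, swapping one provider for another becomes a registry change rather than a code change in the pipeline stages.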
### Provider Types
**LLM Providers**:
- OpenAI (GPT-4, GPT-4-Vision)
- Claude (future)
- Local LLMs (future)
**TTS Providers**:
- ElevenLabs
- OpenAI TTS (future)
- Azure Speech (future)
**Media Providers**:
- Pexels
- Unsplash (future)
- Pixabay (future)
**ASR Providers**:
- Whisper (local)
- Whisper API (future)
**Image Generation**:
- DALL-E 3
- Midjourney (future)
- Stable Diffusion (future)
## Scalability Design
### Horizontal Scaling
- Stateless services enable horizontal scaling
- Load balancer distributes requests
- MongoDB replica sets for data redundancy
- Redis cluster for cache distribution
### Vertical Scaling
- Async processing maximizes CPU utilization
- Memory-efficient streaming for large files
- GPU acceleration for rendering (optional)
### Performance Optimizations
- Concurrent pipeline stage execution
- Asset download parallelization
- Caching at multiple levels
- Connection pooling for databases
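Asset download parallelization is commonly done with `asyncio.gather` plus a semaphore to cap concurrency. The sketch below shows that pattern under the assumption that a `fetch` coroutine is supplied by the caller (for instance one built on aiohttp); `download_all` and the default limit of five are illustrative.

```python
import asyncio
from typing import Awaitable, Callable


async def download_all(urls: list[str], fetch: Callable[[str], Awaitable[bytes]],
                       max_concurrent: int = 5) -> list:
    """Fetch all URLs concurrently, never running more than `max_concurrent` at once.

    Results come back in the same order as `urls`; a failed download appears
    as the exception object in that position.
    """
    semaphore = asyncio.Semaphore(max_concurrent)

    async def bounded(url: str) -> bytes:
        async with semaphore:
            return await fetch(url)

    # return_exceptions=True keeps one failed download from discarding the rest
    return await asyncio.gather(*(bounded(u) for u in urls), return_exceptions=True)
```

Capping concurrency also helps stay within provider rate limits while still overlapping network waits.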
## Error Handling
### Retry Strategy
```yaml
Retryable Errors:
  - Network timeouts
  - Rate limits
  - Temporary API failures
Retry Policy:
  - Max Attempts: 3
  - Backoff: Exponential
  - Max Delay: 30 seconds
```
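The policy above could be implemented roughly as follows. This is a sketch, not the orchestrator's actual code: `RetryableError`, `with_retries`, and the one-second base delay are assumptions, while the three attempts and 30-second cap come from the policy.

```python
import asyncio


class RetryableError(Exception):
    """Marker for failures worth retrying: timeouts, rate limits, temporary API errors."""


async def with_retries(operation, max_attempts: int = 3, base_delay: float = 1.0,
                       max_delay: float = 30.0, sleep=asyncio.sleep):
    """Call `operation()` up to `max_attempts` times, doubling the wait after each failure."""
    for attempt in range(1, max_attempts + 1):
        try:
            return await operation()
        except RetryableError:
            if attempt == max_attempts:
                raise  # out of attempts: surface the last error
            delay = min(base_delay * 2 ** (attempt - 1), max_delay)
            await sleep(delay)
```

Only `RetryableError` is caught, so non-retryable failures such as validation errors propagate immediately. The `sleep` parameter exists so tests can substitute a no-op.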
### Failure Recovery
- Job state persistence enables resumption
- Partial progress saved at each stage
- Failed jobs can be manually retried
- Automatic cleanup of orphaned resources
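Resumption from persisted state can be sketched as recording the last completed stage on the job and skipping everything up to it on retry. The stage names and the `last_completed_stage` and `outputs` fields below are hypothetical; the real `jobs` schema is not described here.

```python
STAGES = ["ingestion", "scripting", "assets", "voice", "timeline", "rendering"]


def remaining_stages(job: dict) -> list[str]:
    """Return the stages still to run, based on the last stage recorded as completed."""
    last_done = job.get("last_completed_stage")  # hypothetical field on the job document
    if last_done is None:
        return list(STAGES)
    return STAGES[STAGES.index(last_done) + 1:]


def mark_completed(job: dict, stage: str, output) -> None:
    """Record a stage's output so a later retry can skip it."""
    job.setdefault("outputs", {})[stage] = output
    job["last_completed_stage"] = stage
```

In practice the `job` dict would be a document saved to the `jobs` collection after each call to `mark_completed`.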
## Security Architecture
### API Security
- Rate limiting per IP/API key
- Input validation and sanitization
- SQL/NoSQL injection prevention
- XSS protection
### Data Security
- API keys stored in environment variables
- Sensitive data masked in logs
- Secure file upload validation
- Path traversal prevention
### Network Security
- HTTPS enforcement
- CORS configuration
- Request timeout limits
- DDoS protection (Cloudflare)
## Monitoring & Observability
### Health Checks
- `/health` endpoints on all services
- Database connection monitoring
- Redis availability checks
- Disk space monitoring
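One way to back a `/health` endpoint is to run a set of named checks and aggregate the results. The sketch below uses only the standard library; `health_report`, `disk_space_ok`, the 1 GB threshold, and the response shape are assumptions, and the database and Redis checks would be supplied as callables by each service.

```python
import shutil
from typing import Callable


def disk_space_ok(path: str = ".", min_free_bytes: int = 1_000_000_000) -> bool:
    """True when the filesystem holding `path` has at least `min_free_bytes` free."""
    return shutil.disk_usage(path).free >= min_free_bytes


def health_report(checks: dict[str, Callable[[], bool]]) -> dict:
    """Run each named check; a check that raises counts as failed."""
    results = {}
    for name, check in checks.items():
        try:
            results[name] = bool(check())
        except Exception:
            results[name] = False
    status = "healthy" if all(results.values()) else "unhealthy"
    return {"status": status, "checks": results}
```

A service handler could return this dictionary directly as the JSON body of its `/health` response.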
### Metrics
- Job completion rate
- Average processing time
- Error rates by stage
- Resource utilization
### Logging
- Structured JSON logging
- Log levels: DEBUG, INFO, WARN, ERROR
- Centralized log aggregation
- Error tracking with context
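Structured JSON logging can be achieved with a custom formatter on Python's standard `logging` module. This is a minimal sketch; the field names and the convention of passing context through `extra={"context": {...}}` are assumptions rather than the project's actual log schema.

```python
import json
import logging


class JsonFormatter(logging.Formatter):
    """Render each log record as a single JSON object."""

    def format(self, record: logging.LogRecord) -> str:
        entry = {
            "timestamp": self.formatTime(record),
            "level": record.levelname,
            "logger": record.name,
            "message": record.getMessage(),
        }
        # Hypothetical convention: callers pass context via extra={"context": {...}}
        context = getattr(record, "context", None)
        if context:
            entry["context"] = context
        if record.exc_info:
            entry["exception"] = self.formatException(record.exc_info)
        return json.dumps(entry)
```

To use it, attach the formatter to a handler with `handler.setFormatter(JsonFormatter())`. Note that Python's `logging` module names the warning level `WARNING`, so a mapping step would be needed if the output must say `WARN`.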
## Deployment Architecture
### Docker Compose (Development)
```yaml
Services:
  - orchestrator: Port 8000
  - renderer: Port 8001
  - mongodb: Port 27017
  - redis: Port 6379
Networks:
  - contentful_network
Volumes:
  - mongodb_data
  - redis_data
  - media_storage
```
### Kubernetes (Production)
```yaml
Deployments:
  - orchestrator (3 replicas)
  - renderer (2 replicas)
StatefulSets:
  - mongodb (3 replicas)
  - redis (3 replicas)
Services:
  - LoadBalancer for API
  - ClusterIP for internal
Storage:
  - PersistentVolumes for data
  - Object storage for media
```
## Technology Stack
### Backend
- **Language**: Python 3.11+
- **Frameworks**: FastAPI, Pydantic
- **Async**: AsyncIO, aiohttp, aiofiles
- **Video**: MoviePy, FFmpeg
- **Database**: Motor (MongoDB), redis-py
### Infrastructure
- **Containers**: Docker, Docker Compose
- **Orchestration**: Kubernetes (production)
- **CI/CD**: GitHub Actions
- **Monitoring**: Prometheus, Grafana
### AI/ML
- **LLM**: OpenAI GPT-4
- **TTS**: ElevenLabs
- **ASR**: Whisper
- **Vision**: CLIP, GPT-4-Vision
- **Generation**: DALL-E 3
## Design Patterns
### Architectural Patterns
- **Microservices**: Loosely coupled services
- **Pipeline**: Sequential processing stages
- **Repository**: Data access abstraction
- **Factory**: Provider instantiation
### Code Patterns
- **Dependency Injection**: Provider configuration
- **Strategy**: Swappable providers
- **Observer**: Progress notifications
- **Circuit Breaker**: API failure handling
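For readers unfamiliar with the circuit breaker pattern, here is a compact synchronous sketch. After a number of consecutive failures it rejects calls for a cooldown period, then lets a trial call through. The class name, thresholds, and injectable `clock` are assumptions for illustration and not the project's implementation.

```python
import time


class CircuitOpenError(RuntimeError):
    """Raised when calls are rejected because the circuit is open."""


class CircuitBreaker:
    def __init__(self, failure_threshold: int = 3, reset_after: float = 30.0,
                 clock=time.monotonic):
        self.failure_threshold = failure_threshold
        self.reset_after = reset_after
        self.clock = clock
        self.failures = 0
        self.opened_at = None  # None means the circuit is closed

    def call(self, func, *args, **kwargs):
        if self.opened_at is not None:
            if self.clock() - self.opened_at < self.reset_after:
                raise CircuitOpenError("circuit open; skipping call")
            self.opened_at = None  # half-open: allow one trial call through
        try:
            result = func(*args, **kwargs)
        except Exception:
            self.failures += 1
            if self.failures >= self.failure_threshold:
                self.opened_at = self.clock()  # open (or re-open) the circuit
            raise
        self.failures = 0  # any success closes the circuit fully
        return result
```

Wrapping each external API client in its own breaker keeps a failing provider from consuming retry budget across every job.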
## Future Architecture Considerations
### Planned Enhancements
1. **Message Queue**: RabbitMQ/Kafka for job queue
2. **Workflow Engine**: Temporal/Airflow for complex pipelines
3. **CDN**: CloudFront for media delivery
4. **ML Pipeline**: Kubeflow for model serving
5. **Multi-region**: Geographic distribution
### Scalability Roadmap
1. **Phase 1** (current): Single region, Docker Compose
2. **Phase 2**: Kubernetes, horizontal scaling
3. **Phase 3**: Multi-region, CDN integration
4. **Phase 4**: Edge computing, global distribution