Explore Perso AI, the platform for generating AI avatar videos with multilingual support, 1080p export, and chroma key capabilities.

Perso AI is a 3-in-1 AI audio and video platform that combines AI Dubbing, Speech-to-Text, and Audio Separation in a single workflow. It translates and dubs videos into 33+ languages with natural voice cloning and lip dubbing, generates speaker-separated transcripts with automatic speaker diarization in four output formats (XLSX, SRT, VTT, JSON), and isolates individual speaker voices from background audio in two modes (vocals-only or with reactions preserved).

The platform is trusted by 460,000+ users across 80+ countries, is powered by the ElevenLabs voice engine (2025 partnership), and is ISO/IEC 27001 and KISA ISMS certified. It is developed by ESTsoft (est. 1993, KOSDAQ: 047560). Pricing starts at $6.99/month, with up to 98% cost savings vs. traditional dubbing studios.
## How to Use
Sign up at perso.ai and upload any video or audio file. Choose a workflow: AI Dubbing to translate videos into 33+ languages, Lip Dubbing for natural lip-synced output, Speech-to-Text for speaker-separated transcripts and subtitles (XLSX, SRT, VTT, JSON), or Audio Separation to isolate individual speakers and background audio. Edit the auto-generated script in the real-time editor for instant regeneration, then download or export the result. Enterprise users can access all capabilities via the API for batch processing.
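To illustrate what a speaker-separated SRT export looks like, here is a minimal Python sketch that renders diarized segments as SRT cues. The `Segment` shape, its field names, and the `Speaker: text` prefix are assumptions made for illustration. They are not Perso AI's actual JSON schema or export layout, and only the SRT cue layout (index, `HH:MM:SS,mmm --> HH:MM:SS,mmm`, text, blank line) follows the standard format.

```python
from dataclasses import dataclass


@dataclass
class Segment:
    """One diarized transcript segment (hypothetical shape, not Perso AI's schema)."""
    speaker: str
    start: float  # start time in seconds
    end: float    # end time in seconds
    text: str


def srt_timestamp(seconds: float) -> str:
    """Format a time in seconds as an SRT timestamp, HH:MM:SS,mmm."""
    total_ms = round(seconds * 1000)
    hours, rest = divmod(total_ms, 3_600_000)
    minutes, rest = divmod(rest, 60_000)
    secs, ms = divmod(rest, 1000)
    return f"{hours:02d}:{minutes:02d}:{secs:02d},{ms:03d}"


def to_srt(segments: list[Segment]) -> str:
    """Render segments as numbered SRT cues, each prefixed with its speaker label."""
    cues = []
    for index, seg in enumerate(segments, start=1):
        cues.append(
            f"{index}\n"
            f"{srt_timestamp(seg.start)} --> {srt_timestamp(seg.end)}\n"
            f"{seg.speaker}: {seg.text}\n"
        )
    # Joining with a newline leaves one blank line between consecutive cues.
    return "\n".join(cues)


if __name__ == "__main__":
    demo = [
        Segment("Speaker 1", 0.0, 2.5, "Welcome to the onboarding session."),
        Segment("Speaker 2", 2.5, 5.0, "Thanks, glad to be here."),
    ]
    print(to_srt(demo))
```

A script like this would only be needed when post-processing a JSON transcript yourself. The platform already offers SRT as a direct output format.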
## Key Features
- AI Dubbing in 33+ languages with voice cloning
- Lip Dubbing (formerly Lip Sync): natural mouth-movement alignment
- Speech-to-Text with automatic speaker diarization (XLSX, SRT, VTT, JSON output)
- Audio Separation with speaker-level voice isolation
- Dual background separation modes (vocals-only or with reactions)
- Custom track combination and export (single merged file)
- Multi-speaker detection (up to 10 speakers per video)
- Real-time script editor with instant regeneration
- Background music and sound effects preservation
- Enterprise API access + batch video processing
- ISO/IEC 27001 & KISA ISMS certified data security
## Use Cases
- Corporate L&D Teams: Translate training and onboarding videos. Speech-to-Text auto-generates meeting transcripts with speaker diarization.
- YouTube Creators: Dub videos into 33+ languages without hiring voice actors. Reach global audiences with localized content.
- Marketing Agencies: Localize campaign and product-demo videos for international markets while maintaining brand voice.
- Podcast Producers: Use Audio Separation to extract individual speaker voices, remove background music, or merge selected tracks for post-production.
- E-Learning Platforms: Translate lecture videos into regional languages. Auto-generate SRT subtitles for accessibility.
- Media Production: Create dubbed versions of documentary content for international distribution at a fraction of traditional dubbing costs.
BloombergGPT
Automate financial document writing, generate content faster and more accurately with an intuitive user interface.
Mynd
Empower Your Personal Development Using Mynd
Vectorizer.io
Vectorizer.io is an AI-powered tool that quickly converts raster images (JPEG, PNG) to scalable vector graphics (SVG), ensuring high-quality output.
MyFitnessPal
Track meals, log exercise, create personalized meal plans, connect with fitness trackers for accurate progress.
Imperson
Create virtual agents, respond to complaints, and offer personalized customer service to drive loyalty.
Pandorabots
Create interactive bots, leverage NLP & ML, and access 130,000 pre-built bots for quick deployment.