Vakya-AI

Name: Vakya-AI
Author: Yasaswini38

Yasaswini38 August 15, 2025

6 copies 0 downloads

Vākya AI — A real-time AI voice agent that listens, understands and talks back . Powered by FastAPI, WebSockets, AssemblyAI, Gemini, and Murf AI.

Markdown

Vākya AI 🗣️

Vākya, which means "sentence" in Sanskrit, is a conversational AI that allows users to interact with a Large Language Model (LLM) using their voice. Speak. Understand. Reply. That’s Vākya , a full loop of human-like conversation, built with Python, driven by FastAPI, and delivered in a crisp, modern UI.

Wanna know How I look??

Features

Voice-to-Voice Interaction: Talk to the AI and hear it respond back in real time.
Real-time Transcription: See your speech transcribed instantly on screen.
Conversational Memory: Maintains context within a session for natural dialogue.
Session Management: Start new chats or revisit past conversations.
Voice Customization: Choose from multiple voices for responses.
Fun Skills Built-in: Weather updates, News headlines, Jokes on demand.
Modern UI: Cute & aesthetic responsive interface with chat history panel.

Technologies Used

FastAPI — backend framework for REST + WebSocket support
Jinja2 — template rendering for frontend
AssemblyAI — speech-to-text (transcription)
Google Gemini — LLM for text generation
Murf.ai — text-to-speech (streaming natural voices)
python-dotenv — for managing API keys securely
HTML, CSS, JavaScript — responsive UI (with chat history, persona selection)

Architecture

The application follows a simple client-server architecture:

User’s voice recorded via browser microphone
Audio streamed to FastAPI backend over WebSocket
AssemblyAI transcribes speech → text
Transcribed text + history → Gemini (LLM) for response generation
Response text → Murf.ai → natural voice audio stream
Frontend shows transcription, AI’s reply, and plays back

Comments

More Agents

View all

agentic-ai

Agentsmith

Universal, model-agnostic operating harness for AI agents (Claude, Codex, Gemini, …) — a lean core + work-type profiles assembled by one setup script.

PromptPartner

308

agent-skills

Awesome Gamedev Agent Skills

Game-development Agent Skills for AI coding agents: install once and a master router loads the right skill for your engine and task. 66 original, version-pinned skills (plus a master router) in the portable SKILL.md format that runs across Claude Code, Cursor, Codex, Copilot, Gemini CLI and more, for Godot, Unity, Unreal, web and beyond.

gamedev-skills

303

ai-agents

Agentpet

A desktop pet for macOS & Windows that monitors your AI coding agents (Claude Code, Codex, Cursor, Gemini...) in real time, and grows as you code, feed it tokens, level it up, climb the leaderboard.

ntd4996

279

ai-agent

UltraGameStudio

UltraGameStudio - AI coding agent for game development: engine workflows, gameplay code, and asset generation.

wellingfeng

260

Zero

The coding agent that answers to you, your model, your machine, your rules.

Gitlawb

1,099

agent-bridge

Lucarne

Stop babysitting local AI agents. Just notifications, approve, and resume your Codex,Pi,Grok, or Claude code sessions anywhere. 0-Intrusion mobile control bridge via Telegram/微信/飞书. No hooks, no skills, no MCP.

tuchg

314