An interactive AI voice agent that can capture and transcribe speech in real-time, generate intelligent responses using the DeepSeek R1 (7B model) AI, and convert the responses back to natural speech for immediate playback. The agent maintains conversation context and supports cross-platform usage on macOS, Linux, and Windows.
# DeepSeek R1 AI Voice Agent A real-time AI voice assistant powered by DeepSeek R1 that enables seamless voice conversations through speech-to-text transcription, AI response generation, and text-to-speech synthesis. ## 🌟 Overview This project creates an interactive AI voice agent that: - Captures and transcribes speech in real-time using AssemblyAI - Generates intelligent responses using DeepSeek R1 (7B model) via Ollama - Converts AI responses back to natural speech using ElevenLabs - Streams audio responses for immediate playback ## ✨ Features - **Real-time Speech Recognition**: High-quality speech-to-text transcription with AssemblyAI - **Advanced AI Responses**: Powered by DeepSeek R1's reasoning capabilities - **Natural Voice Synthesis**: Professional text-to-speech with ElevenLabs - **Streaming Audio Playback**: Low-latency audio streaming for responsive conversations - **Conversation Memory**: Maintains context throughout the conversation - **Cross-platform Support**: Works on macOS, Linux, and Windows ## 🔧 Prerequisites ### API Keys Required - **AssemblyAI API Key**: [Get your free API key](https://www.assemblyai.com/?utm_source=youtube&utm_medium=referral&utm_campaign=yt_smit_28) - **ElevenLabs API Key**: [Sign up for ElevenLabs](https://elevenlabs.io/) ### System Dependencies #### Install Ollama Download and install Ollama from [ollama.com](https://ollama.com/) #### Install PortAudio **Ubuntu/Debian:** ```bash sudo apt update && sudo apt install portaudio19-dev ``` **macOS:** ```bash brew install portaudio ``` **Windows:** PortAudio is typically included with the Python package installation. #### Install MPV (macOS only) ```bash brew install mpv ``` ## 📦 Installation ### 1. Clone the Repository ```bash git clone https://github.com/danieladdisonorg/DeepSeek-R1-Voice-Agent.git cd DeepSeek-R1-Voice-Agent ``` ### 2. Install Python Dependencies ```bash pip install "assemblyai[extras]" ollama elevenlabs ``` ### 3. Download DeepSeek R1 Model
HAL 分层混合模型工作流 — 强模型(Claude)负责理解/拆解/验收,低成本模型(DeepSeek)负责检索/提取/清洗。Hermes Agent skill。
An LLM agent fine-tuned on DeepSeek for spaced repetition, dynamically integrating knowledge points based on the Ebbinghaus forgetting curve.
基于 STM32F103 构建的端到端 AI 智能手表生态。自研“零重定位”原生机器码动态加载引擎与页面栈式 UI 框架;集成生产级 OTA 回滚保护机制与高带宽(921600 baud)串口协议栈。通过 Node.js 中继实现 DeepSeek AI 语义控制及 ASRPRO 语音全双工交互,是一个集成了分布式计算、现代存储管理与 AI Agent 的嵌入式全栈工程。
A Meta-Agent-Driven Self-Evolving Multi-Agent System for UAV Detection and Tracking
One command to run Hermes AI Agent with a browser UI. Zero prerequisites. 一行命令,AI 就位。
网页应用Agent,接入DeepSeek、Mimo等模型