multi-modal-researcher

Name: multi-modal-researcher
Author: aryangautm

aryangautm July 5, 2025


ScholarAI: a multi-agent AI deep researcher built with LangGraph and Gemini 2.5 models

🎓 ScholarAI: Your Personal AI Research & Podcast Studio

Effortlessly transform any research topic into a comprehensive, cited report and a studio-quality podcast. ScholarAI leverages a sophisticated, event-driven backend and multi-agent workflows powered by LangGraph and Google Gemini 2.5 Pro to automate the entire research-to-content pipeline.

This isn't just a script; it's a production-grade, asynchronous system designed for scalability, resilience, and real-time user feedback.

✨ Key Features

⚡ Instantaneous API Response: The API uses an event-driven architecture with Kafka to accept jobs instantly, providing a non-blocking user experience.
🧠 Multi-Modal Intelligence: Goes beyond text by analyzing YouTube videos to extract deep insights, enriching the final research report.
🌐 Real-time Web Search: Integrates Gemini's native Google Search tool to ground its research in up-to-the-minute, real-world data with citations.
🎙️ Multi-Speaker Podcast Generation: Creates engaging, conversational podcasts with distinct speaker voices using Gemini's advanced TTS capabilities.
🔴 Live Job Tracking: Users receive real-time status updates pushed from the server via WebSockets, from PENDING to COMPLETED.
🔒 Secure, On-Demand Artifacts: Generated reports and podcasts are stored securely in the cloud and accessed via temporary, pre-signed URLs, ensuring only the owner can download them.
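
The non-blocking job flow described in the features above (accept instantly, process in the background, push status updates) can be sketched roughly as follows. This is a minimal in-memory illustration, not ScholarAI's actual code: an `asyncio.Queue` stands in for Kafka, a plain callback stands in for a WebSocket connection, and the names `JobService`, `submit`, `run_demo`, and the intermediate `RUNNING` status are all hypothetical (the README only names PENDING and COMPLETED).

```python
import asyncio
import uuid
from enum import Enum


class JobStatus(str, Enum):
    PENDING = "PENDING"
    RUNNING = "RUNNING"  # hypothetical intermediate state, not named in the README
    COMPLETED = "COMPLETED"


class JobService:
    """In-memory stand-in: the queue plays Kafka, callbacks play WebSockets."""

    def __init__(self):
        self.queue = asyncio.Queue()
        self.status = {}
        self.listeners = {}

    def submit(self, topic, on_update):
        # Register the job, notify PENDING, enqueue, and return right away.
        job_id = str(uuid.uuid4())
        self.listeners[job_id] = [on_update]
        self._set_status(job_id, JobStatus.PENDING)
        self.queue.put_nowait((job_id, topic))
        return job_id

    def _set_status(self, job_id, status):
        # Record the new status and push it to every listener for this job.
        self.status[job_id] = status
        for notify in self.listeners[job_id]:
            notify(job_id, status)

    async def worker(self):
        # Background consumer: pulls jobs off the queue and reports progress.
        while True:
            job_id, topic = await self.queue.get()
            self._set_status(job_id, JobStatus.RUNNING)
            await asyncio.sleep(0)  # placeholder for the research and podcast pipeline
            self._set_status(job_id, JobStatus.COMPLETED)
            self.queue.task_done()


async def run_demo(topic="example topic"):
    service = JobService()
    updates = []
    worker = asyncio.create_task(service.worker())

    service.submit(topic, lambda job_id, status: updates.append(status.value))
    seen_at_accept = list(updates)  # snapshot taken before any work has run

    await service.queue.join()  # wait until the worker has finished the job
    worker.cancel()
    await asyncio.gather(worker, return_exceptions=True)
    return seen_at_accept, updates


if __name__ == "__main__":
    print(asyncio.run(run_demo()))
```

The point of the sketch is the ordering: `submit` returns as soon as the job is queued, and every later status change reaches the client through the listener rather than through the original request. A real deployment would swap the queue for a Kafka producer and consumer and the callback for a WebSocket send.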
