Open-source multi-agent AI debate arena: pit Claude, GPT, Gemini, Ollama & HuggingFace models against each other with frozen-context fairness, evidence-first judging, 20+ personas, code review, and PDF/Markdown reports. CLI + Web UI.
<div align="center">

# ⚔️ AI Colosseum debate

**Multi-Agent Debate Arena: Let AI Models Fight It Out**

*Run the same task through multiple model agents, freeze a shared context bundle,*
*generate independent plans, run an evidence-first debate, and produce a judge-backed verdict.*

[](https://python.org)
[](https://fastapi.tiangolo.com)
[](LICENSE)

**🌐 Language / 언어 / 语言:** **English** · [한국어](README.ko.md) · [中文](README.zh.md)

---

🏛️ **Fair** · 🔍 **Traceable** · 💰 **Cost-Controlled** · 📊 **Evidence-First** · 🔌 **Extensible**

</div>

<br>

## 🎯 Why Colosseum?

> Not just another chatbot UI: Colosseum is a **structured debate platform** designed for real workflows.

| Problem | AI Colosseum debate's Answer |
|---|---|
| "Which model gives a better plan?" | Run them side by side on the **same frozen context** |
| "How do I compare fairly?" | Independent plan generation: no agent sees another's plan first |
| "Debates go in circles forever" | Bounded rounds with **novelty checks**, convergence detection, and budget limits |
| "I can't trace how a decision was made" | Full artifact trail: plans, rounds, judge agendas, adopted arguments, verdicts |
| "I want control over judging" | Choose **automated**, **AI judge**, or **human judge** mode |
| "I need a code review, not just a debate" | Multi-phase **code review** with 6 configurable review phases |
| "I want multiple AI agents to QA my project in parallel" | **QA ensemble mode**: gladiators run in parallel on disjoint GPU slices, then a judge merges their findings into one canonical, deduplicated report |

---

## ✨ Features

<table>
<tr>
<td width="50%" valign="top">

### 🧊 Frozen Context Bundles

Every agent
Google's AI-powered research notebook that ingests your documents and becomes an expert on your content. Generates audio overviews, study guides, FAQs, and interactive discussions from uploaded sources.
Google DeepMind's experimental AI agent that can navigate websites, fill forms, and complete multi-step browser tasks autonomously. Uses Gemini's multimodal understanding to interact with web interfaces.
Google DeepMind's universal AI assistant prototype that can see, hear, and respond in real time through your device camera and microphone. Demonstrates the future of multimodal AI interaction.
Google Cloud's enterprise platform for building, deploying, and managing AI agents powered by Gemini. Supports multi-agent orchestration, tool integration, and enterprise governance.
Gemini's agentic research capability that autonomously browses the web, synthesizes information from dozens of sources, and produces comprehensive research reports on any topic.
Interactive coding and content creation agent that generates, previews, and iterates on code, documents, and interactive applications in a side panel. Supports HTML/CSS/JS, Python, and more.