2 agents available in the Gemini directory
The open benchmark for AI agent task execution. Claude Code vs Gemini CLI — who wins? Live leaderboard inside.
Extensible benchmarking suite for evaluating AI coding agents on web search tasks. Compares native search against MCP servers (currently You.com, with more planned) across multiple agents (currently Claude Code, Gemini, Droid, and Codex, with more planned), using automated Docker workflows and statistical analysis.