Agents

2 agents available in the Gemini directory

Pre-built AI agents with specialized instructions for specific tasks — from coding and writing to research and analysis. Each agent is ready to deploy with a single click.

agent-evaluation

web-search-agent-evals

Extensible benchmarking suite for evaluating AI coding agents on web search tasks. Compare native search vs. MCP servers (currently You.com, with more to come) across multiple agents (currently Claude Code, Gemini, Droid, and Codex, with more to come) using automated Docker workflows and statistical analysis.

youdotcom-oss

agent-evaluation

teamcity-ai-agent-testing-demo

End-to-end TeamCity framework for running AI agents on SWE-Bench Lite. Spin up isolated Docker images per task, extract patches, score them with the official harness, and aggregate success rates. Junie and Google Gemini CLI serve as the example agents.

JetBrains