TypeScript AI SDK for building AI agents with any LLM — OpenAI, Claude, Gemini & more. Open-source alternative to Claude Agent SDK.
# Open Agent SDK

**Open-source TypeScript SDK for building AI agents**: a provider-agnostic, modular alternative to the [Claude Agent SDK](https://platform.claude.com/docs/en/agent-sdk/typescript.md). Built on the [Vercel AI SDK](https://sdk.vercel.ai/) (v6+), so it works with **any LLM**: Anthropic Claude, OpenAI GPT, Google Gemini, Mistral, and more.

> **Why Open Agent SDK?** The official Claude Agent SDK is powerful but vendor-locked and opaque. Open Agent SDK gives you the same agent loop capabilities (tool use, multi-step reasoning, sub-agents, context compaction) with full control, any LLM provider, and an install-only-what-you-need package architecture.

## Features

- **Provider-agnostic**: swap between Claude, GPT-4, Gemini, or any Vercel AI SDK provider
- **Modular packages**: use only what you need; no mandatory cloud dependencies
- **Sandbox-agnostic**: run locally, on E2B, on Vercel Firecracker, or bring your own
- **Full TypeScript**: strict types, Zod schemas, generic type parameters throughout
- **Agent loop built-in**: step management, budget tracking, context compaction, stop conditions
- **Sub-agent support**: spawn isolated agents for parallel or delegated work
- **Skills system**: composable behavior modules via the [Agent Skills](https://github.com/anthropics/agent-skills) standard
- **Tool caching**: LRU cache wrapper for any tool, out of the box

## Architecture

```
open-agent-sdk/
├── packages/
│   ├── core/            # Agent loop, types, caching, utilities
│   ├── sandbox-local/   # Local filesystem + shell sandbox
│   ├── sandbox-e2b/     # E2B cloud sandbox
│   ├── sandbox-vercel/  # Vercel Firecracker sandbox
│   ├── cli/             # `oa`: standalone CLI coding agent
│   ├── tools/           # Standard agent tools (Bash, Read, Write, Edit, Glob, Grep, …)
│   ├── tools-web/       # Web tools (WebSearch, WebFetch) via parallel-web
│   └── skills/          # Agent Skills standard (discovery, parsin
```
Google's AI-powered research notebook that ingests your documents and becomes an expert on your content. Generates audio overviews, study guides, FAQs, and interactive discussions from uploaded sources.
Google DeepMind's experimental AI agent that can navigate websites, fill forms, and complete multi-step browser tasks autonomously. Uses Gemini's multimodal understanding to interact with web interfaces.
Google DeepMind's universal AI assistant prototype that can see, hear, and respond in real-time through your device camera and microphone. Demonstrates the future of multimodal AI interaction.
Google Cloud's enterprise platform for building, deploying, and managing AI agents powered by Gemini. Supports multi-agent orchestration, tool integration, and enterprise governance.
Gemini's agentic research capability that autonomously browses the web, synthesizes information from dozens of sources, and produces comprehensive research reports on any topic.
Interactive coding and content creation agent that generates, previews, and iterates on code, documents, and interactive applications in a side panel. Supports HTML/CSS/JS, Python, and more.