# GenAI Processors Library 📚
[License](LICENSE)
[PyPI](https://pypi.org/project/genai-processors/)
[Documentation](https://google-gemini.github.io/genai-processors/)
**Build Modular, Asynchronous, and Composable AI Pipelines for Generative AI.**
GenAI Processors is a lightweight Python library that enables efficient,
parallel content processing. It addresses the fragmentation of LLM APIs through
three core pillars:
1. **Unified Content Model**: A single, consistent representation for inputs
and outputs across models, agents, and tools.
2. **Processors**: Simple, composable Python classes that transform content
streams using native `asyncio`.
3. **Streaming**: Asynchronous streaming capabilities built-in by default,
without added plumbing complexity.
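
As a rough illustration of the second and third pillars, the sketch below chains plain `asyncio` async generators into a pipeline. It uses only the standard library: `Stage`, `source`, `upper`, `exclaim`, `chain`, and `collect` are hypothetical names invented for this example, and plain strings stand in for the unified content model. It is not the library's API.

```python
import asyncio
from typing import AsyncIterable, AsyncIterator, Callable

# Hypothetical type for this sketch: a stage maps one async stream to another.
Stage = Callable[[AsyncIterable[str]], AsyncIterator[str]]


async def source(items: list[str]) -> AsyncIterator[str]:
    """Emit items one at a time, yielding control between them."""
    for item in items:
        await asyncio.sleep(0)
        yield item


async def upper(parts: AsyncIterable[str]) -> AsyncIterator[str]:
    """Upper-case each part as soon as it arrives."""
    async for part in parts:
        yield part.upper()


async def exclaim(parts: AsyncIterable[str]) -> AsyncIterator[str]:
    """Append an exclamation mark to each part."""
    async for part in parts:
        yield part + "!"


def chain(*stages: Stage) -> Stage:
    """Compose stages left to right into a single stage."""

    def run(parts: AsyncIterable[str]) -> AsyncIterator[str]:
        for stage in stages:
            parts = stage(parts)
        return parts

    return run


async def collect(parts: AsyncIterable[str]) -> list[str]:
    """Drain a stream into a list."""
    return [part async for part in parts]


pipeline = chain(upper, exclaim)
result = asyncio.run(collect(pipeline(source(["hello", "world"]))))
print(result)  # ['HELLO!', 'WORLD!']
```

Because every stage is an async generator, each part flows through the whole chain as soon as it is produced, and no stage has to buffer the full input.
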
At the ecosystem's core lies the `Processor`, which encapsulates a unit of work.
Through a "dual-interface" pattern, it handles the complexity of asynchronous,
multimodal data streaming while exposing a simple API to developers:
```python
from typing import AsyncIterable

from genai_processors import content_api
from genai_processors import processor


class EchoProcessor(processor.Processor):

    # The PRODUCER interface (for the processor author):
    # Takes a robust ProcessorStream as input, and yields part types.
    async def call(
        self, content: content_api.ProcessorStream
    ) -> AsyncIterable[content_api.ProcessorPartTypes]:
        # Process content as it streams in!
        async for part in content:
            yield part
```
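
One way to picture the dual-interface pattern without the library is the standard-library sketch below. The author writes `call` as an async generator, while `__call__` normalizes loosely typed input and returns a stream that can be either awaited as a whole or iterated part by part. `Stream`, `SketchProcessor`, `Echo`, and `_normalize` are hypothetical stand-ins made up for this example, with strings in place of real content parts; the actual `ProcessorStream` and `Processor` classes may work differently.

```python
import asyncio
from typing import AsyncIterable, AsyncIterator, Iterable, Union

# Hypothetical input type: a single string, an iterable, or an async iterable.
Input = Union[str, Iterable[str], AsyncIterable[str]]


class Stream:
    """Wraps an async iterator so it can be iterated or awaited in full."""

    def __init__(self, parts: AsyncIterator[str]):
        self._parts = parts

    def __aiter__(self) -> AsyncIterator[str]:
        return self._parts

    async def _gather(self) -> list[str]:
        return [part async for part in self._parts]

    def __await__(self):
        # Awaiting the stream drains it and returns all parts as a list.
        return self._gather().__await__()


async def _normalize(content: Input) -> AsyncIterator[str]:
    """Turn any accepted input shape into an async stream of parts."""
    if isinstance(content, str):
        yield content
    elif hasattr(content, "__aiter__"):
        async for part in content:
            yield part
    else:
        for part in content:
            yield part


class SketchProcessor:
    def call(self, content: Stream) -> AsyncIterator[str]:
        """Producer side: subclasses override this with an async generator."""
        raise NotImplementedError

    def __call__(self, content: Input) -> Stream:
        """Consumer side: accept forgiving input, return a Stream."""
        return Stream(self.call(Stream(_normalize(content))))


class Echo(SketchProcessor):
    async def call(self, content: Stream) -> AsyncIterator[str]:
        async for part in content:
            yield part


async def main() -> tuple[list[str], list[str]]:
    echo = Echo()
    everything = await echo(["a", "b"])                # await the whole output
    streamed = [part async for part in echo("hello")]  # or stream it
    return everything, streamed


everything, streamed = asyncio.run(main())
print(everything, streamed)  # ['a', 'b'] ['hello']
```

The design choice this illustrates is that input normalization lives in one place (`__call__`), so each `call` implementation only ever sees a uniform stream.
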
Applying a `Processor` is just as straightforward. The CONSUMER interface
accepts wide, forgiving input types and returns a powerful stream that can be
awaited entirely or streamed chunk-by-chunk:
```python
# The CONSUMER interface (for the caller):
# Provide input effortless...
```