A modular AI agent framework with secure CLI and toolkit architecture
# Proto-Agent
An educational AI agent framework demonstrating capability-based security and modular toolkit architecture. Built for learning secure AI agent patterns with human oversight and permission controls.
## Features
- **Capability-based security** with granular permission controls
- **CLI tool** with human-in-the-loop approval for dangerous operations
- **Python framework** for building custom agents with programmatic control
- **Modular toolkits** for file operations, system monitoring, and version control
- **Educational focus** - clear, readable code demonstrating AI agent security patterns
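
To make the idea of capability-based security concrete, here is a minimal sketch of how granted capabilities can gate which tools an agent may call. It is not proto-agent's actual implementation: the `Capability` enum, the `REQUIRED` mapping, and the tool names other than `run_python_file` are hypothetical and exist only for illustration.

```python
from enum import Flag, auto


class Capability(Flag):
    """Hypothetical capability flags; not proto-agent's real API."""
    NONE = 0
    READ = auto()
    WRITE = auto()
    EXECUTE = auto()


# Hypothetical mapping from tool name to the capability it requires.
REQUIRED = {
    "get_file_content": Capability.READ,
    "write_file": Capability.WRITE,
    "run_python_file": Capability.EXECUTE,
}


def allowed_tools(granted: Capability) -> list[str]:
    """Return only the tools whose required capability was granted."""
    return [name for name, needed in REQUIRED.items() if needed in granted]


def call_tool(name: str, granted: Capability) -> str:
    """Refuse any call whose capability is missing, even if the model requests it."""
    needed = REQUIRED[name]
    if needed not in granted:
        raise PermissionError(f"{name} requires {needed.name}")
    return f"{name} would run here"
```

In this sketch the check happens at call time as well as at listing time, so a tool that was never advertised to the model still cannot run without the matching capability.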
## Quick Start
### Installation
```bash
pip install proto-agent
# Or, to make the CLI available from anywhere:
uv tool install proto-agent  # Recommended
# Or with pipx:
pipx install proto-agent
```
### Configuration
```bash
proto-agent --help  # Show CLI options, including the config path for your OS
# Example (Linux): ~/.config/proto-agent/ contains the .env file and config.toml
```
For model configuration, see the [LiteLLM documentation](https://docs.litellm.ai/docs/providers) to find the exact name of the model you want to use.
### CLI Usage
```bash
# Safe read-only analysis
proto-agent "Analyze this codebase structure" ./my_project --read-only
# Interactive execution with approval prompts
proto-agent "Run the test suite" ./my_project
# Prompts: "Allow execution of function 'run_python_file'? (y/N):"
```
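
The approval prompt shown above follows a common human-in-the-loop pattern: ask before a dangerous call and treat anything other than an explicit yes as a refusal. The sketch below illustrates that pattern under stated assumptions; `approve` and `guarded_call` are hypothetical helpers, not functions exported by proto-agent.

```python
from typing import Callable


def approve(function_name: str, ask: Callable[[str], str] = input) -> bool:
    """Ask a human before a dangerous call; anything but 'y' or 'yes' means no."""
    answer = ask(f"Allow execution of function '{function_name}'? (y/N): ")
    return answer.strip().lower() in {"y", "yes"}


def guarded_call(
    function_name: str,
    action: Callable[[], str],
    ask: Callable[[str], str] = input,
) -> str:
    """Run `action` only after approval; otherwise report the refusal."""
    if not approve(function_name, ask):
        return f"Denied: {function_name} was not executed."
    return action()
```

Passing `ask` as a parameter keeps the default interactive behavior while letting tests or scripts supply a canned answer instead of reading from the terminal.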
### Framework Usage
```python
from proto_agent import Agent, AgentConfig
from proto_agent.tool_kits import FileOperationToolkit
# Autonomous mode - no human approval needed
agent = Agent(AgentConfig(
    api_key="your_api_key",
    working_directory="./my_project",
    tools=[FileOperationToolkit(
        enable_read=True,
        enable_write=False,  # Disable risky operations
        enable_execute=False
    ).tool]
))
response = agent.generate_content("Analyze this project's structure")
print(response.text)
```