The GitHub Action for Promptfoo. Test your prompts, agents, and RAGs. AI red teaming, pentesting, and vulnerability scanning for LLMs. Compare performance of GPT, Claude, Gemini, Llama, and more. Simple declarative configs with command line and CI/CD integration.
# GitHub Action for LLM Prompt Evaluation

This GitHub Action uses [promptfoo](https://www.promptfoo.dev) to produce a before/after view of edited prompts. When you change a prompt, an eval is automatically posted on the pull request:

<img width="650" alt="pull request llm eval" src="https://github.com/typpo/promptfoo-action/assets/310310/ec75fb39-c6b1-4395-9e41-6d66a7bf8657"/>

The provided link opens the promptfoo web viewer, which lets you interactively explore the before vs. after:

<img width="650" alt="promptfoo web viewer" src="https://github.com/typpo/promptfoo-action/assets/310310/d0ef0497-0c1a-4886-b115-1ee92680891b"/>

## Supported Events

This action supports multiple GitHub event types:

- **Pull Request** (`pull_request`, `pull_request_target`) - Compares changes between base and head branches
- **Push** (`push`) - Compares changes between commits *(requires v1.1.0+)*
- **Manual Trigger** (`workflow_dispatch`) - Allows manual evaluation with custom inputs *(requires v1.1.0+)*

> **Note:** Version v1.0.0 only supports `pull_request` events. To use `push` or `workflow_dispatch` events, please use `@v1` (which now points to v1.1.0+) or explicitly use `@v1.1.0`.

## Configuration

The action can be configured using the following inputs:

| Parameter | Description | Required |
| --- | --- | --- |
| `config` | The path to the configuration file. This file contains settings for the action. | Yes |
| `github-token` | The GitHub token. Used to authenticate requests to the GitHub API.
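
As a rough illustration of how the two inputs listed above might be wired into a workflow, here is a minimal sketch. The workflow file name, the `prompts/` directory, the config path, the `promptfoo/promptfoo-action` repository slug, and the `actions/checkout@v4` step are assumptions made for this example and are not taken from the text above. The input table is truncated, so other inputs (for example provider credentials) may also be needed in practice.

```yaml
# Hypothetical file: .github/workflows/prompt-eval.yml
name: Prompt evaluation

on:
  pull_request:
    paths:
      - 'prompts/**'      # assumed location of the prompt files
  workflow_dispatch:      # manual trigger; per the note above, needs v1.1.0+

jobs:
  evaluate:
    runs-on: ubuntu-latest
    permissions:
      contents: read
      pull-requests: write   # needed so the eval can be posted on the pull request
    steps:
      # Check out the repository so the action can see the changed prompts
      - uses: actions/checkout@v4

      # Repository slug is an assumption; @v1 is the tag named in the note above
      - uses: promptfoo/promptfoo-action@v1
        with:
          config: 'prompts/promptfooconfig.yaml'      # assumed path to the config file
          github-token: ${{ secrets.GITHUB_TOKEN }}   # token used for GitHub API requests
```

Pinning to `@v1` follows the note above and picks up the `push` and `workflow_dispatch` support added in v1.1.0. Pin to an exact version instead if you need reproducible runs.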
Google's AI-powered research notebook that ingests your documents and becomes an expert on your content. Generates audio overviews, study guides, FAQs, and interactive discussions from uploaded sources.
Google DeepMind's experimental AI agent that can navigate websites, fill forms, and complete multi-step browser tasks autonomously. Uses Gemini's multimodal understanding to interact with web interfaces.
Google DeepMind's universal AI assistant prototype that can see, hear, and respond in real time through your device camera and microphone. Demonstrates the future of multimodal AI interaction.
Google Cloud's enterprise platform for building, deploying, and managing AI agents powered by Gemini. Supports multi-agent orchestration, tool integration, and enterprise governance.
Gemini's agentic research capability that autonomously browses the web, synthesizes information from dozens of sources, and produces comprehensive research reports on any topic.
Interactive coding and content creation agent that generates, previews, and iterates on code, documents, and interactive applications in a side panel. Supports HTML/CSS/JS, Python, and more.