Multi-modal Generative Media Skills for AI Agents (Claude Code, Cursor, Gemini CLI). High-quality image, video, and audio generation powered by muapi.ai.
# 🎭 Generative Media Skills for AI Agents

**The Ultimate Multimodal Toolset for Claude Code, Cursor, and Gemini CLI.**

A high-performance, schema-driven architecture for AI agents to generate, edit, and display professional-grade images, videos, and audio, powered by the [muapi-cli](https://github.com/SamurAIGPT/muapi-cli).

[🚀 Get Started](#-quick-start) | [🎨 Expert Library](#-expert-library) | [⚙️ Core Primitives](#-core-primitives) | [🤖 MCP Server](#-mcp-server) | [📖 Reference](#-schema-reference)

---

## ✨ Key Features

- **🤖 Agent-Native Design**: CLI-powered scripts with structured JSON outputs, semantic exit codes, and `--jq` filtering for seamless agentic pipelines.
- **🧠 Expert Knowledge Layer**: Domain-specific skills that bake in professional cinematography, atomic design, and branding logic.
- **⚡ CLI-Powered Core**: All primitives delegate to [`muapi-cli`](https://www.npmjs.com/package/muapi-cli), with no curl, no JSON parsing, and no boilerplate.
- **🖼️ Direct Media Display**: Use the `--view` flag to automatically download and open generated media in your system viewer.
- **📁 Local File Support**: Auto-upload images, videos, faces, and audio from your local machine to the CDN for processing.
- **🌈 100+ AI Models**: One-click access to **Midjourney v7, Flux Kontext, Seedance 2.0, Kling 3.0, Veo3**, and more.
- **🔌 MCP Server**: Run `muapi mcp serve` to expose all 19 tools directly to Claude Desktop, Cursor, or any MCP-compatible agent.

---

## 🏗️ Scalable Architecture

This repository uses a **Core/Library** split to ensure efficiency and high-signal discovery for LLMs:

### ⚙️ Core Primitives (`/core`)

Thin wrappers around [`muapi-cli`](https://github.com/SamurAIGPT/muapi-cli) for raw API access.

- `core/media/`: File upload
- `core/edit/`: Image editing (prompt-based)
- `core/platform/`: Setup, auth & result polling

### 📚 Expert Library (`/library`)

High-value skills that translate creative intent into technical directives.

- **C
Agent that generates comprehensive documentation, API references, architecture diagrams, and developer onboarding guides from existing code.
Agent configuration for systematic bug investigation that traces issues from error logs through the codebase to root cause with suggested fixes.
Agent for integrating third-party APIs including SDK setup, type generation, error handling, retry logic, and rate limit management.
Cursor's built-in autonomous coding agent that can make multi-file edits, run terminal commands, search the codebase, and iteratively build features with minimal human intervention.
Cloud-based autonomous coding agent that runs in the background on remote sandboxed environments, handling complex multi-step tasks while you continue working.
Cursor's multi-file editing agent within Composer mode that can create, edit, and delete files across your entire project in a single conversation.