skillprobe — Cursor Agents | Neura Market
    skillprobe

    Anyesh March 31, 2026

    Automated end-to-end testing for AI agent skills (agentskills.io). Launches Claude Code and Cursor as subprocesses, runs scenarios in real workspaces, and asserts what the model actually does.

    Agent Definition
    # skillprobe
    
    [![PyPI version](https://img.shields.io/pypi/v/skillprobe.svg)](https://pypi.org/project/skillprobe/)
    [![Python versions](https://img.shields.io/pypi/pyversions/skillprobe.svg)](https://pypi.org/project/skillprobe/)
    [![Tests](https://img.shields.io/github/actions/workflow/status/Anyesh/skillprobe/test.yml?branch=main&label=tests)](https://github.com/Anyesh/skillprobe/actions/workflows/test.yml)
    [![License](https://img.shields.io/pypi/l/skillprobe.svg)](https://github.com/Anyesh/skillprobe/blob/main/LICENSE)
    
    Release notes: see [CHANGELOG.md](CHANGELOG.md) or the [GitHub Releases page](https://github.com/Anyesh/skillprobe/releases).
    
    ![skillprobe demo](demo/skillprobe-demo.gif)
    
    Automated testing for LLM skills. Launches Claude Code or Cursor as subprocesses, runs scenarios in isolated workspaces, and reports what passed and what didn't.
    
    Skills are just text injected into the LLM context, and LLMs are probabilistic, so skills will be ignored some percentage of the time no matter how carefully you word them. If you want hard enforcement, hooks are the right tool since they run deterministically every time. But hooks can only check things after the fact (linting, file restrictions, blocked commands). They can't guide the model toward better architectural decisions, teach it your team's domain conventions, set the tone of code review feedback, or help it reason through a multi-step workflow. Skills handle that side, and skillprobe measures how reliably they do it.
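To make "measures how reliably" concrete, here is a minimal sketch of the general idea: run a command in a fresh workspace several times and report the fraction of runs that pass a check. This is not skillprobe's API or implementation. The names `run_once` and `pass_rate` are hypothetical, and the "agent" is a stand-in Python one-liner rather than Claude Code or Cursor.

```python
import subprocess
import sys
import tempfile
from pathlib import Path


def run_once(command, setup_files, check, timeout=60):
    """Run `command` in a fresh temporary workspace and apply `check` to the result."""
    with tempfile.TemporaryDirectory() as workspace:
        root = Path(workspace)
        # Seed the workspace with the scenario's starting files.
        for name, content in setup_files.items():
            (root / name).write_text(content)
        # Launch the (stand-in) agent as a subprocess inside the workspace.
        result = subprocess.run(
            command, cwd=root, capture_output=True, text=True, timeout=timeout
        )
        # The check inspects what actually happened: files on disk, process output.
        return bool(check(root, result))


def pass_rate(command, setup_files, check, runs=5):
    """Repeat the scenario and return the fraction of runs whose check passed."""
    passed = sum(run_once(command, setup_files, check) for _ in range(runs))
    return passed / runs


if __name__ == "__main__":
    # Hypothetical stand-in for an agent CLI: a one-liner that writes a file.
    fake_agent = [sys.executable, "-c", "open('out.txt', 'w').write('ok')"]
    rate = pass_rate(
        fake_agent,
        {"README.md": "demo"},
        lambda root, _result: (root / "out.txt").exists(),
        runs=3,
    )
    print(f"pass rate: {rate:.0%}")
```

With a real, probabilistic agent the rate would typically fall somewhere between 0 and 1, which is why a single run says little about whether a skill is being followed.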
    
    ## When you need this
    
    If you write a few personal skills and tweak them by feel, you probably don't need this. That loop is fast and good enough for individual use.
    
    Where it breaks down:
    
    - **Model updates break skills silently.** Anthropic ships a new Sonnet, Cursor updates their agent, and a skill that worked last week now produces different output. Nobody notices because nobody retested.
    - **Teams sharing skills.** When 20 engineers share a "code review" skill, one person's gut check isn't repre

    Tags: agentskills, agentskills-io, ai-skills, ci-cd, claude-code, claude-code-skills, claude-skills, cursor, developer-tools, llm-skills
