How I Built a Cross-Tool Memory and Skill System for AI-Assisted Development

10 min read | Intermediate

I use four AI coding tools daily: Claude Code, Cursor, Codex CLI, and occasionally Gemini/Antigravity. Each one is good at different things. Claude Code handles complex multi-file refactors. Cursor is fast for inline edits. Codex runs background tasks. Gemini brings a different perspective.

But here's the problem: each tool starts every conversation from zero. It doesn't know my preferences, my project architecture, or the lessons I learned last week in a different tool. I end up repeating myself constantly — "we switched email providers last sprint", "always create a branch before editing", "use pnpm not npm".

So I built a system that gives all four tools a shared brain. They share memory (what I've learned), skills (how to do things), and rules (what to always/never do) — without any tool knowing about the others directly.

This post walks through the exact setup. Everything here is running in production on my projects right now.

The Architecture

                     Knowledge Graph (MCP)
                     localhost:8765
                    ┌─────────────────┐
                    │  Entities        │
                    │  Observations    │
                    │  Relations       │◄──── All 4 tools read/write
                    └────────┬────────┘
                             │
          ┌──────────────────┼──────────────────┐
          │                  │                  │
     Claude Code          Cursor            Codex CLI
   ┌─────────────┐    ┌─────────────┐    ┌─────────────┐
   │ CLAUDE.md   │    │ .cursorrules│    │ AGENTS.md   │
   │ skills/     │    │ .cursor/    │    │ .codex/     │
   │ commands/   │    │   rules/    │    │             │
   │ agents/     │    │             │    │             │
   │ memory/     │    │             │    │             │
   └─────────────┘    └─────────────┘    └─────────────┘
          │                  │                  │
          └──────────────────┼──────────────────┘
                             │
                    Skills MCP Server
                    (universal template)
                    ┌─────────────────┐
                    │  51 skills      │
                    │  13 commands    │
                    │  17 agents      │
                    └─────────────────┘

Three layers:

Knowledge Graph — shared memory across all tools (user preferences, project context, cross-project learnings)
Universal Template — reusable skills, commands, and agents available to any project
Per-Tool Config — tool-specific instructions (CLAUDE.md, .cursorrules, AGENTS.md)

Layer 1: The Knowledge Graph (Shared Memory)

The foundation is a knowledge graph running as an MCP (Model Context Protocol) server on localhost:8765. Every AI tool connects to it. It stores three types of data:

Entities — things that exist: projects, people, tools, concepts.

Observations — facts attached to entities: "prefers pnpm over npm", "Project A uses Supabase + Resend", "Project B required ISO 13482 compliance".

Relations — connections between entities: "Project A uses Supabase", "User works-on Project C".
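To make the three types concrete, here is a minimal in-memory sketch in Python. It is not how the memory server stores its data. The class and method names (Entity, Relation, KnowledgeGraph, create_entity, add_observation, create_relation) are illustrative assumptions that loosely mirror the tool names used in this post.

```python
from dataclasses import dataclass, field


@dataclass
class Entity:
    """A node in the graph: a project, person, tool, or concept."""
    name: str
    entity_type: str
    observations: list[str] = field(default_factory=list)


@dataclass(frozen=True)
class Relation:
    """A directed, labelled edge between two entities."""
    source: str
    relation_type: str
    target: str


@dataclass
class KnowledgeGraph:
    entities: dict[str, Entity] = field(default_factory=dict)
    relations: set[Relation] = field(default_factory=set)

    def create_entity(self, name: str, entity_type: str) -> Entity:
        # Reuse an existing entity instead of overwriting its observations.
        return self.entities.setdefault(name, Entity(name, entity_type))

    def add_observation(self, name: str, fact: str) -> None:
        entity = self.entities[name]  # raises KeyError for unknown entities
        if fact not in entity.observations:
            entity.observations.append(fact)

    def create_relation(self, source: str, relation_type: str, target: str) -> None:
        self.relations.add(Relation(source, relation_type, target))


# Hypothetical data, taken from the examples in this section.
graph = KnowledgeGraph()
graph.create_entity("User", "person")
graph.create_entity("Project A", "project")
graph.create_entity("Supabase", "tool")
graph.add_observation("User", "prefers pnpm over npm")
graph.add_observation("Project A", "uses Supabase + Resend")
graph.create_relation("Project A", "uses", "Supabase")
```

Observations live on the entity they describe, while relations are stored separately, so a fact about one project never has to be duplicated on the tool it mentions.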

Why Not Just Files?

File-based memory (like Claude Code's ~/.claude/projects/<project>/memory/) is project-scoped and tool-scoped. It works well for "remember this for next time in this project with this tool." But it can't share context across projects or tools.

The knowledge graph solves both problems:

Cross-project: Working on auth in Project B? Search the graph for "authentication" and find patterns you established in Project A.
Cross-tool: Fix a bug in Cursor, and the graph remembers the root cause. Next time Claude Code encounters something similar, it finds the insight.

The Read Strategy: Scoped, Not Full Dump

The most important lesson I learned: never load the entire graph. As it grows, dumping everything into context wastes tokens on irrelevant data.

Instead, I use scoped searches at two moments:

Session start — two targeted searches:

search_nodes("user preferences workflow")  → load personal profile
search_nodes("<current project name>")     → load current project context

Mid-conversation — on-demand when I need cross-project context:

search_nodes("auth authentication")  → find patterns from other projects
search_nodes("ROS2 robotics")        → pull relevant robotics knowledge
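A rough sketch of what scoped reads buy you, assuming a toy graph held in a Python dict. The search_nodes function here is a stand-in that matches any query term as a substring; the real server's search may behave differently. The project names and the session_start_context helper are made up for illustration.

```python
def search_nodes(graph: dict[str, list[str]], query: str) -> dict[str, list[str]]:
    """Return entities whose name or observations contain any query term."""
    terms = query.lower().split()
    hits = {}
    for name, observations in graph.items():
        haystack = " ".join([name, *observations]).lower()
        if any(term in haystack for term in terms):
            hits[name] = observations
    return hits


# Toy graph: entity name -> observations (all names are hypothetical).
graph = {
    "user": ["preferences: use pnpm not npm", "workflow: create a branch before editing"],
    "portfolio-site": ["auth via Supabase", "email via Resend"],
    "robot-arm": ["ROS2 robotics stack"],
    "shop-api": ["payments service"],
}


def session_start_context(graph: dict[str, list[str]], project_name: str) -> dict[str, list[str]]:
    """Run the two session-start searches and merge the results."""
    context = search_nodes(graph, "user preferences workflow")
    context.update(search_nodes(graph, project_name))
    return context


context = session_start_context(graph, "portfolio-site")
print(sorted(context))  # only the profile and the current project are loaded
print(sorted(search_nodes(graph, "auth authentication")))  # on-demand lookup
```

With four entities the saving is trivial, but the loaded context stays the same size however many unrelated projects the graph accumulates.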

The Write Strategy: Continuous Mining

I configured all tools to continuously mine conversations for durable knowledge and write it to the graph immediately, rather than batching it for later.

Two categories of triggers:

Artifact triggers — when something is created:

New skill created → create_entities with name, purpose, key rules
Architecture decision made → add_observations to project entity
New tool/library adopted → create_entities + create_relations

Conversation triggers — when something is said:

User corrects approach → add_observations to user entity with preference
Debugging insight reveals pattern → add_observations
Cross-project learning → add_observations to both project entities

What NOT to save: ephemeral task details, in-progress debugging, things derivable from code or git history.
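One way to picture these rules is as a lookup table from event kind to graph operations. The sketch below is hypothetical: the event kind names, the Event class, and the plan_writes helper are invented for illustration. The real behaviour comes from the written instructions each tool is given, not from code like this.

```python
from dataclasses import dataclass

# Event kinds worth persisting, mapped to the graph operations listed above.
WRITE_PLAN = {
    "skill_created": ["create_entities"],
    "architecture_decision": ["add_observations"],
    "tool_adopted": ["create_entities", "create_relations"],
    "user_correction": ["add_observations"],
    "debugging_pattern": ["add_observations"],
    "cross_project_learning": ["add_observations"],
}

# Event kinds that should never reach the graph.
SKIP = {"ephemeral_task_detail", "in_progress_debugging", "derivable_from_code"}


@dataclass
class Event:
    kind: str
    summary: str


def plan_writes(event: Event) -> list[str]:
    """Return the graph operations to run for an event, or [] to skip it."""
    if event.kind in SKIP:
        return []
    # Unknown kinds are skipped too: when in doubt, do not write.
    return list(WRITE_PLAN.get(event.kind, []))


print(plan_writes(Event("tool_adopted", "switched email provider to Resend")))
print(plan_writes(Event("in_progress_debugging", "trying a different port")))
```

Defaulting unknown kinds to "no write" keeps the graph small, which is what makes the scoped reads above cheap.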

Layer 2: The Universal Template (Shared Skills)

I maintain a universal template at ~/universal-claude-template/ with 51 skills, 13 commands, and 17 agents. Any project can access these via the Skills MCP server.

What's a Skill vs. a Command vs. an Agent?

Skills are deep procedural knowledge — multi-step workflows with detailed instructions. Examples:

write-blog — full blog creation pipeline (research → outline → write → translate → publish)
create-prd — product requirements document generation
debugging — systematic debugging methodology
financial-analysis — financial modeling and analysis
competitive-research — market research synthesis

Commands are user-invocable shortcuts (/command):

/commit — conventional commit workflow
/explore — deep codebase exploration (read-only)
/fix-issue — end-to-end GitHub issue resolution
/deploy-check — pre-deployment validation

Agents are specialized sub-agents for parallel work:

researcher — codebase/domain exploration (never writes code)
code-reviewer — catches bugs and security issues
architect — design decisions and trade-off analysis
test-runner — runs tests and fixes failures

How Skills Load Across Tools

The Skills MCP server makes all skills available to any connected tool. When Claude Code needs a workflow:

search_skills("blog")       → finds write-blog skill
read_skill("write-blog")    → loads the full SKILL.md

Then it follows the skill's instructions as if it were a local file.
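Conceptually, these two calls behave like a search over a folder of SKILL.md files followed by a file read. A minimal sketch, assuming a skills/<name>/SKILL.md layout and plain substring matching. Both are assumptions for illustration; the actual Skills MCP server may index and match differently.

```python
import tempfile
from pathlib import Path


def search_skills(template_dir: Path, query: str) -> list[str]:
    """Return names of skills whose folder name or SKILL.md mentions the query."""
    needle = query.lower()
    matches = []
    for skill_file in sorted(template_dir.glob("skills/*/SKILL.md")):
        name = skill_file.parent.name
        text = skill_file.read_text(encoding="utf-8").lower()
        if needle in name.lower() or needle in text:
            matches.append(name)
    return matches


def read_skill(template_dir: Path, name: str) -> str:
    """Load the full SKILL.md for one skill."""
    return (template_dir / "skills" / name / "SKILL.md").read_text(encoding="utf-8")


# Build a throwaway template with two hypothetical skills.
template = Path(tempfile.mkdtemp())
sample_skills = {
    "write-blog": "# Write Blog\nresearch, outline, write, translate, publish\n",
    "debugging": "# Debugging\nreproduce, isolate, fix, verify\n",
}
for name, body in sample_skills.items():
    skill_dir = template / "skills" / name
    skill_dir.mkdir(parents=True)
    (skill_dir / "SKILL.md").write_text(body, encoding="utf-8")

found = search_skills(template, "blog")
print(found)
print(read_skill(template, found[0]))
```

Splitting search from read means only the one skill that is actually needed gets loaded into context.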

Skill Sync: Keeping the Template Current

When a new generalizable skill is created in any project:

diff_skills("/path/to/project")                          → see what's new
sync_skill("skill-name", "to_template", "/path/to/project") → copy to template

This keeps the universal template growing from real project work, not hypothetical planning.
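The sync step amounts to a set difference between two skill folders plus a directory copy. The sketch below assumes both the project and the template keep skills under skills/<name>/SKILL.md. Unlike the calls above, it takes the template path as an explicit argument, and the "to_project" direction is my addition for symmetry. None of this is the server's actual implementation.

```python
import shutil
import tempfile
from pathlib import Path


def skill_names(root: Path) -> set[str]:
    return {p.parent.name for p in root.glob("skills/*/SKILL.md")}


def diff_skills(project: Path, template: Path) -> dict[str, list[str]]:
    """List skills that exist on only one side."""
    in_project, in_template = skill_names(project), skill_names(template)
    return {
        "only_in_project": sorted(in_project - in_template),
        "only_in_template": sorted(in_template - in_project),
    }


def sync_skill(name: str, direction: str, project: Path, template: Path) -> Path:
    """Copy one skill folder between project and template."""
    if direction == "to_template":
        src, dst = project, template
    elif direction == "to_project":
        src, dst = template, project
    else:
        raise ValueError(f"unknown direction: {direction}")
    target = dst / "skills" / name
    shutil.copytree(src / "skills" / name, target, dirs_exist_ok=True)
    return target


# Throwaway project and template, each with one hypothetical skill.
root = Path(tempfile.mkdtemp())
project, template = root / "project", root / "template"
for base, name in [(project, "resend-migration"), (template, "debugging")]:
    skill_dir = base / "skills" / name
    skill_dir.mkdir(parents=True)
    (skill_dir / "SKILL.md").write_text(f"# {name}\n", encoding="utf-8")

before = diff_skills(project, template)
sync_skill("resend-migration", "to_template", project, template)
after = diff_skills(project, template)
print(before["only_in_project"], "->", after["only_in_project"])
```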

Layer 3: Per-Tool Configuration

Each tool gets its own configuration format, but they all reference the same knowledge graph and skills.

Claude Code: CLAUDE.md + Rules + Memory

~/.claude/
├── CLAUDE.md              ← Global instructions (knowledge graph + skills MCP setup)
├── settings.json          ← MCP server configs, permissions
└── projects/
    └── <project>/
        ├── CLAUDE.md      ← Project-specific instructions
        └── memory/        ← File-based memory (project-scoped)
            ├── MEMORY.md  ← Index of memory files
            ├── user_profile.md
            └── feedback_*.md

The global CLAUDE.md tells Claude Code how to use the knowledge graph and skills MCP. Project-level CLAUDE.md files add project-specific context (stack, commands, gotchas).

Claude Code also has .claude/rules/ — always-loaded constraint files:

quality.md — definition of done, naming conventions
architecture.md — separation of concerns, error handling
git-workflow.md — branching strategy, commit format
security.md — secrets handling, input validation
testing.md — TDD workflow, mocking rules

Cursor: .cursorrules + .cursor/rules/

Cursor uses .cursorrules (project root) and .cursor/rules/ (MDC format). I mirror the same rules from the universal template, adapted for Cursor's syntax.

Codex CLI: AGENTS.md

OpenAI's Codex CLI reads AGENTS.md for its agent instructions. Same content, different format.

The Dual Memory Pattern

Claude Code has a unique advantage: it supports both file-based memory AND the knowledge graph. I use both intentionally:

File-based memory (~/.claude/projects/<project>/memory/) for:

Claude Code-specific context that other tools don't need
Project-scoped feedback and preferences
Quick recall within the same tool

Knowledge graph (localhost:8765) for:

Cross-project patterns and learnings
User preferences all tools should know
Architecture decisions and their rationale
Tool/library evaluations

The rule: if only Claude Code needs it, use file memory. If any other tool should know, use the graph.
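Written as code, the rule is a one-line decision. This is only a sketch of the reasoning; the Store enum, the choose_store helper, and the tool identifiers are hypothetical names, not part of any tool's API.

```python
from enum import Enum


class Store(Enum):
    FILE_MEMORY = "file memory"
    KNOWLEDGE_GRAPH = "knowledge graph"


def choose_store(needed_by: set[str]) -> Store:
    """File memory only when Claude Code is the sole consumer."""
    if needed_by == {"claude-code"}:
        return Store.FILE_MEMORY
    return Store.KNOWLEDGE_GRAPH


print(choose_store({"claude-code"}).value)
print(choose_store({"claude-code", "cursor"}).value)
```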

Domain Packs: Specializing Per Project

The universal template includes domain packs that add domain-specific skills:

bash ~/universal-claude-template/setup.sh web       # Web dev skills
bash ~/universal-claude-template/setup.sh robotics  # ROS2, hardware, sim2real
bash ~/universal-claude-template/setup.sh creative  # Design, brand, content
bash ~/universal-claude-template/setup.sh finance   # Financial modeling
bash ~/universal-claude-template/setup.sh research  # Academic methodology

Each pack adds relevant rules, skills, and context without bloating projects that don't need them.

What This Actually Looks Like in Practice

Here's a real workflow:

Start a session in Claude Code on a portfolio project. It auto-loads project memory and searches the graph for recent context.
I say "replace the email provider with Resend". Claude Code reads the existing code, designs the migration, implements it across backend + frontend + Docker + CI. It uses the deployment skill for Docker patterns and database-patterns for the Supabase schema.
Mid-implementation, it discovers a Node 18 compatibility issue with crypto.randomUUID(). It writes this to the knowledge graph as a cross-project learning: "crypto.randomUUID() is not a global in Node 18 — use require('crypto').randomUUID()."
Later, in Cursor, I'm working on a different project that also runs Node 18. Cursor searches the graph, finds the crypto.randomUUID observation, and avoids the same mistake.
I switch to Codex CLI for a background task. It reads from the same graph, knows my preferences, and follows the same commit conventions.

No context was lost. No preferences were repeated. Each tool contributed to and benefited from the shared brain.

Setting This Up Yourself

Prerequisites

Claude Code installed
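An MCP-compatible memory server (I use @anthropic/memory-mcp; the tool names in this post, such as search_nodes and create_entities, match those of the reference @modelcontextprotocol/server-memory package)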
Optional: Cursor, Codex CLI

Step 1: Set Up the Knowledge Graph

# Install the memory MCP server
npm install -g @anthropic/memory-mcp

# Add to Claude Code
claude mcp add memory -- npx @anthropic/memory-mcp --port 8765

Step 2: Clone the Universal Template

git clone https://github.com/your-username/universal-claude-template ~/universal-claude-template

Or build your own. Start with 5-10 skills that match your actual workflow, then grow organically.

Step 3: Configure Global CLAUDE.md

Create ~/.claude/CLAUDE.md with instructions for the knowledge graph (how to read, when to write) and the skills MCP (how to search and load).

Step 4: Set Up Per-Project Config

cd your-project
bash ~/universal-claude-template/setup.sh web  # or robotics, creative, etc.

This copies rules, creates the CLAUDE.md template, and links skills.
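For readers building their own template, the three actions can be sketched in Python as follows. This is a guess at the shape of such a script, not the contents of setup.sh: the packs/<pack>/rules layout, the setup_project name, and the generated CLAUDE.md text are all assumptions.

```python
import shutil
import tempfile
from pathlib import Path


def setup_project(template: Path, project: Path, pack: str) -> None:
    claude_dir = project / ".claude"

    # 1. Copy the pack's rule files (creates .claude/rules if needed).
    shutil.copytree(template / "packs" / pack / "rules",
                    claude_dir / "rules", dirs_exist_ok=True)

    # 2. Create a starter CLAUDE.md, but never overwrite an existing one.
    claude_md = project / "CLAUDE.md"
    if not claude_md.exists():
        claude_md.write_text(f"# {project.name}\n\nDomain pack: {pack}\n",
                             encoding="utf-8")

    # 3. Link (rather than copy) skills so template updates show up everywhere.
    link = claude_dir / "skills"
    if not link.exists():
        link.symlink_to(template / "skills", target_is_directory=True)


# Throwaway template and project to exercise the function.
root = Path(tempfile.mkdtemp())
template, project = root / "template", root / "my-app"
(template / "packs" / "web" / "rules").mkdir(parents=True)
(template / "packs" / "web" / "rules" / "quality.md").write_text(
    "# Definition of done\n", encoding="utf-8")
(template / "skills" / "debugging").mkdir(parents=True)
(template / "skills" / "debugging" / "SKILL.md").write_text(
    "# Debugging\n", encoding="utf-8")
project.mkdir()

setup_project(template, project, "web")
print(sorted(p.name for p in (project / ".claude").iterdir()))
```

Linking skills instead of copying them is a design choice: copies drift apart, while a link means every project sees the current template.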

Step 5: Connect Other Tools

Add the memory MCP server to Cursor and Codex CLI using their respective config formats. They'll share the same knowledge graph.
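As a starting point, the sketch below generates the two config fragments from a single server definition so they cannot drift apart. It assumes Cursor reads an mcpServers object from .cursor/mcp.json and that Codex CLI reads [mcp_servers.<name>] tables from ~/.codex/config.toml. Check both against each tool's current documentation. The command and arguments simply mirror the ones used in Step 1.

```python
import json

# One definition, reused for every tool (mirrors the Step 1 command).
SERVER_NAME = "memory"
COMMAND = "npx"
ARGS = ["@anthropic/memory-mcp", "--port", "8765"]


def cursor_mcp_json(name: str, command: str, args: list[str]) -> str:
    """JSON fragment in the shape assumed for .cursor/mcp.json."""
    config = {"mcpServers": {name: {"command": command, "args": args}}}
    return json.dumps(config, indent=2)


def codex_config_toml(name: str, command: str, args: list[str]) -> str:
    """TOML fragment in the shape assumed for ~/.codex/config.toml."""
    # json.dumps output is valid TOML for plain strings and string arrays.
    return (
        f"[mcp_servers.{name}]\n"
        f"command = {json.dumps(command)}\n"
        f"args = {json.dumps(args)}\n"
    )


cursor_config = cursor_mcp_json(SERVER_NAME, COMMAND, ARGS)
codex_config = codex_config_toml(SERVER_NAME, COMMAND, ARGS)
print(cursor_config)
print(codex_config)
```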

Lessons Learned

Start small. Don't create 50 skills on day one. Create them as you need them, from real work. My template grew from 5 skills to 51 over three months.

Scope your reads. Loading the entire graph into context can be worse than having no graph, because irrelevant entries crowd out the ones that matter. Always search for specific topics.

Let tools teach each other. The most valuable graph entries come from debugging sessions — insights that prevent the same mistake in a different context.

Don't fight the tool. Each AI tool has different strengths. Claude Code is best for complex refactors. Cursor for quick edits. Codex for background jobs. The shared brain lets you use the right tool without losing context.

Skills are better than prompts. In my experience, a well-written SKILL.md file with clear steps, constraints, and examples outperforms a one-off prompt, however carefully written. Prompts are ephemeral. Skills are durable.

I use this exact setup across all my projects. If you want help setting up a cross-tool AI workflow for your team, let's talk.

Originally published at padawanabhi.de
