
RAG debugging is harder than I expected

Yuji Ito April 20, 2026



---
title: RAG debugging is harder than I expected
published: true
description:
tags: rag, vectordatabase, pinecone, qdrant
# cover_image: https://direct_url_to_image.jpg
# Use a ratio of 100:42 for best results.
# published_at: 2026-04-20 14:01 +0000
---

I've started building a vector database to learn modern vector search for the AI era.

In my professional work, I maintain Jepsen/Antithesis tests for distributed databases and blockchain systems. These tests check system correctness through transactional behaviors under real-world failures.

When working on a vector database, I started wondering: what does "correctness" even mean in vector search?

By definition, approximate nearest neighbor (ANN) results don't have to match the results of an exact search. Some level of approximation is acceptable.

In RAG systems, there are evaluation methods, but most of them focus on the final LLM output. When something goes wrong, it's hard to tell:

- was it the retrieval?
- the prompt?
- or the model itself?

I wanted to isolate the retrieval layer and understand what actually changed.

I changed the embedding model, but I couldn't clearly tell what changed in the retrieval results. Some queries looked fine. Some felt off. But I had no systematic way to understand the differences.

So instead of trying to judge correctness, I focused on something simpler: what actually changed?

I built a small tool to diff retrieval results.

https://github.com/yito88/traceowl

It captures, compares, and explains differences in VectorDB search results so you can quickly understand what changed and where to focus your review.

![TraceOwl report example](https://dev-to-uploads.s3.amazonaws.com/uploads/articles/1uas0ordi6l5p9c5gk57.png)

If you're working on RAG or vector search, I'd love to hear how you evaluate changes in your system.
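To make the idea of diffing retrieval results a bit more concrete, here is a minimal sketch in Python. It is not how TraceOwl works internally; the names `RetrievalDiff`, `diff_topk`, and `recall_at_k` are hypothetical and exist only for this illustration. It assumes each search returns a ranked list of unique document IDs for the same query, once before and once after a change such as swapping the embedding model.

```python
from __future__ import annotations

from dataclasses import dataclass


@dataclass
class RetrievalDiff:
    """Hypothetical container for the difference between two ranked result lists."""
    added: list[str]                          # IDs only in the "after" results
    removed: list[str]                        # IDs only in the "before" results
    rank_changes: dict[str, tuple[int, int]]  # ID -> (old rank, new rank), 0-based
    jaccard: float                            # set overlap, ignoring order


def diff_topk(before: list[str], after: list[str]) -> RetrievalDiff:
    """Compare two ranked lists of document IDs returned for the same query.

    Assumes IDs are unique within each list.
    """
    before_rank = {doc_id: i for i, doc_id in enumerate(before)}
    after_rank = {doc_id: i for i, doc_id in enumerate(after)}
    before_set, after_set = set(before_rank), set(after_rank)

    added = [d for d in after if d not in before_set]
    removed = [d for d in before if d not in after_set]
    # Documents present in both lists whose position moved.
    rank_changes = {
        d: (before_rank[d], after_rank[d])
        for d in before
        if d in after_set and before_rank[d] != after_rank[d]
    }
    union = before_set | after_set
    # Two empty result lists are treated as identical.
    jaccard = len(before_set & after_set) / len(union) if union else 1.0
    return RetrievalDiff(added, removed, rank_changes, jaccard)


def recall_at_k(approx: list[str], exact: list[str], k: int) -> float:
    """Fraction of the exact top-k that also appears in the approximate top-k."""
    exact_top = set(exact[:k])
    if not exact_top:
        return 1.0
    return len(exact_top & set(approx[:k])) / len(exact_top)
```

For example, `diff_topk(["a", "b", "c", "d"], ["b", "a", "e", "c"])` reports `e` as added, `d` as removed, and rank changes for `a`, `b`, and `c`. A diff like this deliberately says nothing about which result list is better; it only shows what changed, which is the question I wanted to answer first. `recall_at_k` covers the separate case of comparing ANN results against an exact search over the same data.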


