**Document status:** Draft (spec-driven, implementation-driving)
# PRD: GNO - Local Knowledge Index and Retrieval (CLI + MCP)
**Last updated:** 2025-12-23
**Working product name:** GNO (from “gnosis”)
**Binary / command (default):** `gno`
**Virtual URI scheme (default):** `gno://`
**Primary interfaces:** CLI + MCP server (stdio)
**Implementation stack:** Bun + TypeScript (ESM), SQLite (FTS5), sqlite-vec, node-llama-cpp
**Hard constraint:** No Python dependency anywhere in the toolchain (conversion and inference included)
---
## 1. Summary
GNO is a local-first knowledge indexing and retrieval system that:
- Ingests Markdown and common document formats (DOCX, PPTX, XLSX, PDF, TXT, and more over time).
- Converts non-Markdown files into deterministic Markdown mirrors for indexing, search, and retrieval.
- Provides high retrieval quality via:
  - BM25 keyword search (SQLite FTS5)
  - vector semantic search (sqlite-vec)
  - hybrid search with structured query expansion, parallel retrieval, fusion, reranking, and explainability
- Runs fully on-device:
  - No external services required
  - No separate LLM runtime required (no Ollama)
  - Local inference via node-llama-cpp with GGUF models resolved and cached automatically
- Exposes capabilities through:
  - a world-class CLI for humans and scripting
  - an MCP server so agent runtimes can query and read local knowledge via tools and `gno://` resources
- Optimizes for first-run success (MVP):
  - `gno init` creates config and index DB (migrations) and can register content roots
  - `gno index` performs ingestion plus embedding in one command (wrapper over `update` plus `embed`)
  - `gno ask` is the human-default query UX (citations-first; optional short grounded synthesis)
  - target: first useful result in 3 commands or fewer (`init` -> `index` -> `ask`)
---
## 2. Naming and configurability
### 2.1 Defaults (MVP)
- Tool name: GNO
- CLI: `gno`
- URI scheme: `gno://`
- Directories (defaults, OS-idiomatic; overridable via env vars and `--config` / `--index`):
  - Linux (XDG):
    - Config dir: `${XDG_CONFIG_HOME:-~/.config}/gno/`
    - Data dir (SQLite DB): `${XDG_DATA_HOME:-~/.local/share}/gno/`
    - Cache dir (models, temp artifacts): `${XDG_CACHE_HOME:-~/.cache}/gno/`
  - macOS:
    - Config dir: `~/Library/Application Support/gno/config/`
    - Data dir (SQLite DB): `~/Library/Application Support/gno/data/`
    - Cache dir (models, temp artifacts): `~/Library/Caches/gno/`
  - Windows:
    - Config dir: `%APPDATA%\gno\config\`
    - Data dir (SQLite DB): `%LOCALAPPDATA%\gno\data\`
    - Cache dir (models, temp artifacts): `%LOCALAPPDATA%\gno\cache\`
### 2.2 Renaming is cheap by design
All user-visible identifiers must be centrally configurable in `src/app/constants.ts` (or equivalent), including:
- CLI name
- URI scheme
- config/data/cache directory names
- MCP server name
- MCP tool namespace prefix
### 2.3 Directory resolution rules (MVP)
Resolution precedence:
1. Environment overrides (recommended for tests/CI):
   - `GNO_CONFIG_DIR`
   - `GNO_DATA_DIR`
   - `GNO_CACHE_DIR`
2. Platform defaults from §2.1 (XDG vs Library vs AppData)

Index DB path rule:
- DB file is stored under the resolved data dir as:
  - `<dataDir>/index-<indexName>.sqlite`
- `--index <indexName>` selects the file suffix; default index name is `default`.
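
The precedence and path rule above can be sketched as follows. This is a minimal illustration, not the required implementation: `resolveDirs`, `indexDbPath`, and the `Dirs`/`Env` types are hypothetical names, and the Windows fallback used when `APPDATA`/`LOCALAPPDATA` are unset is an assumption that this PRD does not specify.

```typescript
import { join, posix, win32 } from "node:path";

export type Dirs = { configDir: string; dataDir: string; cacheDir: string };
export type Env = Record<string, string | undefined>;

// Platform defaults from §2.1. `home` is the user's home directory.
function platformDefaults(platform: string, env: Env, home: string): Dirs {
  if (platform === "darwin") {
    const appSupport = posix.join(home, "Library", "Application Support", "gno");
    return {
      configDir: posix.join(appSupport, "config"),
      dataDir: posix.join(appSupport, "data"),
      cacheDir: posix.join(home, "Library", "Caches", "gno"),
    };
  }
  if (platform === "win32") {
    // Assumption: conventional AppData layout when the variables are unset.
    const roaming = env.APPDATA || win32.join(home, "AppData", "Roaming");
    const local = env.LOCALAPPDATA || win32.join(home, "AppData", "Local");
    return {
      configDir: win32.join(roaming, "gno", "config"),
      dataDir: win32.join(local, "gno", "data"),
      cacheDir: win32.join(local, "gno", "cache"),
    };
  }
  // Linux (XDG): an unset or empty variable falls back to the default.
  return {
    configDir: posix.join(env.XDG_CONFIG_HOME || posix.join(home, ".config"), "gno"),
    dataDir: posix.join(env.XDG_DATA_HOME || posix.join(home, ".local", "share"), "gno"),
    cacheDir: posix.join(env.XDG_CACHE_HOME || posix.join(home, ".cache"), "gno"),
  };
}

// Environment overrides win over platform defaults.
export function resolveDirs(env: Env, platform: string, home: string): Dirs {
  const defaults = platformDefaults(platform, env, home);
  return {
    configDir: env.GNO_CONFIG_DIR || defaults.configDir,
    dataDir: env.GNO_DATA_DIR || defaults.dataDir,
    cacheDir: env.GNO_CACHE_DIR || defaults.cacheDir,
  };
}

// `<dataDir>/index-<indexName>.sqlite`, with `default` as the default index name.
export function indexDbPath(dataDir: string, indexName = "default"): string {
  return join(dataDir, `index-${indexName}.sqlite`);
}
```

Passing `env`, `platform`, and `home` explicitly (rather than reading `process.env` inside the function) keeps the resolution logic pure, which suits the tests/CI use case mentioned above.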
---
## 3. Goals and non-goals
### 3.1 Goals (MVP)
1. Local-only indexing and search over multiple collections (directories + glob rules).
2. Deterministic conversion to Markdown mirrors suitable for:
   - stable indexing
   - stable embeddings (given same converter version + model)
   - golden conversion fixtures
3. Hybrid retrieval pipeline:
   - structured query expansion (lexical + semantic + optional HyDE)
   - parallel BM25 and vector retrieval
   - fusion via RRF (+ optional top-rank bonus)
   - cross-encoder reranking
   - position-aware blending to keep results stable
4. Zero extra runtime for inference:
   - local embeddings, generation, reranking via node-llama-cpp
   - GGUF models auto-resolved and cached
5. Agent-first interfaces:
   - CLI structured outputs (JSON, files line protocol, CSV, MD, XML)
   - MCP tools/resources with explicit schemas and predictable behavior
6. World-class engineering:
   - spec-driven development
   - layered architecture (ports/adapters)
   - high unit and integration coverage
   - deterministic test fixtures
   - eval harness and regression gates for stochastic or ranking-sensitive behavior (Evalite v1)
7. World-class first-run UX:
   - a user can go from zero to a useful result in 3 commands or fewer (`init` -> `index` -> `ask`)
   - defaults work without manual configuration edits
   - missing optional components (models, vectors) degrade gracefully with actionable diagnostics
   - scripted onboarding is supported via `--yes` (no prompts; safe defaults only)
8. Multilingual-first retrieval:
   - storage and retrieval preserve language signals end-to-end (converter -> chunks -> ranking -> outputs)
   - default model preset is multilingual and robust across DE/FR/IT/EN
   - lexical search remains deterministic, with tokenizer configured per index (DB), and opt-in stemming only where appropriate
### 3.2 Non-goals (MVP)
- GUI application
- Background watch daemon (explicit `gno update` only)
- Cloud indexing, hosted mode, or remote retrieval endpoints (other than MCP over stdio)
- Multi-user collaborative indices
- Editing documents through the tool (read-only retrieval)
- OCR/audio transcription/image understanding (best-effort only if a converter happens to extract it; not required)
---
## 4. Key concepts and invariants
### 4.1 Collection
A named set of files defined by:
- root directory path
- glob pattern(s)
- optional include and exclude rules
- optional update command to run before indexing
- optional contexts (global, per-collection, per-path prefix)
### 4.2 Source file vs Markdown mirror
- Source file: original artifact on disk (example: `contracts/nda.docx`)
- Markdown mirror: deterministic canonical Markdown derived from the source (example: `nda.docx` converted to Markdown)
Invariant:
- Search and retrieval operate on the mirror text.
- Identity, references, and URIs always refer to the source file.
### 4.3 Virtual URI scheme
Stable, transport-friendly identifier:
`gno://<collection>/<relativePath>`
Examples:
- `gno://work/contracts/nda.docx`
- `gno://notes/meetings/2025-11-12.md`
### 4.4 docid
A short handle for quick reference:
- Format: `#<6-8 hex>`
- Derived from `source_hash` (sha256 of source bytes), truncated
- Not stable across edits (edits change source_hash), but stable across converter upgrades if the source bytes are unchanged
Example:
- `#a1b2c3`
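
As a rough sketch of the derivation above, assuming Node's built-in `node:crypto` (also available under Bun): `sha256Hex` and `docidFromSourceHash` are hypothetical helper names, and the default length of 8 is one choice within the stated 6-8 range.

```typescript
import { createHash } from "node:crypto";

// source_hash: sha256 of the raw source bytes, as lowercase hex.
export function sha256Hex(bytes: Uint8Array): string {
  return createHash("sha256").update(bytes).digest("hex");
}

// docid: "#" plus the first 6-8 hex characters of source_hash.
export function docidFromSourceHash(sourceHash: string, length = 8): string {
  if (length < 6 || length > 8) {
    throw new RangeError("docid length must be between 6 and 8 hex characters");
  }
  return `#${sourceHash.slice(0, length)}`;
}
```

Because the input is only the source bytes, a converter upgrade leaves the docid unchanged, while any edit to the file produces a new one.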
### 4.5 Context
Optional human-authored metadata associated with:
- global scope (`/`)
- a collection root (`collection:`)
- a path prefix within a collection (`gno://collection/prefix`)
Context can be used to:
- display richer search results
- optionally improve embeddings/reranking by prepending context to LLM inputs (configurable)
---
## 5. Users and primary workflows
### 5.1 Personas
- Knowledge workers with mixed-format archives (MD + Office + PDFs)
- Engineers building local RAG for personal or team docs
- Agent workflows needing safe, local retrieval (MCP integration)
### 5.2 Golden workflows
Index (fast path, recommended):
- `gno init ~/notes --name notes --pattern "**/*.md"`
- `gno init ~/work/docs --name work --pattern "**/*.{md,pdf,docx,pptx,xlsx,txt}"`
- `gno index`
Index (explicit steps):
- `gno collection add ~/notes --name notes --pattern "**/*.md"`
- `gno collection add ~/work/docs --name work --pattern "**/*.{md,pdf,docx,pptx,xlsx,txt}"`
- `gno update`
- `gno embed`
Ask (human-default):
- `gno ask "termination clause" --collection work`
- `gno ask "wie deployen wir nach staging" --collection work`
Search (lower-level):
- `gno search "termination clause"`
- `gno vsearch "how do we deploy to staging"`
- `gno query "quarterly planning process" --explain`
Retrieve:
- `gno get gno://work/runbooks/oncall.pdf`
- `gno get "#a1b2c3" --line-numbers`
- `gno multi-get "work/runbooks/*" --json`
Agent integration:
- `gno mcp` and tools `gno_search`, `gno_query`, `gno_get`, etc.
---
## 6. Product principles
1. Local truth: source files are authoritative; mirrors are derived.
2. Deterministic interfaces: CLI and MCP outputs are stable, versioned, schema-driven.
3. Explainable retrieval: return docid, URI, path, snippet, scores, and optional line ranges.
4. Fail-soft ingestion: conversion failures do not block indexing; they produce structured errors and actionable diagnostics.
5. Minimal friction for inference: no separate server; models resolved and cached automatically.
6. No Python: all conversion and inference are Node native (Bun/TS) with well-defined adapters.
7. Human-first commands: `init`, `index`, and `ask` provide the simplest on-ramp while preserving lower-level primitives (`collection`, `update`, `embed`, `query`).
8. Multilingual by default: store language hints/tags and use language-aware defaults for embedding, ranking, and explainability.
---
## 7. Functional scope (MVP)
### 7.1 Collections and configuration
Features:
- create/list/remove/rename collections
- per-collection:
  - `path` (absolute)
  - `pattern` (glob)
  - optional `include` extensions allowlist
  - optional `exclude` patterns (defaults include `.git`, `node_modules`, `.venv`, `.idea`, `dist`, `build`)
  - optional `update` shell command executed during `gno update`
  - per-path contexts
  - optional `languageHint` (BCP-47 like `de`, `fr`, `it`, `en`, or `und`; hint only, never required)
Index-level settings (per index DB, not per collection):
- `ftsTokenizer`:
  - default: `unicode61` (multilingual-safe)
  - optional: `porter` (English stemming only; opt-in)
  - optional: `trigram` (if available in the SQLite build; higher index size; validated by `gno doctor`)
Config:
- config file path: `<configDir>/index.yml` (resolved per §2.1–§2.3)
- override via `--config <path>` and env vars (for tests/CI)
Onboarding convenience (MVP):
- `gno init` is a thin, ergonomic wrapper over:
  - config creation (if missing)
  - index DB creation and migrations (if missing)
  - `collection add` (optional, if a path is provided)
  - printing next steps and environment info (resolved config/data/cache paths and index DB path)
- `gno init` must be safe to run repeatedly (idempotent; no destructive actions without explicit flags)
### 7.2 Ingestion and sync
On `gno update`, per collection:
1. Enumerate files matching rules.
2. For each file:
   - stat (mtime, size)
   - if size > `maxBytes` limit, record `TOO_LARGE` error and skip (do not read bytes)
   - read bytes
   - compute `source_hash = sha256(bytes)`
   - detect MIME/ext (layered detection)
   - select converter adapter
   - convert to Markdown mirror
   - canonicalize mirror markdown deterministically
   - compute `mirror_hash = sha256(canonicalMarkdown)`
3. Upsert document record keyed by `(collection, relativePath)`.
4. Store mirror markdown in `content` keyed by `mirror_hash` (dedupe).
5. Chunk mirror content into `content_chunks`.
   - assign `chunk_language` (BCP-47 or `und`) deterministically (see §10.2)
6. Update FTS index over chunks.
7. Mark missing files as inactive (soft delete).
8. Record conversion warnings/errors for diagnostics and status.
Deletion handling:
- If previously indexed file no longer exists, mark document inactive.
- `gno cleanup` removes orphaned content/chunks/vectors not referenced by any active document.
Converter upgrade handling:
- Store `converter_id` and `converter_version` per document.
- If converter version changes (or routing changes), mirror must be regenerated even if source bytes are unchanged.
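
One way to express the regeneration rule is a pure predicate over the stored and current state. This is a sketch under assumptions: `IndexedState` and `needsReconversion` are hypothetical names, and "routing changes" is modeled here simply as a different `converterId` being selected.

```typescript
// Hypothetical snapshot of what was recorded for a document at last ingest.
export type IndexedState = {
  sourceHash: string;
  converterId: string;
  converterVersion: string;
};

// True when the mirror must be regenerated: new document, changed bytes,
// a different converter selected by routing, or a converter upgrade.
export function needsReconversion(
  previous: IndexedState | undefined,
  current: IndexedState,
): boolean {
  if (previous === undefined) return true;
  return (
    previous.sourceHash !== current.sourceHash ||
    previous.converterId !== current.converterId ||
    previous.converterVersion !== current.converterVersion
  );
}
```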
Human command alias (MVP):
- `gno index` is the recommended human command to build/update an index end-to-end:
  - default behavior: run `gno update` then `gno embed`
  - `--no-embed` runs ingestion only
  - `--collection <name>` scopes ingestion/embedding to a single collection (optional)
  - `--models-pull` allows model download (network) before embedding (prompted unless `--yes`)
### 7.3 Supported file types (MVP)
Always supported:
- `.md` (passthrough + canonicalization)
- `.txt` (plaintext to Markdown + canonicalization)
Converter-backed (MVP):
- `.pdf`, `.docx`, `.xlsx` via a Node MarkItDown port adapter (primary choice: `markitdown-ts`)
- `.pptx` via `officeparser` (239 stars, 119K weekly downloads, in-memory extraction)
Future (explicitly supported by architecture, not MVP):
- pdf.js based PDF extraction
- mammoth DOCX conversion
- PPTX parsing
- SheetJS for XLSX
- additional formats via adapter registry
---
## 8. Converter subsystem (Node-only, deterministic)
This section is implementation-driving.
### 8.1 Converter goals
- Single stable conversion API independent of any vendor library
- Deterministic Markdown output (canonicalization rules locked)
- Structured errors and warnings
- Clear routing via MIME detection and registry
- Golden conversion fixtures in the repo
### 8.2 Converter interfaces
`src/converters/types.ts`:
```ts
import type { ConvertError } from "./errors";

export type ConverterId =
  | "native/markdown"
  | "native/plaintext"
  | "adapter/markitdown-ts"
  | string;

export type ConvertInput = {
  sourcePath: string; // absolute
  relativePath: string; // within collection
  collection: string;
  bytes: Uint8Array;
  mime: string;
  ext: string; // ".pdf"
  limits: {
    maxBytes: number;
    timeoutMs: number;
  };
};

export type ConvertWarning = {
  code:
    | "LOSSY"
    | "TRUNCATED"
    | "PARTIAL"
    | "UNSUPPORTED_FEATURE"
    | "LOW_CONFIDENCE";
  message: string;
  details?: Record<string, unknown>;
};

export type ConvertOutput = {
  markdown: string; // canonical markdown (see 8.4)
  title?: string;
  languageHint?: string; // optional BCP-47 or "und"
  meta: {
    converterId: ConverterId;
    converterVersion: string;
    sourceMime: string;
    warnings?: ConvertWarning[];
  };
};

export type ConvertResult =
  | { ok: true; value: ConvertOutput }
  | { ok: false; error: ConvertError };

export interface Converter {
  readonly id: ConverterId;
  readonly version: string;
  canHandle(mime: string, ext: string): boolean;
  convert(input: ConvertInput): Promise<ConvertResult>;
}
```
### 8.3 Error model
`src/converters/errors.ts`:
```ts
export type ConvertErrorCode =
  | "UNSUPPORTED"
  | "TOO_LARGE"
  | "TIMEOUT"
  | "CORRUPT"
  | "PERMISSION"
  | "IO"
  | "ADAPTER_FAILURE"
  | "INTERNAL";

export type ConvertError = {
  code: ConvertErrorCode;
  message: string;
  retryable: boolean;
  fatal: boolean; // reserved for unrecoverable store corruption, not conversion
  converterId: string;
  sourcePath: string;
  mime: string;
  ext: string;
  cause?: unknown;
  details?: Record<string, unknown>;
};
```
Indexing policy (MVP):
- `UNSUPPORTED`, `TOO_LARGE`, `CORRUPT`: document is indexed as metadata-only and marked non-searchable (no chunks), with a recorded error entry.
- `TIMEOUT`, `ADAPTER_FAILURE`: same as above, plus a warning in `gno status`.
- `IO`, `PERMISSION`: warn, continue.
- Conversion errors never crash the full update unless the DB layer fails.
### 8.4 Canonical Markdown conventions
Canonicalization must be deterministic and independent of machine/time.
Rules:
1. Use `\n` newlines.
2. Strip `\u0000` and other non-printable control chars except `\n` and `\t`.
3. Trim trailing whitespace per line.
4. Collapse 3+ blank lines to exactly 2.
5. Ensure exactly one final newline at end of document.
Important:
- Do not inject run-specific timestamps into the canonical markdown.
- Do not inject absolute paths into the canonical markdown.
- Source references are surfaced via document metadata and CLI/MCP outputs, not embedded into mirror content used for hashing/indexing.
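
The five rules can be sketched as a single pure function. This is illustrative only: `canonicalizeMarkdown` is a hypothetical name, and the exact set of "non-printable control chars" (here the C0 range plus DEL, minus `\n` and `\t`) is an assumption that the locked canonicalization spec would need to pin down.

```typescript
export function canonicalizeMarkdown(input: string): string {
  // Rule 1: normalize CRLF and lone CR to "\n".
  const unixNewlines = input.replace(/\r\n?/g, "\n");
  // Rule 2: strip NUL and other C0 control chars plus DEL, keeping "\n" and "\t".
  const noControl = unixNewlines.replace(
    /[\u0000-\u0008\u000B\u000C\u000E-\u001F\u007F]/g,
    "",
  );
  // Rule 3: trim trailing spaces and tabs on every line.
  const trimmed = noControl
    .split("\n")
    .map((line) => line.replace(/[ \t]+$/, ""))
    .join("\n");
  // Rule 4: 3+ blank lines (4+ consecutive newlines) collapse to exactly 2 blank lines.
  const collapsed = trimmed.replace(/\n{4,}/g, "\n\n\n");
  // Rule 5: exactly one final newline.
  return collapsed.replace(/\n+$/, "") + "\n";
}
```

The function takes no clock, path, or environment input, so its output depends only on the text, and applying it twice yields the same result as applying it once.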
Language hint propagation (MVP):
- If a converter provides `languageHint`, persist it on the document record.
- If a converter does not provide `languageHint`, determine language at chunking time (see §10.2) and store per chunk.
Optional (display-only) header:
- `gno get` and MCP resource reads may optionally prepend a short comment header for agent friendliness, but that header must be generated at read time and must not be used to compute `mirror_hash`.
### 8.5 MIME detection strategy
Layered detection:
1. Extension map (MVP):
   - `.md` -> `text/markdown`
   - `.txt` -> `text/plain`
   - `.pdf` -> `application/pdf`
   - `.docx` -> `application/vnd.openxmlformats-officedocument.wordprocessingml.document`
   - `.pptx` -> `application/vnd.openxmlformats-officedocument.presentationml.presentation`
   - `.xlsx` -> `application/vnd.openxmlformats-officedocument.spreadsheetml.sheet`
2. Lightweight sniffing (MVP+):
   - bytes start with `%PDF-` -> PDF
   - bytes start with `PK\x03\x04` and ext in {docx,pptx,xlsx} -> OOXML MIME
   - fallback to ext map
API:
```ts
export type MimeDetection = {
  mime: string;
  ext: string;
  confidence: "high" | "medium" | "low";
  via: "sniff" | "ext" | "fallback";
};

export interface MimeDetector {
  detect(path: string, bytes: Uint8Array): MimeDetection;
}
```
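
A minimal detector matching this API might look like the sketch below. The confidence values assigned to each path and the `application/octet-stream` fallback are assumptions, not part of the spec; `detectMime` is a hypothetical free function that a `MimeDetector` implementation could delegate to.

```typescript
import { extname } from "node:path";

type MimeDetection = {
  mime: string;
  ext: string;
  confidence: "high" | "medium" | "low";
  via: "sniff" | "ext" | "fallback";
};

const EXT_TO_MIME: Record<string, string> = {
  ".md": "text/markdown",
  ".txt": "text/plain",
  ".pdf": "application/pdf",
  ".docx": "application/vnd.openxmlformats-officedocument.wordprocessingml.document",
  ".pptx": "application/vnd.openxmlformats-officedocument.presentationml.presentation",
  ".xlsx": "application/vnd.openxmlformats-officedocument.spreadsheetml.sheet",
};
const OOXML_EXTS = new Set([".docx", ".pptx", ".xlsx"]);
const PDF_MAGIC = [0x25, 0x50, 0x44, 0x46, 0x2d]; // "%PDF-"
const ZIP_MAGIC = [0x50, 0x4b, 0x03, 0x04]; // "PK\x03\x04"

function startsWith(bytes: Uint8Array, prefix: number[]): boolean {
  return bytes.length >= prefix.length && prefix.every((b, i) => bytes[i] === b);
}

export function detectMime(path: string, bytes: Uint8Array): MimeDetection {
  const ext = extname(path).toLowerCase();
  // Layer 2: magic-byte sniffing takes priority when it matches.
  if (startsWith(bytes, PDF_MAGIC)) {
    return { mime: "application/pdf", ext, confidence: "high", via: "sniff" };
  }
  if (startsWith(bytes, ZIP_MAGIC) && OOXML_EXTS.has(ext)) {
    return { mime: EXT_TO_MIME[ext], ext, confidence: "high", via: "sniff" };
  }
  // Layer 1: extension map.
  const byExt = EXT_TO_MIME[ext];
  if (byExt !== undefined) {
    return { mime: byExt, ext, confidence: "medium", via: "ext" };
  }
  // Assumption: unknown types map to a generic binary MIME.
  return { mime: "application/octet-stream", ext, confidence: "low", via: "fallback" };
}
```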
### 8.6 Converter registry and routing
Priority order (MVP):
1. `native/markdown`
2. `native/plaintext`
3. `adapter/markitdown-ts` (PDF, DOCX, XLSX)
4. `adapter/officeparser` (PPTX)
Selection:
- choose first converter where `canHandle(mime, ext)` is true
- else return `UNSUPPORTED`
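
The selection rule is a first-match scan over the priority-ordered registry. In this sketch, `RoutableConverter` is a hypothetical structural subset of the `Converter` interface from §8.2, and returning `undefined` stands in for the caller producing an `UNSUPPORTED` error.

```typescript
// Structural subset of `Converter` that routing needs.
interface RoutableConverter {
  readonly id: string;
  canHandle(mime: string, ext: string): boolean;
}

// Returns the first converter (in registry order) that can handle the input,
// or undefined, which the caller is expected to map to `UNSUPPORTED`.
export function selectConverter<C extends RoutableConverter>(
  registry: readonly C[],
  mime: string,
  ext: string,
): C | undefined {
  return registry.find((converter) => converter.canHandle(mime, ext));
}
```

Since order alone decides ties, the registry array itself encodes the priority list above.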
### 8.7 MVP converter adapter: markitdown-ts
Responsibilities:
- enforce `maxBytes` and `timeoutMs`
- prefer path-based conversion if supported by the library
- map library exceptions into `ConvertError`
- emit warnings for suspicious output:
  - empty output for non-empty input
  - truncation or partial extraction signals
Hard constraint:
- No Python-based MarkItDown. Node adapter only.
### 8.7b MVP converter adapter: officeparser (PPTX)
Responsibilities:
- handle `.pptx` files (markitdown-ts has incomplete PPTX support)
- extract slide text and speaker notes
- format extracted text as Markdown with filename-derived title
- enforce `maxBytes` (defense in depth; EPIC 5 does pre-read stat check)
- map library exceptions into `ConvertError`
Library choice rationale:
- 239 GitHub stars, 119K weekly npm downloads
- In-memory extraction (no disk writes)
- Full TypeScript support
- Active maintenance (Nov 2024 updates)
### 8.8 Converter golden fixture plan
Fixtures:
```
test/fixtures/conversion/
  pdf/
    simple.pdf
    simple.expected.md
  docx/
    headings.docx
    headings.expected.md
  xlsx/
    sheet1.xlsx
    sheet1.expected.md
  pptx/
    deck1.pptx
    deck1.expected.md
  md/
    sample.md
    sample.expected.md
```
Golden tests:
- converter output must match `*.expected.md` exactly
- converter version is locked in `package.json`
- if output changes, update fixtures deliberately
---
## 9. Storage model and schema (SQLite)
### 9.1 Design intent
- Source identity lives in `documents`
- Mirror content is content-addressed in `content` by `mirror_hash`
- Chunking is mirror-derived and stored in `content_chunks`
- FTS indexes chunk text
- Vectors index chunk embeddings by model
- Contexts are separate and applied at query/runtime
### 9.2 Conceptual tables (MVP)
- `collections`
  - `name`, `path`, `pattern`, `include`, `exclude`, `update_cmd`, timestamps
- `contexts`
  - `scope_type` (global, collection, prefix)
  - `scope_key` (e.g., `/`, `collection:work`, `gno://work/prefix`)
  - `text`
  - timestamps
- `documents`
  - `id`
  - `collection`
  - `rel_path`
  - `title`
  - `source_abs_path`
  - `source_mime`, `source_ext`
  - `source_mtime`, `source_size`
  - `source_hash` (sha256)
  - `docid` (derived from `source_hash`, 8 hex chars)
  - `mirror_hash` (soft reference to `content`, NOT an FK; documents are tracked before content exists, and `cleanupOrphans()` handles integrity)
  - `converter_id`, `converter_version`
  - `language_hint` (optional; BCP-47 or `und`)
  - `active`
  - `last_error_code`, `last_error_message` (optional denormalization)
  - timestamps
- `content`
  - `mirror_hash` (PK)
  - `markdown`
  - timestamps
- `content_chunks`
  - `(mirror_hash, seq)` (PK)
  - `pos` (char offset)
  - `text`
  - `start_line`, `end_line` (recommended to support stable line-number output)
  - `language` (BCP-47 or `und`, derived deterministically)
  - `token_count` (optional; if tokenizer is available, for debugging and evals)
- `content_fts` (FTS5 virtual table over chunk text plus doc metadata columns)
- `content_vectors`
  - `(mirror_hash, seq, model)` (PK)
  - `embedding` stored via sqlite-vec backing
  - `embedded_at`
- `llm_cache`
  - `key` (PK)
  - `value`
  - timestamps
- `ingest_errors` (optional but recommended)
  - `(collection, rel_path, occurred_at)`
  - `code`, `message`, `details_json`
Schema source of truth:
- `spec/db/schema.sql` plus migrations
---
## 10. Chunking and indexing
### 10.1 Chunking requirements
- Token-aware chunking preferred when embedding model tokenizer is available
- Deterministic chunk boundaries for a given (model, text) pair
- Default parameters (configurable):
  - `maxTokens = 800`
  - `overlap = 15%` (120 tokens)
Fallback (MVP bootstrapping):
- char-based chunking if tokenizer is not available yet (must be deterministic)
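
A deterministic char-based fallback could be as simple as a fixed-stride sliding window. The sketch below is one possible shape, not the specified algorithm: `chunkByChars` is a hypothetical name, and the default of 3200 characters (roughly 800 tokens at an assumed 4 characters per token) is illustrative only.

```typescript
export type Chunk = { seq: number; pos: number; text: string };

// Fixed-size windows with proportional overlap; same input always yields
// the same boundaries because nothing but the arguments is consulted.
export function chunkByChars(
  text: string,
  maxChars = 3200, // assumption: ~800 tokens at ~4 chars/token
  overlapRatio = 0.15,
): Chunk[] {
  if (maxChars <= 0 || overlapRatio < 0 || overlapRatio >= 1) {
    throw new RangeError("maxChars must be > 0 and overlapRatio in [0, 1)");
  }
  const overlap = Math.floor(maxChars * overlapRatio);
  const step = Math.max(1, maxChars - overlap);
  const chunks: Chunk[] = [];
  for (let pos = 0, seq = 0; pos < text.length; pos += step, seq++) {
    chunks.push({ seq, pos, text: text.slice(pos, pos + maxChars) });
    // Stop once a window reaches the end, to avoid a redundant tail chunk.
    if (pos + maxChars >= text.length) break;
  }
  return chunks;
}
```

The `seq` and `pos` fields line up with the `content_chunks` columns in §9.2; a production chunker would likely also prefer paragraph or heading boundaries, which this sketch ignores.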
### 10.2 Language detection and tagging (multilingual MVP)
Goals:
- stable per-chunk language tags to improve explainability, ranking prompt selection, and optional language filtering
- deterministic results for the same input text and the same detector version
Policy (MVP):
- Each chunk stores `language` as BCP-47 or `und`.
- Language tag derivation order:
  1. document-level `language_hint` from converter, if present and not `und`
  2. deterministic heuristic detector over chunk text (no network; no stochastic model)
  3. optional collection-level `languageHint` (if configured and not `und`)
  4. else `und`
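
The derivation order reduces to "first usable tag wins". In this sketch, `resolveChunkLanguage` is a hypothetical helper, and the detector's result is passed in as a plain value (with `und` or `undefined` meaning it could not decide), which is an assumption about how the detector reports uncertainty.

```typescript
// Returns the first candidate that is set and not "und", in the order:
// document hint, detector result, collection hint. Falls back to "und".
export function resolveChunkLanguage(
  documentHint: string | undefined,
  detected: string | undefined,
  collectionHint: string | undefined,
): string {
  for (const candidate of [documentHint, detected, collectionHint]) {
    if (candidate !== undefined && candidate !== "" && candidate !== "und") {
      return candidate;
    }
  }
  return "und";
}
```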
Implementation guidance (MVP):
- Use a lightweight, deterministic detector with a pinned implementation version in code.
- Detector must not depend on system locale, current time, or non-deterministic APIs.
### 10.3 Indexing surfaces
- FTS indexes chunk text, not entire documents, to align with vector chunk retrieval
- Snippets returned from the best matching chunk(s), with optional line numbers
### 10.4 FTS tokenizer configuration (per index DB, multilingual MVP)
- FTS5 tokenizer is configured at FTS virtual table creation time.
- Therefore tokenizer selection is per index (DB), not per document or per row.
- Default tokenizer is `unicode61`.
- Optional tokenizers (if available) are validated by `gno doctor`.
- Tokenizer selection is stored in index metadata and reported in `gno status`.
---
## 11. Local inference and model strategy (node-llama-cpp)
### 11.1 Runtime requirements
Local inference uses node-llama-cpp with GGUF models, supporting:
- embeddings
- generation (query expansion and optional HyDE)
- reranking (cross-encoder)
Constraints:
- No separate server process
- Safe lifecycle management:
  - keep models loaded for repeated calls when practical
  - dispose contexts/sequences promptly
  - avoid leaking memory across long MCP sessions
### 11.2 Determinism settings (MVP)
For query expansion / HyDE:
- temperature = 0
- bounded max tokens
- versioned prompt templates
- cache results by `(prompt_version, model_uri, input_hash)`
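
A cache key built from that tuple might be derived as below. This is a sketch: `llmCacheKey` is a hypothetical name, and hashing a JSON array of the three parts is one assumed way to get an unambiguous, fixed-length key for the `llm_cache.key` column.

```typescript
import { createHash } from "node:crypto";

function sha256Hex(text: string): string {
  return createHash("sha256").update(text, "utf8").digest("hex");
}

// Key over (prompt_version, model_uri, input_hash). JSON-encoding the tuple
// avoids collisions that naive string concatenation could produce.
export function llmCacheKey(
  promptVersion: string,
  modelUri: string,
  input: string,
): string {
  const inputHash = sha256Hex(input);
  return sha256Hex(JSON.stringify([promptVersion, modelUri, inputHash]));
}
```

Bumping the prompt template version or switching models changes the key, so stale expansions are never served after either change.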
### 11.3 Model management UX
Cache:
- `<cacheDir>/models` (resolved per §2.1–§2.3)
Commands:
- `gno models list`
- `gno models pull [--all|--embed|--rerank|--gen]`
- `gno models clear`
- `gno models path`
First-run fallback behavior (MVP):
- If embeddings are not available yet (no vectors present and/or embed model missing):
  - `gno vsearch` returns a structured error indicating vectors are unavailable and suggests `gno index` or `gno embed`
  - `gno query` and `gno ask` degrade gracefully to BM25-only retrieval by default and annotate this in `--explain` and JSON metadata
- If generation/rerank models are missing:
  - expansion/HyDE and reranking are skipped (with explainability), but retrieval still returns citations/snippets
### 11.4 Default model presets (config-driven)
Defaults must live in config presets, not in business logic.
Provide at least two presets (MVP):
- Preset A (multilingual, widely used):
  - Embedding: bge-m3 GGUF
  - Reranker: bge-reranker-v2-m3 (cross-encoder) GGUF where available
  - Generation: small Qwen Instruct GGUF for structured expansion/HyDE
- Preset B (Qwen family):
  - Embedding: Qwen3 embedding GGUF
  - Reranker: Qwen3 reranker GGUF
  - Generation: small Qwen3 Instruct GGUF
`gno doctor` must validate:
- model URIs resolvable
- models cached or downloadable
- embedding and rerank dimensions/config match expectations
Default preset rule (MVP):
- Preset A is the default unless the user explicitly selects another preset.
- Rationale: consistent multilingual behavior across DE/FR/IT/EN in a single default path.
---
## 12. Search modes and pipelines
### 12.1 Search commands
- `gno search`: BM25/FTS only (fast, deterministic)
- `gno vsearch`: vector only (semantic)
- `gno query`: hybrid (best quality)
- `gno ask`: human-default wrapper over `gno query` (citations-first; optional grounded short answer)
### 12.2 Structured query expansion (hybrid)
Expansion returns a structured object:
- lexical variants for BM25
- semantic variants for embedding queries
- optional HyDE synthetic snippet (only if valid)
Example shape:
```json
{
  "lexicalQueries": ["...", "..."],
  "vectorQueries": ["...", "..."],
  "hyde": "optional synthetic paragraph",
  "notes": "optional debug notes"
}
```
Rules:
- Expansion must be JSON-schema constrained to avoid malformed outputs.
- Expansion is skipped when BM25 is already strong (configurable threshold).
### 12.3 Parallel retrieval
Hybrid retrieval runs in parallel:
- BM25 over:
  - original query (weighted)
  - lexicalQueries variants
- Vector over:
  - original query
  - vectorQueries variants
  - HyDE if present
### 12.4 Fusion via RRF (+ top-rank bonus)
- Reciprocal Rank Fusion with parameter `k = 60` (configurable)
- Weight original query results higher than expansions
- Optional top-rank bonus to reward agreement across retrieval modes
- Candidate cap for reranking (example: top 50-100 chunks)
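
For reference, weighted Reciprocal Rank Fusion scores each candidate as the sum over lists of `weight / (k + rank)`, with 1-based ranks. The sketch below shows that core only; `rrfFuse` and `RankedList` are hypothetical names, the weights are caller-supplied, and the optional top-rank bonus and candidate cap are deliberately left out.

```typescript
export type RankedList = { ids: string[]; weight: number };
export type FusedResult = { id: string; score: number };

export function rrfFuse(lists: RankedList[], k = 60): FusedResult[] {
  const scores = new Map<string, number>();
  for (const { ids, weight } of lists) {
    ids.forEach((id, index) => {
      const rank = index + 1; // ranks are 1-based
      scores.set(id, (scores.get(id) ?? 0) + weight / (k + rank));
    });
  }
  return [...scores.entries()]
    .map(([id, score]) => ({ id, score }))
    // Highest score first; ties broken by id so the order is deterministic.
    .sort((a, b) => b.score - a.score || (a.id < b.id ? -1 : a.id > b.id ? 1 : 0));
}
```

For example, giving the original-query lists a weight of 2 and the expansion lists a weight of 1 would implement "weight original query results higher than expansions"; the actual weights are a configuration decision.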
### 12.5 Reranking
- Cross-encoder reranker scores candidate chunks (0..1 normalized)
- Rerank inputs include:
  - query
  - chunk text
  - optional title and applicable contexts (configurable)
### 12.6 Position-aware blending
Goal: avoid destabilizing top results due to reranker noise.
Example blending schedule (configurable):
- ranks 1-3: 0.75 retrieval score, 0.25 rerank score
- ranks 4-10: 0.60 retrieval score, 0.40 rerank score
- ranks 11+: 0.40 retrieval score, 0.60 rerank score
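
The example schedule translates directly into a rank-banded linear blend. This sketch assumes `rank` is the 1-based position after fusion and that both scores have already been normalized to 0..1 (the normalization of the retrieval score is not defined here); `blendScore` and `DEFAULT_BLEND` are hypothetical names.

```typescript
export type BlendBand = { maxRank: number; retrieval: number; rerank: number };

// The example schedule from this section, expressed as data so it stays configurable.
export const DEFAULT_BLEND: readonly BlendBand[] = [
  { maxRank: 3, retrieval: 0.75, rerank: 0.25 },
  { maxRank: 10, retrieval: 0.6, rerank: 0.4 },
  { maxRank: Infinity, retrieval: 0.4, rerank: 0.6 },
];

export function blendScore(
  rank: number,
  retrievalScore: number,
  rerankScore: number,
  schedule: readonly BlendBand[] = DEFAULT_BLEND,
): number {
  const band = schedule.find((b) => rank <= b.maxRank);
  if (band === undefined) {
    throw new RangeError(`blend schedule has no band for rank ${rank}`);
  }
  return band.retrieval * retrievalScore + band.rerank * rerankScore;
}
```

Top-ranked candidates lean on the fused retrieval score, so a noisy reranker score moves them less than it moves candidates further down the list.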
### 12.7 Explainability
`gno query --explain` prints to stderr:
- whether expansion was used
- expansion payload (redacted if needed)
- BM25 and vector contribution summary
- fusion and reranking parameters
- final score components per top result
---
## 13. Retrieval
### 13.1 `get`
Retrieves a single document mirror by:
- `gno://...`
- `collection/path`
- `#docid`
- fuzzy filename match (optional, must be deterministic and explainable)
Supports:
- `:line` suffix: `gno get gno://work/contracts/nda.docx:120`
- `--from <line>`
- `-l <maxLines>`
- `--line-numbers`
- `--source` (include abs source metadata in output; does not change mirror content)
Default:
- returns Markdown mirror plus metadata (CLI format)
- in JSON format returns structured object with mirror and source refs
### 13.2 `multi-get`
Retrieves multiple documents by:
- glob pattern
- comma-separated list of refs
- list of docids
Supports:
- `--max-bytes <n>` default 10240 for safety
- `--line-numbers`
- structured “skipped” records when limits exceeded
---
## 14. CLI specification
### 14.1 Global conventions
Exit codes:
- 0 success
- 1 validation / usage error
- 2 runtime failure (IO, DB, conversion, model, etc.)
Global flags:
- `--index <name>` use alternate DB name (multiple indices per machine)
- `--config <path>` override config path
- `--no-color`
- `--verbose`
- `--yes` non-interactive mode: accept safe defaults, never prompt (required for CI and scripted onboarding)
Output format flags (where applicable):
- `--json`
- `--files` (line protocol)
- `--csv`
- `--md`
- `--xml`
### 14.2 Command catalog (MVP)
- `gno status`
- `gno init [<path>] [--name <name>] [--pattern <glob>] [--include <csv-ext>] [--exclude <csv>] [--update <cmd>] [--yes]`
- `gno collection add <path> --name <name> [--pattern <glob>] [--include <csv-ext>] [--exclude <csv>] [--update <cmd>]`
- `gno collection list`
- `gno collection remove <name>`
- `gno collection rename <old> <new>`
- `gno update [--git-pull]`
- `gno index [--collection <name>] [--no-embed] [--models-pull] [--git-pull] [--yes]`
- `gno embed [--force] [--model <embedModelUri>] [--batch-size <n>]`
- `gno search <query> [options]`
- `gno vsearch <query> [options]`
- `gno query <query> [options]`
- `gno ask <query> [options]`
- `gno get <ref> [options]`
- `gno multi-get <pattern-or-list> [options]`
- `gno ls [collection|gno://collection/prefix]`
- `gno context add <path|gno://...|/> "text"`
- `gno context list`
- `gno context check`
- `gno context rm <path|gno://...|/>`
- `gno models list|pull|clear|path`
- `gno cleanup`
- `gno doctor`
- `gno mcp`
### 14.3 Search options (search, vsearch, query)
- `-n <num>` limit (default 5; default 20 for `--json`/`--files`)
- `--min-score <num>`
- `-c, --collection <name>`
- `--full` include full mirror (instead of snippet)
- `--line-numbers`
- `--lang <bcp47>` optional filter or hint:
  - for `search`: best-effort filter by chunk language where available
  - for `vsearch`/`query`/`ask`: hint for language detection and prompt selection
  - default: auto
- hybrid-only:
  - `--no-expand`
  - `--no-rerank`
  - `--explain`
### 14.4 Ask options (MVP)
`gno ask` is a wrapper over `gno query` with citations-first output.
Additional options:
- `--answer` enable a short grounded answer (requires generation model)
- `--no-answer` force retrieval-only output (citations/snippets only)
- `--max-answer-tokens <n>` hard cap for answer generation (default small, config-driven)
---
## 15. Output contracts (schema-driven)
### 15.1 Search result JSON shape
All search-like commands output:
```json
{
  "docid": "#a1b2c3",
  "score": 0.78,
  "uri": "gno://work/contracts/nda.docx",
  "title": "Extracted Title",
  "snippetLanguage": "de",
  "context": "Optional folder context",
  "snippet": "Markdown snippet (mirror)",
  "snippetRange": { "startLine": 120, "endLine": 145 },
  "source": {
    "absPath": "/abs/path/to/source.docx",
    "relPath": "contracts/nda.docx",
    "mime": "application/vnd.openxmlformats-officedocument.wordprocessingml.document",
    "ext": ".docx",
    "modifiedAt": "2025-11-12T10:30:00Z",
    "sizeBytes": 123456,
    "sourceHash": "sha256-hex"
  },
  "conversion": {
    "converterId": "adapter/markitdown-ts",
    "converterVersion": "x.y.z",
    "mirrorHash": "sha256-hex",
    "warnings": []
  }
}
```
Contract rules:
- `uri` always refers to the source identity.
- `snippet` comes from mirror content.
- `snippetLanguage` is the best-effort language tag for the snippet range (BCP-47 or `und`).
- `source.absPath` is included when:
  - `--source` is set, or
  - output is from MCP tools (always include to support agent actions).
### 15.2 `--files` line protocol
One line per result:
`#docid,<score>,gno://collection/path`
Optional extra fields may be appended only in a versioned manner.
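
A formatter and parser for the three-field line could look like the sketch below. It assumes the score is written with JavaScript's default number formatting and that URIs contain no literal commas (percent-encoding per §16.4 would encode them); both are assumptions, and `formatFilesLine` / `parseFilesLine` are hypothetical names.

```typescript
export type FilesLine = { docid: string; score: number; uri: string };

export function formatFilesLine(result: FilesLine): string {
  return `${result.docid},${result.score},${result.uri}`;
}

// Reads the first three comma-separated fields and ignores any trailing
// fields, so lines from a later protocol version still parse.
export function parseFilesLine(line: string): FilesLine {
  const [docid, score, uri] = line.split(",");
  if (docid === undefined || score === undefined || uri === undefined) {
    throw new SyntaxError(`malformed --files line: ${line}`);
  }
  const parsedScore = Number(score);
  if (!docid.startsWith("#") || Number.isNaN(parsedScore)) {
    throw new SyntaxError(`malformed --files line: ${line}`);
  }
  return { docid, score: parsedScore, uri };
}
```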
### 15.3 Schema artifacts
Repo must include JSON schemas for:
- search result item
- status payload
- get payload
- multi-get payload
- MCP tool structured outputs
- ask payload
### 15.4 `ask` JSON shape (MVP)
`gno ask --json` returns:
```json
{
  "query": "string",
  "mode": "hybrid",
  "queryLanguage": "auto",
  "answer": "optional string",
  "citations": [
    {
      "docid": "#a1b2c3",
      "uri": "gno://work/path",
      "startLine": 120,
      "endLine": 145
    }
  ],
  "results": [],
  "meta": {
    "expanded": true,
    "reranked": true,
    "vectorsUsed": true
  }
}
```
Rules:
- `results[]` items are identical to `gno query --json` result item shape.
- `answer` must be grounded in `results` and cite line ranges via `citations`.
- If generation is unavailable or disabled, omit `answer` (do not emit an empty string).
- If vectors are unavailable, set `mode` to `bm25_only` and set `meta.vectorsUsed=false`.
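
The last two rules are easy to get wrong when serializing, so a small builder can enforce them in one place. This is a sketch: `buildAskPayload`, `AskInput`, and the `vectorsAvailable` flag are hypothetical, and `results` is typed loosely because the result item schema lives in §15.1.

```typescript
type Citation = { docid: string; uri: string; startLine: number; endLine: number };

export type AskPayload = {
  query: string;
  mode: "hybrid" | "bm25_only";
  queryLanguage: string;
  answer?: string;
  citations: Citation[];
  results: unknown[];
  meta: { expanded: boolean; reranked: boolean; vectorsUsed: boolean };
};

export type AskInput = {
  query: string;
  queryLanguage?: string;
  answer?: string; // undefined when generation is unavailable or disabled
  citations: Citation[];
  results: unknown[];
  expanded: boolean;
  reranked: boolean;
  vectorsAvailable: boolean;
};

export function buildAskPayload(input: AskInput): AskPayload {
  const payload: AskPayload = {
    query: input.query,
    mode: input.vectorsAvailable ? "hybrid" : "bm25_only",
    queryLanguage: input.queryLanguage ?? "auto",
    citations: input.citations,
    results: input.results,
    meta: {
      expanded: input.expanded,
      reranked: input.reranked,
      vectorsUsed: input.vectorsAvailable,
    },
  };
  // Omit the key entirely rather than emitting an empty string.
  if (input.answer !== undefined && input.answer.trim() !== "") {
    payload.answer = input.answer;
  }
  return payload;
}
```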
---
## 16. MCP specification
### 16.1 Server
- Command: `gno mcp`
- Transport: stdio
- Must keep DB open for server lifetime for performance
### 16.2 Resources
- `gno://{collection}/{path}`
  - returns `text/markdown` mirror content
  - may optionally prepend a display-only comment header with:
    - uri
    - source abs path
    - mime
    - docid
    - language metadata when available (document hint and/or chunk language for returned ranges)
  - line numbers default ON for agent friendliness
### 16.3 Tools (stable namespace)
Tool names are stable and versioned under the `gno_` prefix:
- `gno_search` (BM25)
- `gno_vsearch` (vector)
- `gno_query` (hybrid)
- `gno_get`
- `gno_multi_get`
- `gno_status`
Each tool returns:
- `content[]`: human-readable summary strings
- `structuredContent`: machine-readable payloads matching schemas
- `isError: true` on failures (for example vector index missing)
Multilingual MCP requirement (MVP):
- Tool structured outputs include `snippetLanguage` where applicable (mirrors CLI schema).
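As a sketch of the return shape described above, the helpers below build success and failure results. `toolOk` and `toolError` are hypothetical names. The `{ type: "text", text }` item shape follows the MCP convention for text content; the payload type is left generic because the real schemas live in `spec/output-schemas/`.

```typescript
// Sketch only: helper names are illustrative, not part of the spec.
interface TextContent {
  type: "text";
  text: string;
}

interface ToolResult<T extends object> {
  content: TextContent[]; // human-readable summary
  structuredContent?: T; // machine-readable payload matching a schema
  isError?: boolean;
}

function toolOk<T extends object>(summary: string, payload: T): ToolResult<T> {
  return {
    content: [{ type: "text", text: summary }],
    structuredContent: payload,
  };
}

function toolError(message: string): ToolResult<never> {
  // Failures are reported in-band rather than thrown across the boundary.
  return {
    content: [{ type: "text", text: message }],
    isError: true,
  };
}
```

A `gno_vsearch` handler could, for example, return `toolError("vector index missing")` when sqlite-vec is unavailable.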
### 16.4 MCP correctness requirements
- Strictly follow MCP tool response conventions
- Ensure `gno://` URIs are URL-encoded for special characters while preserving path slashes
- Maintain backward compatibility of tool schemas once published (version fields required)
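One way to meet the URI encoding requirement is to encode each path segment separately and rejoin with `/`. This is a sketch under that assumption; `toGnoUri` and `fromGnoUri` are illustrative names, and the scheme is hard-coded here although the PRD treats it as configurable.

```typescript
// Sketch only: function names are hypothetical.
const SCHEME_PREFIX = "gno://";

function toGnoUri(collection: string, relPath: string): string {
  // Encode per segment so special characters are escaped but "/" is preserved.
  const encodedPath = relPath
    .split("/")
    .map((segment) => encodeURIComponent(segment))
    .join("/");
  return `${SCHEME_PREFIX}${encodeURIComponent(collection)}/${encodedPath}`;
}

function fromGnoUri(uri: string): { collection: string; relPath: string } | null {
  if (!uri.startsWith(SCHEME_PREFIX)) return null;
  const [collection, ...segments] = uri.slice(SCHEME_PREFIX.length).split("/");
  if (!collection || segments.length === 0) return null;
  // Note: decodeURIComponent throws on malformed escapes; callers may want to guard.
  return {
    collection: decodeURIComponent(collection),
    relPath: segments.map((s) => decodeURIComponent(s)).join("/"),
  };
}
```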
---
## 17. Architecture
### 17.1 High-level design
- Ports and Adapters (hexagonal)
- Functional core / imperative shell
- Result-based error handling (no throws across boundaries)
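A minimal sketch of what result-based error handling could look like at an adapter boundary. The `Result` type and the `ok`, `err`, and `fromThrowable` helpers are assumptions for illustration; the PRD does not prescribe a specific shape or library.

```typescript
// Sketch only: a hand-rolled Result type, not the project's actual API.
type Result<T, E> = { ok: true; value: T } | { ok: false; error: E };

const ok = <T>(value: T): Result<T, never> => ({ ok: true, value });
const err = <E>(error: E): Result<never, E> => ({ ok: false, error });

// Wrap a throwing call (for example a third-party parser) so exceptions
// are converted to values before they cross a port boundary.
function fromThrowable<T>(fn: () => T): Result<T, Error> {
  try {
    return ok(fn());
  } catch (e) {
    return err(e instanceof Error ? e : new Error(String(e)));
  }
}

// Callers branch on `ok` instead of using try/catch.
function describe(result: Result<unknown, Error>): string {
  return result.ok ? "parsed" : `failed: ${result.error.name}`;
}
```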
Core domain responsibilities:
- document identity and source refs
- mirror content and hashing
- chunking
- retrieval orchestration and scoring
Ports:
- `ConverterPort`
- `StorePort`
- `EmbeddingPort`
- `GenerationPort` (for expansion/HyDE)
- `RerankPort`
- `FsPort`, `ClockPort` (testability)
- `MimeDetectorPort`
- `FileWalkerPort`
Adapters:
- SQLite store + migrations
- FTS5 adapter
- sqlite-vec adapter
- node-llama-cpp adapter
- markitdown-ts adapter
- CLI delivery
- MCP delivery
### 17.2 Proposed module layout
(Reference layout, not a constraint, but should remain layered.)
```
src/
app/
config/
domain/
converters/
adapters/markitdownJs/
llm/
nodeLlamaCpp/
store/
migrations/
sqlite/
indexing/
pipeline/
cli/
mcp/
test/
fixtures/
spec/
eval/
spec/
cli.md
mcp.md
db/schema.sql
converters.md
models.md
evals.md
output-schemas/
```
---
## 18. Engineering quality requirements
### 18.1 Tests
Unit tests:
- path normalization
- MIME detection
- canonical Markdown normalization
- chunking boundaries
- RRF and blending math
- query expansion schema validation
- language detector determinism and tagging rules
Integration tests:
- create temp index DB, run migrations
- `gno update` over fixture corpus
- `gno search/vsearch/query/get/multi-get` outputs match schemas
- converter golden fixtures
- MCP server tool contract tests (golden calls)
- multilingual fixture corpus:
- at least one DE and one EN document
- confirm `snippetLanguage` tagging is stable and surfaced in CLI and MCP outputs
### 18.2 Evals (Evalite v1)
Scope:
- ranking quality gates for `vsearch` and `query`
- stability checks for structured expansion outputs
- multilingual ranking sanity checks (DE/FR/IT/EN mixed corpus)
#### 18.2.1 Evalite Setup
File structure:
```
test/
eval/
vsearch.eval.ts # vector search ranking evals
query.eval.ts # hybrid query pipeline evals
expansion.eval.ts # structured expansion stability
multilingual.eval.ts # cross-language ranking
fixtures/
corpus/ # DE/EN/FR/IT test documents
queries.json # query-judgment pairs
evalite.config.ts # global eval configuration
```
Configuration (`evalite.config.ts`):
```ts
import { defineConfig } from "evalite/config";
import { createSqliteStorage } from "evalite/sqlite-storage";
export default defineConfig({
storage: () => createSqliteStorage("./evalite.db"),
testTimeout: 120000, // 2 min for slow LLM calls
maxConcurrency: 10, // parallel test cases
scoreThreshold: 70, // MVP: 70%, tighten over time
cache: true, // cache LLM responses in dev
});
```
#### 18.2.2 Custom Scorers (IR Metrics)
Create reusable scorers for retrieval metrics (not built into Evalite):
```ts
// test/eval/scorers/ir-metrics.ts
import { createScorer } from "evalite";
export const recallAtK = (k: number) =>
createScorer<
{ query: string },
string[], // output: docids
string[] // expected: relevant docids
>({
name: `Recall@${k}`,
description: `Fraction of relevant docs in top ${k} results`,
scorer: ({ output, expected }) => {
const topK = output.slice(0, k);
const hits = expected.filter((id) => topK.includes(id)).length;
return {
score: expected.length > 0 ? hits / expected.length : 1,
metadata: { k, hits, total: expected.length },
};
},
});
export const ndcgAtK = (k: number) =>
createScorer<
{ query: string },
string[],
{ docid: string; relevance: number }[]
>({
name: `nDCG@${k}`,
description: `Normalized DCG at rank ${k}`,
scorer: ({ output, expected }) => {
const relevanceMap = new Map(expected.map((e) => [e.docid, e.relevance]));
const dcg = output.slice(0, k).reduce((sum, docid, i) => {
const rel = relevanceMap.get(docid) ?? 0;
return sum + (Math.pow(2, rel) - 1) / Math.log2(i + 2);
}, 0);
const ideal = [...expected]
.sort((a, b) => b.relevance - a.relevance)
.slice(0, k)
.reduce(
(sum, e, i) =>
sum + (Math.pow(2, e.relevance) - 1) / Math.log2(i + 2),
0
);
return {
score: ideal > 0 ? dcg / ideal : 1,
metadata: { k, dcg, idcg: ideal },
};
},
});
```
#### 18.2.3 Example Eval File
```ts
// test/eval/vsearch.eval.ts
import { evalite } from "evalite";
import { recallAtK, ndcgAtK } from "./scorers/ir-metrics";
import { vsearch } from "../../src/pipeline/vsearch";
evalite("Vector Search Ranking", {
data: async () => {
const queries = await Bun.file("test/eval/fixtures/queries.json").json();
return queries.map((q) => ({
input: { query: q.query, collection: q.collection },
expected: q.relevantDocs,
}));
},
task: async (input) => {
const results = await vsearch(input.query, {
collection: input.collection,
limit: 10,
});
return results.map((r) => r.docid);
},
// Scorers built with createScorer can be passed to `scorers` directly.
// ndcgAtK reads graded judgments ({ docid, relevance }[]) from `expected`.
scorers: [recallAtK(5), recallAtK(10), ndcgAtK(10)],
trialCount: 1, // deterministic for same embeddings
});
```
#### 18.2.4 Metrics
- recall@k (k=5,10) via custom scorer
- nDCG@k via custom scorer
- latency budgets (soft gate initially, tracked via custom column)
#### 18.2.5 Rules
- golden tests must not depend on exact expanded queries
- eval thresholds must tolerate minor model drift while catching major regressions
- use `trialCount > 1` for non-deterministic tasks (e.g., LLM expansion) to measure variance
- cache LLM responses in dev (`cache: true`) for fast iteration
#### 18.2.6 CLI Usage
```bash
# Dev: watch mode with UI at localhost:3006
bun run evalite watch
# CI: run once, fail if threshold not met
bun run evalite --threshold=70 --outputPath=./eval-results.json
# Export static UI for CI artifacts
bun run evalite export --output=./eval-ui
```
#### 18.2.7 Multilingual Eval Notes (MVP)
- include language-mismatched queries (e.g., DE query over EN doc) to validate vector + rerank behavior
- do not gate on exact expansion text, only on ranking metrics and schema validity
- use `columns` to show `snippetLanguage` for debugging cross-language behavior
See `spec/evals.md` for detailed implementation specification.
---
## 19. Packaging and distribution
Target: **npm registry, Bun-only**
```bash
# Requires Bun runtime
curl -fsSL https://bun.sh/install | bash
# Then run
bunx @gmickel/gno
# Or install globally
bun add -g @gmickel/gno
```
Note: `npm install` / `yarn` / `pnpm` can download the package, but running it requires the Bun runtime (the code uses `bun:sqlite` and the bin entry is a `.ts` file).
Prebuilt standalone binaries are not viable because of native dependencies (sqlite-vec needs a sidecar, and node-llama-cpp cannot be bundled). See `notes/spike-bun-compile.md`.
First-run UX:
- `gno init` guides initial setup and prints what was created and where it lives.
- `gno doctor` checks and reports on:
- converter readiness
- sqlite-vec availability
- model resolution and downloads
- embedding backlog
---
## 20. Risks and mitigations
1. Native dependency friction (sqlite-vec, node-llama-cpp)
Mitigation: `doctor`, clear error messages, platform matrix CI, documented rebuild guidance.
2. Converter output drift across library versions
Mitigation: lock converter versions, store converter version per doc, golden fixtures updated deliberately.
3. Reranking latency on CPU
Mitigation: cap candidates, cache rerank results, position-aware blending.
4. Multilingual lexical search limitations
Mitigation:
- default `unicode61` tokenizer for multilingual safety
- opt-in stemming only where appropriate (`porter` for English)
- rely on vector + rerank for cross-language and morphology-heavy cases
- store per-chunk language tags to improve explainability and prompt selection
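The tokenizer mitigation can be sketched as a helper that maps the configured `ftsTokenizer` value to an FTS5 `tokenize` option. The table name `chunks_fts` and column `text` are hypothetical, and the two-value union is an assumption based on the tokenizers named above. In FTS5, `porter` is a wrapper that takes a base tokenizer, so it is combined with `unicode61` here.

```typescript
// Sketch only: table and column names are illustrative, not the real schema.
type FtsTokenizer = "unicode61" | "porter";

function ftsTokenizeOption(tokenizer: FtsTokenizer): string {
  // Porter stemming wraps the unicode61 base tokenizer (English-oriented, opt-in).
  return tokenizer === "porter" ? "porter unicode61" : "unicode61";
}

function createFtsTableSql(tokenizer: FtsTokenizer): string {
  // The value comes from a closed union, so interpolation is safe here.
  const option = ftsTokenizeOption(tokenizer);
  return `CREATE VIRTUAL TABLE IF NOT EXISTS chunks_fts USING fts5(text, tokenize = '${option}');`;
}
```

Since the tokenizer is fixed when the virtual table is created, changing it implies rebuilding the FTS index, which fits the per-index-DB scoping of `ftsTokenizer`.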
---
## 21. Spec-driven development workflow
Rule: No implementation merges without spec updates and executable contract tests.
Repo must include:
- `spec/cli.md` ✓
- `spec/mcp.md` ✓
- `spec/db/schema.sql` (placeholder exists)
- `spec/converters.md`
- `spec/models.md`
- `spec/evals.md` (Evalite v1 implementation spec)
- `spec/output-schemas/*.json` ✓
Contract tests: `test/spec/schemas/` (94 tests via Ajv)
Definition of done (per epic):
- specs updated and reviewed
- contract tests and golden fixtures added/updated
- unit and integration tests pass
- eval gates pass where applicable
- CLI help updated
---
## 22. Implementation plan (ordered epics and tasks)
### EPIC 0 - Repo scaffold and naming constants
- T0.1 Bun + TS ESM scaffold, lint/typecheck, test runner baseline
- T0.2 Central constants module for CLI name, URI scheme, dirs, MCP namespace
- T0.3 CI pipeline: typecheck + tests
Acceptance:
- `bun test` passes
- renaming GNO elements is a single-module change
---
### EPIC 1 - Specs and contract tests (freeze interfaces early)
- T1.1 Write `spec/cli.md` (commands, flags, exit codes, output formats)
- T1.2 Write `spec/mcp.md` (tools/resources, schemas, versioning rules)
- T1.3 Write `spec/output-schemas/*.json`
- T1.4 Add contract tests validating JSON outputs against schemas
Acceptance:
- schema contract tests exist before feature implementation expands
---
### EPIC 2 - Config, collections, contexts
- T2.1 Config schema + loader/saver (YAML), XDG defaults, overrides
- T2.2 `collection add/list/remove/rename`
- T2.3 `context add/list/check/rm` including global (`/`) and prefix contexts
- T2.4 `init` command (idempotent):
- create config + DB if missing
- optionally add a collection (same flags as `collection add`)
- support `--yes`
- T2.5 Multilingual config:
- collection-level `languageHint` parsing and validation
- index-level `ftsTokenizer` parsing and validation (per index DB)
Acceptance:
- deterministic config edits, validated by unit tests
---
### EPIC 3 - Store layer (SQLite + migrations)
- T3.1 Implement migrations runner
- T3.2 Implement core tables and queries (collections, contexts, documents, content, chunks, errors)
- T3.3 Status/health queries
Acceptance:
- integration tests migrate and run CRUD correctly
---
### EPIC 4 - Converter subsystem (JS/TS-only, no Python)
- T4.1 MIME detector + tests
- T4.2 Canonical Markdown normalizer + tests
- T4.3 Converter interfaces, registry, error mapping + tests
- T4.4 Native markdown/plaintext converters
- T4.5 markitdown-ts adapter + golden fixtures for pdf/docx/xlsx
- T4.5b officeparser adapter + golden fixtures for pptx
Acceptance:
- conversion fixtures match expected markdown exactly
- no Python dependency
---
### EPIC 5 - Indexing sync (`gno update`) and FTS
- T5.1 File walker + include/exclude logic + deterministic path normalization
- T5.2 Sync algorithm: hash, convert, upsert, soft-delete missing
- T5.3 Chunking (deterministic, tokenizer-aware when available)
- T5.4 FTS5 indexing over chunks + snippet extraction (with optional line ranges)
- T5.5 `gno status` and `gno cleanup`
- T5.6 `index` command wrapper:
- runs `update` then `embed` (unless `--no-embed`)
- supports `--collection` scoping
- supports `--models-pull` (model resolution/download; prompted unless `--yes`)
- supports `--git-pull` (best-effort git pull in git repos)
- supports `--yes`
- T5.7 Language tagging:
- persist document `language_hint`
- deterministic per-chunk language detection and storage
- surface `snippetLanguage` in CLI and MCP outputs
Acceptance:
- `gno update` on fixture corpus yields correct docs/chunks/fts
- conversion errors recorded but do not block overall update
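The planning half of the T5.2 sync algorithm (hash, convert, upsert, soft-delete missing) can be sketched as a pure function. All type and function names here are hypothetical, and the sketch assumes source hashes have already been computed by the file walker. Conversion and upsert would happen in the imperative shell for every path in `convert`.

```typescript
// Sketch only: names and shapes are illustrative assumptions.
interface ScannedFile {
  relPath: string;
  sourceHash: string;
}

interface IndexedDoc {
  relPath: string;
  sourceHash: string;
  active: boolean; // false once soft-deleted
}

interface SyncPlan {
  convert: string[]; // new, changed, or previously soft-deleted files
  unchanged: string[]; // hash matches, nothing to do
  softDelete: string[]; // indexed and active, but no longer on disk
}

function planSync(scanned: ScannedFile[], indexed: IndexedDoc[]): SyncPlan {
  const byPath = new Map(indexed.map((d) => [d.relPath, d] as const));
  const seen = new Set<string>();
  const plan: SyncPlan = { convert: [], unchanged: [], softDelete: [] };

  for (const file of scanned) {
    seen.add(file.relPath);
    const doc = byPath.get(file.relPath);
    if (doc && doc.active && doc.sourceHash === file.sourceHash) {
      plan.unchanged.push(file.relPath);
    } else {
      plan.convert.push(file.relPath);
    }
  }
  for (const doc of indexed) {
    if (doc.active && !seen.has(doc.relPath)) plan.softDelete.push(doc.relPath);
  }

  // Sort so the plan is deterministic regardless of walk order.
  plan.convert.sort();
  plan.unchanged.sort();
  plan.softDelete.sort();
  return plan;
}
```

Keeping the plan separate from its execution makes it straightforward to record a conversion error for one path without blocking the rest of the update.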
---
### EPIC 6 - LLM subsystem (node-llama-cpp) and model UX
- T6.1 LLM adapter lifecycle management
- T6.2 Model presets and config overrides
- T6.3 Model cache resolver (hf: URIs) and `gno models` commands
- T6.4 `gno doctor` checks for models, vec availability, conversion readiness
Acceptance:
- local models resolvable and cached
- doctor produces actionable diagnostics
---
### EPIC 7 - Vector index and embeddings workflow (`gno embed`)
- T7.1 sqlite-vec integration (optional deps handled cleanly)
- T7.2 Embedding backlog detection
- T7.3 Batch embed chunks and store vectors per model
- T7.4 `--force` re-embed support
Acceptance:
- vectors populated and status shows backlog decreasing to zero
---
### EPIC 8 - Search pipelines
- T8.1 `gno search` (FTS)
- T8.2 `gno vsearch` (vector)
- T8.3 `gno query` hybrid:
- skip expansion when the BM25 signal is already strong
- structured expansion with schema constraint
- parallel retrieval
- RRF fusion + top-rank bonus
- rerank + blended scoring
- `--explain`
- T8.4 `ask` command:
- wrapper over `query` with citations-first output
- optional grounded short answer when generation is configured (gated by `--answer`)
- T8.5 Multilingual awareness:
- deterministic query language detection (heuristics)
- language-aware prompt template selection for expansion and rerank
- cache keys include language tags and template versions
Acceptance:
- deterministic golden tests for `search`
- evalite gates for hybrid behavior and ranking
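The RRF fusion step in T8.3 can be sketched as follows. This is plain reciprocal rank fusion only: the top-rank bonus and rerank blending are left out because their constants are not defined in this section. `rrfFuse` is a hypothetical name, and `k = 60` is the commonly used default, assumed here rather than taken from the spec.

```typescript
// Sketch only: plain RRF without the top-rank bonus or rerank blending.
interface FusedHit {
  docid: string;
  score: number;
}

function rrfFuse(rankings: string[][], k = 60): FusedHit[] {
  const scores = new Map<string, number>();
  for (const ranking of rankings) {
    ranking.forEach((docid, index) => {
      const rank = index + 1; // ranks are 1-based
      scores.set(docid, (scores.get(docid) ?? 0) + 1 / (k + rank));
    });
  }
  return [...scores.entries()]
    .map(([docid, score]) => ({ docid, score }))
    // Tie-break on docid so the output order is deterministic.
    .sort((a, b) => b.score - a.score || a.docid.localeCompare(b.docid));
}

// Usage: fuse a BM25 ranking with a vector ranking.
const fused = rrfFuse([
  ["#a", "#b"], // BM25 order
  ["#b", "#c"], // vector order
]);
```

In this usage `#b` ranks first because it appears in both lists. Each input list is assumed to contain a docid at most once, so chunk-level hits would need deduplication before fusion.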
---
### EPIC 9 - Retrieval and output polish
- T9.1 Output formatters: cli/json/files/csv/md/xml
- T9.2 `get` and `multi-get` including limits and skipped records
- T9.3 `ls` (collections and per-prefix listing)
Acceptance:
- output schemas enforced by tests across formats where applicable
---
### EPIC 10 - MCP server (stdio)
- T10.1 MCP server skeleton
- T10.2 Implement tools: `gno_search`, `gno_vsearch`, `gno_query`, `gno_get`, `gno_multi_get`, `gno_status`
- T10.3 Implement resource reads for `gno://...`
- T10.4 MCP contract tests (golden tool calls)
Acceptance:
- MCP inspector can call tools and read resources consistently
---
### EPIC 11 - Evals and regression gates (Evalite v1)
- T11.1 Curate corpus, queries, judgments
- T11.2 Implement eval harness and metrics
- T11.3 CI gating (soft fail then hard fail after baseline)
Acceptance:
- evals run locally and in CI with documented thresholds
---
### EPIC 12 - Packaging, release, docs
- T12.1 npm packaging and install docs
- T12.1a Quickstart docs:
- `gno init` -> `gno index` -> `gno ask`
- include OS-specific paths and what gets created where
- include multilingual note: default preset is multilingual; language tagging is best effort
- T12.2 release automation, versioning, changelog
- T12.3 troubleshooting: native deps, caches, model downloads
Acceptance:
- clean install path; `gno doctor` makes first-run successful
---
## 23. MVP acceptance criteria
1. User can index:
- a Markdown folder
- a folder containing at least PDF and DOCX (and ideally PPTX/XLSX)
2. `gno search`, `gno vsearch`, `gno query` return results with:
- docid
- `gno://` URI
- source path + MIME (in structured output; always for MCP)
- snippet and optional line range
3. `gno get` returns mirror markdown and can include source metadata.
4. `gno mcp` exposes tools with stable schemas and supports `gno://` reads.
5. Test suite includes:
- converter golden fixtures
- CLI integration tests on a fixture corpus
- evalite harness for ranking quality and expansion stability
6. No Python dependency anywhere.
7. First-run success:
- a new user can run `gno init` -> `gno index` -> `gno ask` and get usable results without editing config files manually
8. Multilingual behavior:
- per-chunk language tags are stored deterministically and surfaced in CLI and MCP outputs (`snippetLanguage`)
- default model preset supports DE/FR/IT/EN, and hybrid retrieval remains usable without manual tuning
---
## 24. Future extensions (explicitly supported by architecture)
- Watch mode: `gno watch` incremental updates (non-MVP)
- Cross-device sync protocol (content-addressed manifests)
- Rich metadata extraction (authors, dates, slide titles, sheet names)
- OS integration: `gno open` / reveal (non-MVP)
- Additional native converters (pdf.js, mammoth, pptx parsing, SheetJS)
---