Secure Hybrid RAG MCP Capability Buildprint

This Capability Buildprint packages a strict workflow for adding secure, measurable Retrieval-Augmented Generation to an existing host app.

It is designed for coding agents. It is not a vector database checklist and not a generic “add embeddings” prompt.

What it adds

document ingestion with receipts
structured parsing and chunking
ACL metadata on documents and chunks
pre-retrieval authorization
dense vector retrieval
keyword/full-text retrieval
deterministic hybrid fusion
optional reranking behind a feature flag
cited evidence responses
structured generation outputs
RAG evaluation fixtures and metrics
retrieval receipts and observability
reindex/delete lifecycle controls
MCP-compatible tools or equivalent host service methods

What the host app must already have

authenticated user or service-principal identity
tenant, team, project, customer, account, document, or equivalent ownership boundaries
an authorization source for deciding which subject may read which document
durable storage for document metadata, chunks, indexes, and receipts
a file upload, import, or document source path
background job or worker capacity for ingestion and indexing
local validation commands that can prove allow and deny behavior

If the host cannot resolve identity, permissions, or ACL assignment, the applying agent must stop before implementation.

Execution profile

strict

Secure RAG touches private documents, auth boundaries, derived indexes, model context, logs, deletion behavior, and sometimes offer, pricing, or legal outputs. The applying agent must assess the host, run the assessment question gate, write a security-aware integration plan, implement in phases, verify both denied and allowed paths, and write a receipt.

Security invariant

Access control happens before retrieval.

The capability is invalid if it searches the whole corpus and filters forbidden chunks afterward. Dense vector search, keyword search, fusion, reranking, citation, generation, logs, and evaluation fixtures must all operate only on the authorized corpus.

In practical terms:

resolve subject -> compute allowed corpus -> dense search + keyword search -> fuse -> rerank -> cite -> generate

Not:

search everything -> remove forbidden chunks later
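
The ordering above can be sketched in a few lines. This is a minimal in-memory illustration, not the Buildprint's implementation: the `Chunk` type, the `allowed_subjects` field, and the function names are hypothetical, and the toy keyword match stands in for both the dense and the keyword search. In a PostgreSQL-backed host the same idea would more likely be an authorization predicate inside the search query itself.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Chunk:
    chunk_id: str
    text: str
    allowed_subjects: frozenset[str]  # hypothetical ACL metadata stored with the chunk


def allowed_corpus(subject: str, chunks: list[Chunk]) -> list[Chunk]:
    """Compute the authorized corpus before any search runs."""
    return [c for c in chunks if subject in c.allowed_subjects]


def keyword_search(query: str, corpus: list[Chunk]) -> list[Chunk]:
    """Toy keyword match; it only ever receives the authorized corpus."""
    terms = query.lower().split()
    return [c for c in corpus if any(t in c.text.lower() for t in terms)]


def retrieve(subject: str, query: str, chunks: list[Chunk]) -> list[Chunk]:
    corpus = allowed_corpus(subject, chunks)  # authorization first
    return keyword_search(query, corpus)      # search second, on the allowed subset


CHUNKS = [
    Chunk("c1", "Q3 pricing for tenant A", frozenset({"alice"})),
    Chunk("c2", "Q3 pricing for tenant B", frozenset({"bob"})),
]

allowed = retrieve("alice", "pricing", CHUNKS)   # only c1 is searchable
denied = retrieve("mallory", "pricing", CHUNKS)  # empty corpus, empty result
```

Because the search function never receives forbidden chunks, there is nothing to filter out afterward and nothing forbidden that can reach fusion, reranking, or logs.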

Preferred baseline stack

Use equivalent host systems when they already exist. Otherwise the default MVP path is:

parser: Docling
data store: PostgreSQL
vector search: pgvector
keyword search: PostgreSQL full-text search
fusion: Reciprocal Rank Fusion
reranker: optional provider or local reranker behind a feature flag
evaluation: golden set plus context precision, context recall, faithfulness, answer relevance, field accuracy, and permission-leak checks

The durable contract is the architecture and proof behavior, not a specific vendor.
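
As an illustration of the fusion step in the stack above, here is a minimal sketch of Reciprocal Rank Fusion over two ranked lists of chunk IDs. The function name, the ID-based tie-break, and the constant `k = 60` (a commonly used default) are assumptions for this example, and both input rankings are assumed to come from the already authorized corpus.

```python
from collections import defaultdict


def reciprocal_rank_fusion(rankings: list[list[str]], k: int = 60) -> list[str]:
    """Fuse ranked lists of chunk IDs; each list contributes 1 / (k + rank)."""
    scores: dict[str, float] = defaultdict(float)
    for ranking in rankings:
        for rank, chunk_id in enumerate(ranking, start=1):
            scores[chunk_id] += 1.0 / (k + rank)
    # Sort by score descending, then by ID so ties break deterministically.
    return sorted(scores, key=lambda cid: (-scores[cid], cid))


dense_ranking = ["a", "b", "c"]    # hypothetical dense vector results
keyword_ranking = ["b", "d", "a"]  # hypothetical full-text results
fused = reciprocal_rank_fusion([dense_ranking, keyword_ranking])
```

RRF uses only ranks, not raw scores, so cosine distances and full-text relevance scores never need to be put on a common scale.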

Agent flow

flowchart TD
  A[Read BUILDPRINT.md] --> B[Read capability.yaml]
  B --> C[Check compatibility.md]
  C --> D[Write .buildprint/host-assessment.md]
  D --> E[Run 00-assessment-questions.md]
  E --> F[Write security and integration plans]
  F --> G[Define contracts and config]
  G --> H[Ingest, parse, chunk, ACL, index]
  H --> I[Authorized hybrid retrieval]
  I --> J[Cited generation and MCP/host surfaces]
  J --> K[Evaluation, observability, lifecycle proof]
  K --> L[Write .buildprint/capability-receipt.md]

Discovery question gate

Capability questions happen after host assessment. The agent should inspect the repo first, then ask only what blocks safe integration.

Hard-stop questions include:

Which identity source is authoritative?
Which tenant/project/customer/document boundaries apply?
Which permission source controls document access?
Can ACL metadata be assigned at ingestion time?
Can dense and keyword retrieval share the same authorization pre-filter?
Are external parsers, embedding models, rerankers, or generation providers approved for private documents?
What must deletion remove: raw files, chunks, embeddings, keyword index rows, cached summaries, receipts?

Hard-stop answers cannot be guessed. They must be confirmed, explicitly delegated, or recorded as blockers.
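
One way to make this rule checkable is to record each hard-stop answer with an explicit status and refuse to proceed while any answer is unresolved. The sketch below is an assumption about how an agent might track this; the `AnswerStatus` values and function names are hypothetical and are not defined by the Buildprint files.

```python
from dataclasses import dataclass
from enum import Enum


class AnswerStatus(Enum):
    CONFIRMED = "confirmed"    # answered by the host owner
    DELEGATED = "delegated"    # decision explicitly handed to the agent
    BLOCKER = "blocker"        # recorded as blocking implementation
    UNANSWERED = "unanswered"  # not yet asked or not yet answered


@dataclass
class HardStopAnswer:
    question: str
    status: AnswerStatus
    answer: str = ""


# Only these statuses allow implementation to start.
RESOLVED = {AnswerStatus.CONFIRMED, AnswerStatus.DELEGATED}


def unresolved(answers: list[HardStopAnswer]) -> list[str]:
    """Return the questions that still block implementation."""
    return [a.question for a in answers if a.status not in RESOLVED]


def gate_passes(answers: list[HardStopAnswer]) -> bool:
    """The gate passes only when no hard-stop question is unresolved."""
    return not unresolved(answers)
```

Treating `UNANSWERED` the same as `BLOCKER` means a guessed or skipped answer can never let the gate pass.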

Proof levels

flowchart LR
  A[structure] --> B[fixture]
  B --> C[runtime]
  C --> D[production]

The first useful proof level is fixture: one allowed retrieval, one denied retrieval, cited output, and a small golden set. A stronger runtime proof exercises the actual host route, worker, database, and MCP/API surface.
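
A fixture-level check can be as small as a few pure functions run against a golden set. The sketch below uses simplified, set-based definitions of context precision and context recall plus a permission-leak check. The function names are hypothetical, and evaluation frameworks may define these metrics differently (for example, with rank weighting).

```python
def context_precision(retrieved: list[str], relevant: set[str]) -> float:
    """Fraction of retrieved chunk IDs that the golden set marks as relevant."""
    if not retrieved:
        return 0.0
    return sum(1 for cid in retrieved if cid in relevant) / len(retrieved)


def context_recall(retrieved: list[str], relevant: set[str]) -> float:
    """Fraction of golden-set relevant chunk IDs that were retrieved."""
    if not relevant:
        return 1.0
    return len(set(retrieved) & relevant) / len(relevant)


def permission_leaks(retrieved: list[str], authorized: set[str]) -> list[str]:
    """Chunk IDs returned to a subject that the subject is not authorized to read."""
    return [cid for cid in retrieved if cid not in authorized]
```

For a denied-retrieval fixture, `permission_leaks` must return an empty list. Any non-empty result fails the proof regardless of how good precision and recall look.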

Non-negotiables

No source edits before the host assessment, assessment questions, and capability plan are complete.
No chunks indexed without ACL metadata.
No post-retrieval filtering as the security boundary.
No separate auth logic for vector and keyword paths.
No uncited generated answers for evidence-backed tasks.
No offer, price, quantity, legal, or safety claim without cited evidence or an explicit statement of uncertainty.
No raw sensitive chunk logging by default.
No success claim without allowed and denied retrieval proof.
No deletion claim unless derived indexes and caches are removed or invalidated.
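
The deletion rule can be made testable by pairing every delete with a residue check across all derived stores. The sketch below uses hypothetical in-memory dictionaries keyed by document ID as stand-ins for the host's real storage. Which stores exist, and whether receipts are also removed, is a host decision that this example does not settle.

```python
from dataclasses import dataclass, field


@dataclass
class DerivedStores:
    """Hypothetical stand-ins for host storage, each keyed by document ID."""
    raw_files: dict[str, bytes] = field(default_factory=dict)
    chunks: dict[str, list[str]] = field(default_factory=dict)
    embeddings: dict[str, list[list[float]]] = field(default_factory=dict)
    keyword_rows: dict[str, list[str]] = field(default_factory=dict)
    cached_summaries: dict[str, str] = field(default_factory=dict)

    def all_stores(self) -> dict[str, dict]:
        return {
            "raw_files": self.raw_files,
            "chunks": self.chunks,
            "embeddings": self.embeddings,
            "keyword_rows": self.keyword_rows,
            "cached_summaries": self.cached_summaries,
        }


def delete_document(stores: DerivedStores, doc_id: str) -> dict[str, bool]:
    """Remove the document from every store and report where it was found."""
    return {
        name: store.pop(doc_id, None) is not None
        for name, store in stores.all_stores().items()
    }


def residual_stores(stores: DerivedStores, doc_id: str) -> list[str]:
    """Names of stores that still hold data for the document."""
    return [name for name, store in stores.all_stores().items() if doc_id in store]
```

A deletion claim would then be backed by `residual_stores` returning an empty list for the deleted document, checked after the delete rather than assumed.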

Where to start

Start with BUILDPRINT.md. The README is a human overview; the Buildprint files are the executable contract.