RAG Pipeline Prompts
Ten prompts for chunking documents, building retrieval indexes, running persona-aware queries, and configuring the FCC RAG pipeline. Each prompt includes a code snippet and a description of the expected output.
Table of Contents
- Chunk a Document with Sentence Strategy
- Chunk with Sliding Window
- Build a Retrieval Index
- Run a Basic Retrieval Query
- Persona-Aware Query
- Build a Full RAG Pipeline
- Compare Chunking Strategies
- Multi-Document Pipeline
- Filtered Retrieval by Metadata
- RAG Pipeline with Scoring
1. Chunk a Document with Sentence Strategy
Split a document into sentence-level chunks for fine-grained retrieval.
from fcc.rag.chunker import DocumentChunker
chunker = DocumentChunker(strategy="sentence")
text = """The FCC framework uses a Find-Create-Critique cycle.
Each phase is staffed by specialized personas.
The Find phase gathers research and inventories.
The Create phase produces blueprints and specifications.
The Critique phase validates quality and compliance."""
chunks = chunker.chunk(text, metadata={"source": "overview.md"})
for chunk in chunks:
    print(f"[{chunk.chunk_id}] {chunk.text[:60]}...")
Expected output: Five chunks, one per sentence, each with a unique chunk_id and its sentence text, with the metadata {"source": "overview.md"} propagated to every chunk.
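To sanity-check the result, you can assert the chunk count and metadata propagation directly; this sketch uses only the chunk_id, text, and metadata attributes shown above.

# Sanity check: one chunk per sentence, metadata propagated to all chunks.
assert len(chunks) == 5
assert all(c.metadata["source"] == "overview.md" for c in chunks)
assert len({c.chunk_id for c in chunks}) == len(chunks)  # chunk_ids are unique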
2. Chunk with Sliding Window
Use overlapping windows to preserve context across chunk boundaries.
from fcc.rag.chunker import DocumentChunker
chunker = DocumentChunker(strategy="sliding_window", window_size=200, overlap=50)
chunks = chunker.chunk(long_document_text, metadata={"source": "architecture.md"})
print(f"Produced {len(chunks)} overlapping chunks")
print(f"Overlap tokens: {chunker.overlap}")
Expected output: A list of chunks in which each chunk contains approximately 200 tokens, with 50 tokens of overlap between consecutive chunks. The overlap reduces the chance that information spanning a chunk boundary is missed during retrieval.
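A rough way to confirm the overlap is to check that the tail of each chunk reappears in the next one. This is a character-level spot check (the actual overlap is measured in tokens), and the 40-character window is an illustrative choice, not an FCC setting.

# Spot-check boundary overlap between consecutive chunks.
for prev, curr in zip(chunks, chunks[1:]):
    shared = prev.text[-40:] in curr.text  # rough character-level probe
    print(f"boundary text shared: {shared}")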
3. Build a Retrieval Index
Index chunked documents for semantic retrieval.
from fcc.rag.chunker import DocumentChunker
from fcc.rag.retriever import SemanticRetriever
from fcc.search.embeddings import MockEmbeddingProvider
chunker = DocumentChunker(strategy="paragraph")
chunks = chunker.chunk(document_text)
provider = MockEmbeddingProvider(dimension=384)
retriever = SemanticRetriever(embedding_provider=provider)
retriever.index(chunks)
print(f"Indexed {len(chunks)} chunks")
Expected output: A SemanticRetriever instance with all chunks indexed and ready for similarity search. The MockEmbeddingProvider generates deterministic 384-dimensional embeddings for testing.
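Determinism can be spot-checked by embedding the same string twice and comparing the vectors. The embed method name below is an assumption for illustration, not a confirmed MockEmbeddingProvider API; adjust it to the provider's actual interface.

# Hypothetical: 'embed' is an assumed method name, not a confirmed API.
v1 = provider.embed("quality gates")
v2 = provider.embed("quality gates")
print(len(v1))               # 384, matching the constructor argument
print(list(v1) == list(v2))  # True if embeddings are deterministic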
4. Run a Basic Retrieval Query
Search the index for chunks relevant to a natural language query.
results = retriever.query("How does the FCC workflow handle quality gates?", top_k=5)
for result in results:
    print(f"Score: {result.score:.3f} | {result.chunk.text[:80]}...")
Expected output: The top 5 most semantically similar chunks to the query, ranked by similarity score (highest first). Each result includes the score, the chunk text, and the chunk metadata.
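A quick invariant worth checking: results should already arrive in descending score order, so no client-side re-sorting is needed.

# Verify results are ranked highest-score first.
scores = [r.score for r in results]
print(scores == sorted(scores, reverse=True))  # True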
5. Persona-Aware Query
Augment a query with persona context to improve retrieval relevance.
from fcc.rag.pipeline import RAGPipeline
from fcc.personas.registry import PersonaRegistry
from fcc._resources import get_personas_dir
registry = PersonaRegistry.from_yaml_directory(get_personas_dir())
persona = registry.get("GCA")  # Governance Compliance Auditor; shown for inspection, since the pipeline resolves persona_id via the registry itself
pipeline = RAGPipeline(retriever=retriever, persona_registry=registry)
results = pipeline.query(
    question="What compliance checks are required before deployment?",
    persona_id="GCA",
    top_k=5,
)
for result in results:
    print(f"[{result.score:.3f}] {result.chunk.text[:80]}...")
Expected output: Retrieval results biased toward governance and compliance content because the query is augmented with GCA's R.I.S.C.E.A.R. context (role, constraints, expected output). This improves precision for domain-specific questions.
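To see what the persona context actually changes, compare against a plain retrieval baseline on the same index; chunks that appear only in the persona-aware results were surfaced by the GCA augmentation. This sketch uses only the retriever.query call and chunk_id attribute shown earlier.

# Baseline retrieval without persona augmentation, for comparison.
baseline = retriever.query("What compliance checks are required before deployment?", top_k=5)
persona_ids = {r.chunk.chunk_id for r in results}
baseline_ids = {r.chunk.chunk_id for r in baseline}
print(f"Chunks surfaced only via GCA context: {persona_ids - baseline_ids}")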
6. Build a Full RAG Pipeline
Assemble a complete pipeline from chunker through retriever to persona-aware generation.
from fcc.rag.chunker import DocumentChunker
from fcc.rag.retriever import SemanticRetriever
from fcc.rag.pipeline import RAGPipeline
from fcc.search.embeddings import MockEmbeddingProvider
from fcc.personas.registry import PersonaRegistry
from fcc._resources import get_personas_dir
# Step 1: Chunk
chunker = DocumentChunker(strategy="semantic")
chunks = chunker.chunk(document_text)
# Step 2: Index
provider = MockEmbeddingProvider(dimension=384)
retriever = SemanticRetriever(embedding_provider=provider)
retriever.index(chunks)
# Step 3: Pipeline
registry = PersonaRegistry.from_yaml_directory(get_personas_dir())
pipeline = RAGPipeline(retriever=retriever, persona_registry=registry)
# Step 4: Query
answer = pipeline.query("Explain the constitution tier model", persona_id="DGS")
Expected output: A complete pipeline that chunks documents semantically, indexes them, and answers questions using persona-aware retrieval. The DGS persona's governance expertise shapes which chunks are considered most relevant.
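If this four-step assembly recurs across scripts, it can be folded into a small helper. The sketch below simply bundles the constructors from steps 1 through 3 (reusing the imports above), with the chunking strategy and embedding dimension exposed as parameters.

def build_pipeline(document_text, strategy="semantic", dimension=384):
    """Chunk, index, and wrap a document into a persona-aware RAGPipeline."""
    chunks = DocumentChunker(strategy=strategy).chunk(document_text)
    retriever = SemanticRetriever(embedding_provider=MockEmbeddingProvider(dimension=dimension))
    retriever.index(chunks)
    registry = PersonaRegistry.from_yaml_directory(get_personas_dir())
    return RAGPipeline(retriever=retriever, persona_registry=registry)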
7. Compare Chunking Strategies
Evaluate how different chunking strategies affect retrieval quality.
from fcc.rag.chunker import DocumentChunker
strategies = ["sentence", "paragraph", "sliding_window", "semantic", "section", "page"]
for strategy in strategies:
    chunker = DocumentChunker(strategy=strategy)
    chunks = chunker.chunk(document_text)
    avg_len = sum(len(c.text) for c in chunks) / len(chunks) if chunks else 0
    print(f"{strategy:20s}: {len(chunks):4d} chunks, avg {avg_len:.0f} chars")
Expected output: A comparison table showing how each of the six chunking strategies produces a different number of chunks with different average lengths: sentence-level yields many small chunks, while page-level yields a few large ones. The right strategy depends on how granular your retrieval needs to be.
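To turn the comparison into a selection, one option is to pick the strategy whose average chunk length lands closest to a target size. The 500-character target below is illustrative, not an FCC default.

def avg_chunk_len(strategy):
    chunks = DocumentChunker(strategy=strategy).chunk(document_text)
    return sum(len(c.text) for c in chunks) / len(chunks) if chunks else 0

target_chars = 500  # illustrative target, not an FCC default
best = min(strategies, key=lambda s: abs(avg_chunk_len(s) - target_chars))
print(f"Closest to {target_chars} chars: {best}")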
8. Multi-Document Pipeline
Index documents from multiple sources with metadata for filtered retrieval.
from fcc.rag.chunker import DocumentChunker
from fcc.rag.retriever import SemanticRetriever
from fcc.search.embeddings import MockEmbeddingProvider
chunker = DocumentChunker(strategy="paragraph")
all_chunks = []
documents = [
    ("architecture.md", "The system uses a microservice architecture..."),
    ("governance.md", "Constitution tiers define enforcement levels..."),
    ("deployment.md", "Production deployments require quality gate checks..."),
]
for filename, text in documents:
    chunks = chunker.chunk(text, metadata={"source": filename})
    all_chunks.extend(chunks)
provider = MockEmbeddingProvider(dimension=384)
retriever = SemanticRetriever(embedding_provider=provider)
retriever.index(all_chunks)
print(f"Indexed {len(all_chunks)} chunks from {len(documents)} documents")
Expected output: A retriever with chunks from all three documents, each tagged with its source file. This enables source-filtered queries later.
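A per-source tally confirms every document contributed chunks before moving on to filtered queries.

# Tally indexed chunks per source document.
from collections import Counter
counts = Counter(c.metadata["source"] for c in all_chunks)
for source, count in counts.items():
    print(f"{source}: {count} chunks")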
9. Filtered Retrieval by Metadata
Restrict retrieval to chunks from specific sources or with specific tags.
results = retriever.query(
    "quality gate requirements",
    top_k=5,
    filter_metadata={"source": "governance.md"},
)
for result in results:
    print(f"[{result.chunk.metadata['source']}] {result.chunk.text[:60]}...")
Expected output: Only chunks from governance.md are returned, even if chunks from other documents would have higher similarity scores. Metadata filtering narrows the search space before ranking.
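The filter guarantee can also be asserted directly on the results, which documents the expectation for anyone reading the script.

# Every returned chunk must come from the filtered source.
assert all(r.chunk.metadata["source"] == "governance.md" for r in results)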
10. RAG Pipeline with Scoring
Evaluate the quality of retrieved results using the FCC scoring engine.
from fcc.rag.pipeline import RAGPipeline
from fcc.collaboration.scoring import ScoringEngine
pipeline = RAGPipeline(retriever=retriever, persona_registry=registry)
results = pipeline.query("How are constitutions enforced?", persona_id="GCA", top_k=5)
scorer = ScoringEngine()
for result in results:
    score = scorer.score_retrieval(
        query="How are constitutions enforced?",
        retrieved_text=result.chunk.text,
        persona_id="GCA",
    )
    print(f"Relevance: {score.relevance:.2f} | Coverage: {score.coverage:.2f} | {result.chunk.text[:50]}...")
Expected output: Each retrieved chunk scored on relevance (how well it matches the query) and coverage (how completely it addresses the persona's expected output requirements). This helps identify gaps in your document corpus.
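Aggregating the coverage scores makes the gap analysis concrete: chunks scoring below a threshold point at topics the corpus under-documents. The 0.5 cutoff below is an illustrative threshold, not an FCC default.

# Flag low-coverage chunks as candidate corpus gaps (0.5 is illustrative).
question = "How are constitutions enforced?"
low_coverage = [
    r for r in results
    if scorer.score_retrieval(query=question, retrieved_text=r.chunk.text, persona_id="GCA").coverage < 0.5
]
print(f"{len(low_coverage)} of {len(results)} chunks fall below the coverage threshold")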