# RAG Customization Guide
Advanced Tutorial
Duration: 40-50 minutes | Level: Advanced | Prerequisites: RAG Pipeline tutorial
Customize the FCC RAG pipeline with six chunking strategies, pluggable embedding providers, and persona-aware query tuning.
## Chunking Strategies

The `DocumentChunker` supports six strategies for splitting documents into retrievable chunks.
### Strategy 1: Fixed-Size Chunks

```python
from fcc.rag.chunking import DocumentChunker

chunker = DocumentChunker(strategy="fixed", chunk_size=500, overlap=50)
chunks = chunker.chunk_text("Your long document text here...")
print(f"Chunks: {len(chunks)}, avg size: {sum(len(c) for c in chunks) / len(chunks):.0f}")
```
Best for: Uniform documents, simple retrieval, baseline performance.
### Strategy 2: Sentence-Based Chunks

```python
chunker = DocumentChunker(strategy="sentence", max_sentences=5)
chunks = chunker.chunk_text(document_text)
```
Best for: Natural language documents, FAQ-style content.
### Strategy 3: Paragraph-Based Chunks
Best for: Well-structured documents with clear paragraph breaks.
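No snippet accompanies this strategy above, so here is a rough, self-contained sketch of what paragraph-based splitting does (plain Python, independent of the `fcc` package, whose parameter names may differ; `max_chars` is an assumption for illustration):

```python
import re

def paragraph_chunks(text: str, max_chars: int = 1000) -> list[str]:
    """Split on blank lines, merging short paragraphs up to max_chars."""
    paragraphs = [p.strip() for p in re.split(r"\n\s*\n", text) if p.strip()]
    chunks: list[str] = []
    for para in paragraphs:
        # Merge with the previous chunk if the combined size stays small.
        if chunks and len(chunks[-1]) + len(para) + 2 <= max_chars:
            chunks[-1] += "\n\n" + para
        else:
            chunks.append(para)
    return chunks

doc = "First paragraph.\n\nSecond paragraph.\n\n" + "filler " * 200
print([len(c) for c in paragraph_chunks(doc)])
```

Merging short neighbors keeps chunks from degenerating into one-sentence fragments while still respecting paragraph boundaries.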
### Strategy 4: Semantic Chunks

```python
chunker = DocumentChunker(strategy="semantic", similarity_threshold=0.7)
chunks = chunker.chunk_text(document_text)
```
Best for: Documents with topic shifts, maximizing coherence per chunk.
### Strategy 5: Markdown-Aware Chunks
Best for: Markdown documentation, preserving header structure.
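To make the idea concrete, here is a minimal sketch of header-preserving splitting (plain Python, not the `fcc` implementation): each chunk starts at a markdown heading and keeps that heading attached to the body below it.

```python
import re

def markdown_chunks(text: str) -> list[str]:
    """Split a markdown document at headings, keeping each heading
    attached to the section body that follows it."""
    chunks: list[str] = []
    current: list[str] = []
    for line in text.splitlines():
        if re.match(r"^#{1,6}\s", line) and current:
            chunks.append("\n".join(current).strip())
            current = []
        current.append(line)
    if current:
        chunks.append("\n".join(current).strip())
    return chunks

doc = "# Intro\nSome text.\n\n## Details\nMore text."
print(markdown_chunks(doc))
```

Keeping the heading inside the chunk means retrieved text carries its own context ("## Details" tells the LLM what the chunk is about).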
### Strategy 6: Code-Aware Chunks
Best for: Source code, preserving function/class boundaries.
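A rough illustration of boundary-preserving code splitting (a plain-Python sketch, not the `fcc` implementation, and simplified to top-level Python definitions only):

```python
def code_chunks(source: str) -> list[str]:
    """Split Python source at top-level `def`/`class` lines so each
    chunk holds a complete function or class body."""
    chunks: list[str] = []
    current: list[str] = []
    for line in source.splitlines():
        if (line.startswith("def ") or line.startswith("class ")) and current:
            chunks.append("\n".join(current).rstrip())
            current = []
        current.append(line)
    if current:
        chunks.append("\n".join(current).rstrip())
    return chunks

src = "def a():\n    return 1\n\nclass B:\n    pass"
print(code_chunks(src))
```

Splitting mid-function would leave chunks that neither parse nor make sense in isolation, which is why code-aware chunking cuts only at definition boundaries.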
## Embedding Providers
### Mock Provider (Default)

Deterministic hash-based embeddings for testing:

```python
from fcc.search.embeddings import MockEmbeddingProvider

provider = MockEmbeddingProvider()  # 384-dim vectors
embedding = provider.embed("test query")
print(f"Dimensions: {len(embedding)}")
```
### Sentence Transformers

Production-quality embeddings:

```python
from fcc.search.embeddings import SentenceTransformerProvider

provider = SentenceTransformerProvider(model_name="all-MiniLM-L6-v2")
embedding = provider.embed("test query")
print(f"Dimensions: {len(embedding)}")
```
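Whichever provider you choose, retrieval ranks chunks by cosine similarity between the query embedding and each chunk embedding. A minimal sketch of that scoring step (plain Python with toy 3-dim vectors standing in for real `provider.embed(...)` output):

```python
import math

def cosine(a: list[float], b: list[float]) -> float:
    """Cosine similarity: dot product over the product of vector norms."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0

# Toy embeddings; a real provider returns 384-dim vectors.
query = [1.0, 0.0, 1.0]
chunks = {"chunk_a": [0.9, 0.1, 0.8], "chunk_b": [0.0, 1.0, 0.0]}
ranked = sorted(chunks, key=lambda c: cosine(query, chunks[c]), reverse=True)
print(ranked)  # chunk_a ranks first: it points in nearly the same direction
```

Because cosine similarity ignores vector magnitude, it compares direction only, which is why embeddings from the same model are directly comparable.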
## Persona-Aware Queries
The RAG pipeline supports persona context for improved retrieval.
```python
from fcc.rag.pipeline import RAGPipeline

pipeline = RAGPipeline()

# Query with persona context
results = pipeline.query(
    question="How do I validate a blueprint?",
    persona_id="BC",  # Blueprint Crafter context
    k=5,
)
for result in results:
    print(f"  Score: {result.score:.3f} — {result.text[:80]}")
```
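One plausible way a persona can bias similarity scoring is a small per-keyword bonus. This is a sketch only, not the `fcc` implementation: the keyword table, `persona_boost` helper, and `boost` factor are all assumptions for illustration.

```python
# Hypothetical persona keyword table; the real pipeline's mechanism may differ.
PERSONA_KEYWORDS = {"BC": {"blueprint", "validate", "schema"}}

def persona_boost(score: float, chunk_text: str, persona_id: str,
                  boost: float = 0.1) -> float:
    """Add a small bonus to a similarity score for each persona-relevant
    keyword found in the chunk."""
    keywords = PERSONA_KEYWORDS.get(persona_id, set())
    hits = sum(1 for kw in keywords if kw in chunk_text.lower())
    return score + boost * hits

# Three keyword hits: 0.80 + 3 * 0.10
print(persona_boost(0.80, "How to validate a blueprint schema", "BC"))
```

The effect is a re-ranking: domain-relevant chunks rise above generically similar ones without changing the underlying embeddings.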
## Tuning Tips
- **Chunk size** — Smaller chunks (200-500 chars) for precise retrieval, larger (500-1000 chars) for more context
- **Overlap** — 10-20% overlap prevents losing context at chunk boundaries
- **Top-K** — Start with k=5, increase if recall is low
- **Persona context** — Adds domain-specific bias to similarity scoring
- **Embedding model** — MiniLM-L6 for speed, larger models for accuracy
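The chunk-size and overlap tips can be made concrete: with 10-20% overlap, text near a boundary appears in both neighboring chunks, so a query matching that span can retrieve either one. A quick illustration (plain Python, mirroring the fixed strategy; not the `fcc` implementation):

```python
def fixed_chunks(text: str, size: int, overlap: int) -> list[str]:
    """Fixed-size character windows stepping by (size - overlap)."""
    step = size - overlap
    return [text[i:i + size] for i in range(0, len(text), step)]

text = "abcdefghij" * 10                          # 100 characters
chunks = fixed_chunks(text, size=50, overlap=5)   # 10% overlap
print(len(chunks), [len(c) for c in chunks])
# Adjacent chunks share 5 characters at each boundary.
```

Larger overlap improves boundary recall at the cost of index size and some duplicate hits in the top-k results.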