RAG Customization Guide

Advanced Tutorial

Duration: 40-50 minutes | Level: Advanced | Prerequisites: RAG Pipeline tutorial

Customize the FCC RAG pipeline with 6 chunking strategies, embedding providers, and persona-aware query tuning.

Chunking Strategies

The DocumentChunker supports 6 strategies for splitting documents into retrievable chunks.

Strategy 1: Fixed-Size Chunks

from fcc.rag.chunking import DocumentChunker

chunker = DocumentChunker(strategy="fixed", chunk_size=500, overlap=50)
chunks = chunker.chunk_text("Your long document text here...")
print(f"Chunks: {len(chunks)}, avg size: {sum(len(c) for c in chunks) / len(chunks):.0f}")

Best for: Uniform documents, simple retrieval, baseline performance.
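To make the sliding-window mechanics concrete, here is a minimal standalone sketch in plain Python (an illustration of the technique, not the FCC implementation): each window advances by chunk_size - overlap, so consecutive chunks share their boundary characters.

```python
def fixed_chunks(text: str, chunk_size: int = 500, overlap: int = 50) -> list[str]:
    # each window starts `overlap` characters before the previous one ended
    step = chunk_size - overlap
    return [text[i:i + chunk_size] for i in range(0, len(text), step)]

text = "".join(str(i % 10) for i in range(1200))
chunks = fixed_chunks(text)
# window starts: 0, 450, 900 -> chunk lengths 500, 500, 300
```

Because of the overlap, the last 50 characters of each chunk reappear at the start of the next one, so a sentence that straddles a boundary is still retrievable as a whole from at least one chunk.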

Strategy 2: Sentence-Based Chunks

chunker = DocumentChunker(strategy="sentence", max_sentences=5)
chunks = chunker.chunk_text(document_text)

Best for: Natural language documents, FAQ-style content.

Strategy 3: Paragraph-Based Chunks

chunker = DocumentChunker(strategy="paragraph")
chunks = chunker.chunk_text(document_text)

Best for: Well-structured documents with clear paragraph breaks.

Strategy 4: Semantic Chunks

chunker = DocumentChunker(strategy="semantic", similarity_threshold=0.7)
chunks = chunker.chunk_text(document_text)

Best for: Documents with topic shifts, maximizing coherence per chunk.
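Semantic chunking groups adjacent sentences until similarity drops below the threshold, which signals a topic shift. The following standalone sketch illustrates the idea with bag-of-words cosine similarity as a toy stand-in for real embeddings; it is not the FCC implementation, and it compares each sentence only to its immediate predecessor.

```python
import math
from collections import Counter

def cosine(a: Counter, b: Counter) -> float:
    dot = sum(a[w] * b[w] for w in a)
    na = math.sqrt(sum(v * v for v in a.values()))
    nb = math.sqrt(sum(v * v for v in b.values()))
    return dot / (na * nb) if na and nb else 0.0

def semantic_chunks(sentences: list[str], threshold: float = 0.3) -> list[str]:
    chunks, current = [], [sentences[0]]
    prev = Counter(sentences[0].lower().split())
    for s in sentences[1:]:
        vec = Counter(s.lower().split())
        if cosine(prev, vec) >= threshold:
            current.append(s)          # similar to the last sentence: same topic
        else:
            chunks.append(" ".join(current))  # similarity dropped: start a new chunk
            current = [s]
        prev = vec
    chunks.append(" ".join(current))
    return chunks

sents = [
    "Cats are small mammals.",
    "Cats are popular pets.",
    "Rust guarantees memory safety at compile time.",
    "The Rust compiler enforces memory safety rules.",
]
result = semantic_chunks(sents, threshold=0.2)
# -> two chunks: the cat sentences together, the Rust sentences together
```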

Strategy 5: Markdown-Aware Chunks

chunker = DocumentChunker(strategy="markdown")
chunks = chunker.chunk_text(markdown_text)

Best for: Markdown documentation, preserving header structure.

Strategy 6: Code-Aware Chunks

chunker = DocumentChunker(strategy="code")
chunks = chunker.chunk_text(source_code)

Best for: Source code, preserving function/class boundaries.
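One common way to implement code-aware splitting, shown here as a standalone sketch rather than FCC's actual logic, is to break Python source before each top-level def or class so every chunk holds a complete definition:

```python
import re

def code_chunks(source: str) -> list[str]:
    # split at zero-width positions just before a top-level def/class line
    parts = re.split(r"(?m)^(?=(?:def|class)\s)", source)
    return [p for p in parts if p.strip()]

src = "import os\n\ndef a():\n    return 1\n\nclass B:\n    pass\n"
chunks = code_chunks(src)
# -> ["import os\n\n", "def a():\n    return 1\n\n", "class B:\n    pass\n"]
```

Indented definitions (methods inside a class) are untouched because the lookahead only matches at the start of an unindented line.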


Embedding Providers

Mock Provider (Default)

Deterministic hash-based embeddings for testing:

from fcc.search.embeddings import MockEmbeddingProvider

provider = MockEmbeddingProvider()  # 384-dim vectors
embedding = provider.embed("test query")
print(f"Dimensions: {len(embedding)}")
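The general idea behind a deterministic mock provider can be sketched in a few lines (this illustrates the hash-seeding technique, not the actual MockEmbeddingProvider source): hash the text, seed a PRNG with the hash, and draw a fixed-length vector, so identical inputs always yield identical embeddings.

```python
import hashlib
import random

def mock_embed(text: str, dim: int = 384) -> list[float]:
    # derive a stable seed from the text, then draw a reproducible vector
    seed = int.from_bytes(hashlib.sha256(text.encode()).digest()[:8], "big")
    rng = random.Random(seed)
    return [rng.uniform(-1.0, 1.0) for _ in range(dim)]

v1 = mock_embed("test query")
v2 = mock_embed("test query")
assert v1 == v2          # deterministic: same text, same vector
assert len(v1) == 384
```

Determinism is what makes such a provider useful in tests: similarity scores are stable across runs without downloading a model.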

Sentence Transformers

Production-quality embeddings:

from fcc.search.embeddings import SentenceTransformerProvider

provider = SentenceTransformerProvider(model_name="all-MiniLM-L6-v2")
embedding = provider.embed("test query")
print(f"Dimensions: {len(embedding)}")

Persona-Aware Queries

The RAG pipeline supports persona context for improved retrieval.

from fcc.rag.pipeline import RAGPipeline

pipeline = RAGPipeline()

# Query with persona context
results = pipeline.query(
    question="How do I validate a blueprint?",
    persona_id="BC",  # Blueprint Crafter context
    k=5,
)

for result in results:
    print(f"  Score: {result.score:.3f} | {result.text[:80]}")
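One plausible way persona context can bias retrieval, sketched here as a hypothetical standalone example rather than FCC's actual scoring, is to add a small bonus to chunks tagged with the active persona before ranking:

```python
def rerank(scored_chunks, persona_id: str, bonus: float = 0.1):
    """scored_chunks: list of (similarity, persona_tags, text) tuples."""
    # boost chunks tagged with the active persona, then sort by score
    biased = [
        (score + (bonus if persona_id in tags else 0.0), text)
        for score, tags, text in scored_chunks
    ]
    return sorted(biased, reverse=True)

chunks = [
    (0.80, {"QA"}, "Run the test suite before merging."),
    (0.75, {"BC"}, "Validate the blueprint schema first."),
]
top = rerank(chunks, persona_id="BC")
# the BC-tagged chunk (0.75 + 0.1 = 0.85) now outranks the 0.80 chunk
```

The tag names, tuple layout, and bonus value here are illustrative assumptions; the point is only that persona context reorders otherwise similarity-ranked results.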

Tuning Tips

  1. Chunk size — smaller chunks (200-500 chars) favor precise retrieval; larger chunks (500-1000 chars) carry more context
  2. Overlap — 10-20% of chunk_size prevents losing context at chunk boundaries
  3. Top-K — start with k=5 and increase it if recall is low
  4. Persona context — adds a domain-specific bias to similarity scoring
  5. Embedding model — all-MiniLM-L6-v2 for speed, larger models for accuracy
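The numeric guidelines in tips 1-3 can be wrapped in a small sanity check (plain Python; the thresholds are the guideline values above, not FCC requirements):

```python
def check_tuning(chunk_size: int, overlap: int, k: int) -> list[str]:
    # flag parameter choices that fall outside the tuning guidelines
    warnings = []
    if not 200 <= chunk_size <= 1000:
        warnings.append("chunk_size outside the 200-1000 char guideline")
    if not 0.10 <= overlap / chunk_size <= 0.20:
        warnings.append("overlap outside the 10-20% guideline")
    if k < 5:
        warnings.append("k below the suggested starting point of 5")
    return warnings

assert check_tuning(chunk_size=500, overlap=50, k=5) == []   # 10% overlap: OK
assert check_tuning(chunk_size=500, overlap=10, k=3) != []   # 2% overlap, low k
```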