Distiller Bridge Demo

This demo walks through the FCC Distiller Bridge -- an adapter layer that connects the FCC persona framework to the Fornax ecosystem's NanoCube analytical engine and data fabrication services.


Table of Contents

  1. Introduction
  2. Prerequisites
  3. NanoCube Query Patterns
  4. Data Fabrication Workflows
  5. Cache Behavior
  6. Persona-Driven Analysis
  7. Exchange Contract Templates
  8. Fornax Connection

Introduction

The Distiller Bridge (src/fcc/protocols/distiller_bridge.py) provides a clean interface between FCC's persona-driven analysis and the Fornax ecosystem's data services. It has three components:

Component                   Purpose
NanoCubeQueryAdapter        Hierarchical multi-dimensional data queries
FabricationBridgeAdapter    Synthetic data table generation
DistillerBridgeAdapter      High-level facade combining both adapters

The bridge operates in two modes:

  • Live mode -- When distiller_ext is installed, queries are forwarded to the live NanoCube engine.
  • Mock mode -- When distiller_ext is not available, deterministic mock results are returned. This is the default for development and testing.

Prerequisites

  • Python 3.10+ with FCC installed: pip install -e ".[dev]"
  • No distiller_ext package needed -- mock mode works out of the box

Check availability:

from fcc.protocols.distiller_bridge import distiller_available

print(f"Distiller available: {distiller_available()}")
# Distiller available: False  (mock mode)
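
An availability check of this kind is commonly built on the standard library's importlib.util.find_spec, which reports whether a package can be imported without actually importing it. The sketch below illustrates that general technique only. The helper name optional_package_available is hypothetical, and this is not necessarily how distiller_available() is implemented in FCC.

```python
import importlib.util


def optional_package_available(name: str) -> bool:
    """Return True if a top-level package is importable, without importing it."""
    return importlib.util.find_spec(name) is not None


# "distiller_ext" is the optional package named in this demo.
mode = "live" if optional_package_available("distiller_ext") else "mock"
print(f"Bridge mode: {mode}")
```

On a machine without distiller_ext installed, this selects mock mode, which matches the default described above.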

NanoCube Query Patterns

The NanoCubeQuery dataclass defines a query against hierarchical multi-dimensional data.

Basic Query: Count by Dimension

The simplest query counts records along a single dimension:

from fcc.protocols.distiller_bridge import (
    NanoCubeQuery,
    NanoCubeQueryAdapter,
)

adapter = NanoCubeQueryAdapter()

query = NanoCubeQuery(
    dimensions=("category",),
    aggregation="count",
)
result = adapter.query(query)

print(f"Dimensions used: {result.dimensions_used}")
print(f"Total count: {result.total_count}")
print(f"Execution time: {result.execution_time_ms:.1f}ms")
for row in result.data:
    print(row)

In mock mode, this returns:

Dimensions used: ('category',)
Total count: 1
Execution time: 0.1ms
{'category': 'mock_category_value', 'count': 42}

Filtered Query

Add a filters dictionary to restrict the data before aggregation:

query = NanoCubeQuery(
    dimensions=("category",),
    filters={"phase": "find"},
    aggregation="count",
)
result = adapter.query(query)
print(f"Rows: {len(result.data)}")

Multi-Dimension Query

Query across multiple dimensions for cross-tabulation:

query = NanoCubeQuery(
    dimensions=("category", "phase"),
    aggregation="count",
    limit=50,
)
result = adapter.query(query)
print(f"Dimensions: {result.dimensions_used}")
# ('category', 'phase')

for row in result.data:
    print(f"  {row}")
# {'category': 'mock_category_value', 'phase': 'mock_phase_value', 'count': 42}

Aggregation Functions

The aggregation field supports: count, sum, avg, min, max.

query = NanoCubeQuery(
    dimensions=("category",),
    aggregation="sum",
    limit=20,
)
result = adapter.query(query)
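
To make the semantics of dimensions, filters, aggregation, and limit concrete, here is a minimal pure-Python sketch of a group-by over a small in-memory list of rows. It does not use the FCC API. The aggregate function, the measure parameter, and the sample rows are all hypothetical, and the live NanoCube engine may order or compute results differently.

```python
from collections import defaultdict
from statistics import mean

# Hypothetical reducers mirroring the five aggregation names listed above.
AGGREGATIONS = {
    "count": len,
    "sum": sum,
    "avg": mean,
    "min": min,
    "max": max,
}


def aggregate(rows, dimensions, aggregation, measure="value", filters=None, limit=None):
    """Group rows by the given dimensions and reduce one measure per group."""
    filters = filters or {}
    groups = defaultdict(list)
    for row in rows:
        # Equality filters are applied before aggregation.
        if any(row.get(key) != wanted for key, wanted in filters.items()):
            continue
        group_key = tuple(row[d] for d in dimensions)
        groups[group_key].append(row[measure])
    reducer = AGGREGATIONS[aggregation]
    result = [
        {**dict(zip(dimensions, group_key)), aggregation: reducer(values)}
        for group_key, values in sorted(groups.items())
    ]
    # The limit truncates the grouped output, not the input rows.
    return result[:limit] if limit is not None else result


# Made-up sample data for illustration only.
rows = [
    {"category": "a", "phase": "find", "value": 2.0},
    {"category": "a", "phase": "create", "value": 4.0},
    {"category": "b", "phase": "find", "value": 6.0},
]

print(aggregate(rows, ("category",), "count"))
print(aggregate(rows, ("category",), "sum"))
print(aggregate(rows, ("category", "phase"), "avg", filters={"phase": "find"}))
```

Each output row carries one key per requested dimension plus one key named after the aggregation, which is the same row shape the mock results above use.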

Result Structure

Every query returns a NanoCubeResult:

from fcc.protocols.distiller_bridge import NanoCubeResult

# NanoCubeResult fields:
#   query: NanoCubeQuery                -- the original query
#   data: tuple[dict[str, Any], ...]    -- result rows
#   total_count: int                    -- total matching rows (before limit)
#   dimensions_used: tuple[str, ...]    -- dimensions actually used
#   execution_time_ms: float            -- query time in milliseconds

Serialization

Both queries and results support round-trip serialization:

query_dict = query.to_dict()
restored_query = NanoCubeQuery.from_dict(query_dict)

result_dict = result.to_dict()
restored_result = NanoCubeResult.from_dict(result_dict)
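
One detail that round-trip serialization has to handle is that tuples become lists once a dictionary passes through JSON. The sketch below shows one way a to_dict/from_dict pair can restore them. QuerySketch is a hypothetical stand-in, not the FCC NanoCubeQuery class, and its default values (including limit=100) are assumptions.

```python
import json
from dataclasses import asdict, dataclass, field
from typing import Any


@dataclass(frozen=True)
class QuerySketch:
    """Hypothetical stand-in for NanoCubeQuery; not the FCC class."""

    dimensions: tuple[str, ...]
    filters: dict[str, Any] = field(default_factory=dict)
    aggregation: str = "count"
    limit: int = 100  # assumed default

    def to_dict(self) -> dict[str, Any]:
        data = asdict(self)
        # Emit a list so the dictionary is plain JSON-compatible data.
        data["dimensions"] = list(self.dimensions)
        return data

    @classmethod
    def from_dict(cls, data: dict[str, Any]) -> "QuerySketch":
        return cls(
            # Convert the list back to a tuple to restore the original type.
            dimensions=tuple(data["dimensions"]),
            filters=dict(data.get("filters", {})),
            aggregation=data.get("aggregation", "count"),
            limit=data.get("limit", 100),
        )


original = QuerySketch(dimensions=("category", "phase"), filters={"phase": "find"})
wire = json.dumps(original.to_dict())
restored = QuerySketch.from_dict(json.loads(wire))
print(restored == original)
```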

Data Fabrication Workflows

The FabricationBridgeAdapter generates synthetic data tables.

FabricationRequest Lifecycle

  1. Create request -- Define the table schema and constraints.
  2. Submit request -- The adapter validates and executes (or mocks).
  3. Receive result -- Inspect status, columns, and preview data.

from fcc.protocols.distiller_bridge import (
    FabricationRequest,
    FabricationBridgeAdapter,
)

adapter = FabricationBridgeAdapter()

# Step 1: Create request
request = FabricationRequest(
    request_id="fab-001",
    table_name="persona_dimensions",
    schema={
        "persona_id": "string",
        "dimension": "string",
        "score": "float",
    },
    row_count=200,
    constraints={"score_min": 0, "score_max": 100},
    exchange_contract_version="2.1.0",
)

# Step 2: Submit
result = adapter.request_tables(request)

# Step 3: Inspect result
print(f"Status: {result.status}")
print(f"Columns: {result.columns}")
print(f"Rows: {result.row_count}")
print(f"Preview: {result.preview_data}")

In mock mode:

Status: completed
Columns: ('persona_id', 'dimension', 'score')
Rows: 200
Preview: ({'persona_id': 'mock_persona_id', 'dimension': 'mock_dimension', 'score': 'mock_score'},)
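
The preview shown above follows a simple pattern: one row, with each column mapped to the string "mock_" plus the column name. The sketch below reproduces that pattern as inferred from the output. The function name mock_preview is hypothetical and the actual FCC implementation may differ.

```python
def mock_preview(schema: dict[str, str]) -> tuple[dict[str, str], ...]:
    """Build a single placeholder row: each column maps to 'mock_<column>'."""
    return ({column: f"mock_{column}" for column in schema},)


schema = {"persona_id": "string", "dimension": "string", "score": "float"}
print(mock_preview(schema))
```

Note that the placeholder for score is a string even though the schema declares it as float, so code that consumes preview data should not rely on declared column types in mock mode.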

Table Queries

After fabrication, query the table with SQL-like expressions:

result = adapter.request_queries(
    table_name="persona_dimensions",
    query="SELECT persona_id, AVG(score) FROM persona_dimensions GROUP BY persona_id",
)
print(f"Status: {result['status']}")
# In mock mode: 'mock'

Tracking Requests

The adapter maintains a log of all requests and results:

print(f"Submitted requests: {len(adapter.list_requests())}")
print(f"Completed results: {len(adapter.list_results())}")

# Look up a specific result
status = adapter.get_status("fab-001")
print(f"Request fab-001 status: {status.status if status else 'not found'}")

Result Status Values

Status       Meaning
completed    Fabrication succeeded, data available
pending      Fabrication in progress (live mode only)
failed       Fabrication failed, check error field

Cache Behavior

The NanoCubeQueryAdapter caches results to avoid redundant queries.

Cache Key Computation

Cache keys are computed from the query dimensions, aggregation, limit, and sorted filters:

# Internal key format:
# "dim1,dim2|aggregation|limit|[(filter_key, filter_value), ...]"

adapter = NanoCubeQueryAdapter()

# First query -- cache miss, executes (mock) query
result1 = adapter.query(NanoCubeQuery(dimensions=("category",)))
print(f"Cache size: {adapter.cache_size()}")  # 1

# Second identical query -- cache hit, no execution
result2 = adapter.query(NanoCubeQuery(dimensions=("category",)))
print(f"Cache size: {adapter.cache_size()}")  # 1 (same key)
print(result1 is result2)                      # True

# Different query -- cache miss
result3 = adapter.query(NanoCubeQuery(dimensions=("phase",)))
print(f"Cache size: {adapter.cache_size()}")  # 2
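
The key format described above can be sketched in a few lines. Sorting the filter items makes the key independent of the order in which filters were written, while the order of the dimensions still matters in this format. The names cache_key and CachingAdapterSketch are hypothetical, the default limit of 100 is an assumption, and the real adapter may build its keys differently.

```python
def cache_key(dimensions, aggregation="count", limit=100, filters=None) -> str:
    """Build a key shaped like 'dim1,dim2|aggregation|limit|[(key, value), ...]'."""
    sorted_filters = sorted((filters or {}).items())
    return f"{','.join(dimensions)}|{aggregation}|{limit}|{sorted_filters}"


class CachingAdapterSketch:
    """Hypothetical adapter that memoizes results by cache key."""

    def __init__(self, execute):
        self._execute = execute  # callable that runs the real (or mock) query
        self._cache = {}

    def query(self, dimensions, aggregation="count", limit=100, filters=None):
        key = cache_key(dimensions, aggregation, limit, filters)
        if key not in self._cache:
            # Cache miss: execute once and remember the result object.
            self._cache[key] = self._execute(dimensions, aggregation, limit, filters)
        return self._cache[key]

    def cache_size(self) -> int:
        return len(self._cache)


calls = []


def fake_execute(dimensions, aggregation, limit, filters):
    calls.append(dimensions)
    return ({"count": 42},)


sketch = CachingAdapterSketch(fake_execute)
first = sketch.query(("category",), filters={"phase": "find", "gate_type": "governance"})
second = sketch.query(("category",), filters={"gate_type": "governance", "phase": "find"})
print(first is second, sketch.cache_size(), len(calls))
print(cache_key(("category", "phase"), filters={"phase": "find"}))
```

Because a cache hit returns the stored object itself, mutating a cached result would affect every later caller. That is one reason to keep result types immutable.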

Clearing the Cache

adapter.clear_cache()
print(f"Cache size after clear: {adapter.cache_size()}")  # 0

Persona-Driven Analysis

The Distiller Bridge is designed to be driven by FCC personas. For example, the NCA (Numeric Compliance Analyst) persona issues NanoCube queries for compliance analysis.

NCA Persona to NanoCubeQuery Mapping

from fcc.protocols.distiller_bridge import (
    DistillerBridgeAdapter,
    NanoCubeQuery,
)

bridge = DistillerBridgeAdapter()

# NCA persona query: compliance scores by category
compliance_query = NanoCubeQuery(
    dimensions=("category", "compliance_level"),
    filters={"gate_type": "governance"},
    aggregation="count",
)
result = bridge.query_nanocube(compliance_query)

print(f"Bridge status: {bridge.status()}")
# {
#     'distiller_available': False,
#     'nanocube_cache_size': 1,
#     'fabrication_requests': 0,
#     'fabrication_results': 0,
#     'exchange_contract_version': '2.1.0',
# }

High-Level Facade

The DistillerBridgeAdapter provides a unified interface:

bridge = DistillerBridgeAdapter()

# Check availability
print(f"Live Distiller: {bridge.is_available}")

# Access sub-adapters
nanocube = bridge.nanocube      # NanoCubeQueryAdapter
fabrication = bridge.fabrication  # FabricationBridgeAdapter

# Convenience methods (query and request are the objects built in the earlier sections)
result = bridge.query_nanocube(query)
fab_result = bridge.request_fabrication(request)

# Status summary
print(bridge.status())

Exchange Contract Templates

FCC includes exchange contract templates that define the data contracts between services in the Fornax ecosystem.

Loading Templates

import yaml
from fcc._resources import get_governance_dir

gov_dir = get_governance_dir()
template_path = gov_dir / "exchange_contract_template.yaml"

with open(template_path) as f:
    contract = yaml.safe_load(f)

print(f"Contract version: {contract.get('version', 'unknown')}")

Contract Structure

Exchange contracts define:

  • Schema -- Column names, types, and constraints
  • SLAs -- Latency, throughput, and availability guarantees
  • Versioning -- Contract version (currently 2.1.0)
  • Validation rules -- Data quality checks at ingestion

These contracts are referenced by the FabricationRequest.exchange_contract_version field.
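
A loaded contract can be sanity-checked before it is used. The sketch below works on a plain dictionary, such as the one returned by yaml.safe_load, and reports missing sections and version mismatches. The section key names in REQUIRED_SECTIONS and the check_contract helper are assumptions for illustration. The actual template may use different keys.

```python
# Assumed key names for the four areas listed above; the real template may differ.
REQUIRED_SECTIONS = ("schema", "slas", "version", "validation_rules")


def check_contract(contract: dict, expected_version: str) -> list[str]:
    """Return a list of human-readable problems; an empty list means no issues found."""
    problems = [
        f"missing section: {name}"
        for name in REQUIRED_SECTIONS
        if name not in contract
    ]
    found = contract.get("version")
    if found is not None and found != expected_version:
        problems.append(
            f"version mismatch: contract {found}, request {expected_version}"
        )
    return problems


# Made-up contract content for illustration only.
contract = {
    "version": "2.1.0",
    "schema": {"persona_id": "string"},
    "slas": {},
}
print(check_contract(contract, expected_version="2.1.0"))
print(check_contract(contract, expected_version="3.0.0"))
```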


Fornax Connection

The Distiller Bridge is one part of the larger Fornax ecosystem integration. When distiller_ext is installed:

  1. NanoCube queries are forwarded to distiller_ext.nanocube_query().
  2. Fabrication requests are forwarded to distiller_ext.fabricate_table().
  3. Table queries are forwarded to distiller_ext.query_table().

Mock mode mirrors the same API surface, so code written against mock mode works identically with the live engine. Application code needs no conditional logic because the adapter selects the mode internally.
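
The internal mode selection can be pictured as an optional import with a fallback. In the sketch below, load_backend and QueryAdapterSketch are hypothetical names, and the call signature used for nanocube_query is an assumption, since this demo names the function but does not show its parameters. The mock row mirrors the mock output shown earlier.

```python
import importlib
import types


def load_backend(module_name: str = "distiller_ext"):
    """Return the imported module, or None when it is not installed."""
    try:
        return importlib.import_module(module_name)
    except ImportError:
        return None


class QueryAdapterSketch:
    """Hypothetical adapter: same call for callers, backend chosen internally."""

    def __init__(self, backend=None):
        self._backend = backend

    def query(self, dimensions):
        if self._backend is not None:
            # Live path: forward to the backend (signature assumed).
            return self._backend.nanocube_query(dimensions)
        # Mock path: deterministic placeholder row.
        row = {d: f"mock_{d}_value" for d in dimensions}
        row["count"] = 42
        return (row,)


# Mock mode: no backend supplied.
mock_adapter = QueryAdapterSketch(backend=None)
print(mock_adapter.query(("category",)))

# A stand-in object plays the role of the live module for demonstration.
fake_backend = types.SimpleNamespace(nanocube_query=lambda dims: ("live", dims))
live_adapter = QueryAdapterSketch(backend=fake_backend)
print(live_adapter.query(("category",)))
```

In application code the adapter would be constructed with load_backend(), so the calling code stays the same whether or not the optional package is present.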

Integration Architecture

FCC Personas
    |
    v
DistillerBridgeAdapter (facade)
    |
    +-- NanoCubeQueryAdapter ------> distiller_ext.nanocube_query()
    |
    +-- FabricationBridgeAdapter --> distiller_ext.fabricate_table()
                                     distiller_ext.query_table()

See Also

  • Visualization Cookbook -- using NanoCube data in D3 visualizations
  • Open Science Demo -- compliance analysis using bridge data
  • Messaging Realtime Demo -- event-driven bridge integration
  • src/fcc/protocols/distiller_bridge.py -- full source
  • src/fcc/data/governance/exchange_contract_template.yaml -- contract template