Distiller Bridge -- v1.3.5.2 Addendum¶
This addendum extends the Distiller Bridge Demo,
the Phase 14 Addendum, and the
Phase 15 Addendum with the
v1.3.5.2 NanoCube vocabulary-evolution scenario. The scenario
demonstrates what happens when a sister project -- distiller_ex
(codename Fornax) -- ships a schema update and the FCC side has to
detect, reconcile, and audit the change.
Scenario Overview¶
The scenario has four phases:
| Phase | Activity | Artefact Changed |
|---|---|---|
| 1. Detect | FCC loads the plugin and compares class map against YAML | In-memory diff report |
| 2. Reconcile | Operator updates the packaged YAML mapping | src/fcc/data/objectmodel/distiller_vocabulary_mappings.yaml |
| 3. Emit | Bus publishes vocabulary.mismatch events | EventBus log, ComplianceSubscriber audit record |
| 4. Verify | Re-run loader; coverage returns to 100% | Diff report empty |
The Vocabulary Contract¶
Sister projects do not ship runtime imports into FCC. Instead they
expose a VocabularyProviderPlugin that declares the entity classes
they own and the string IDs FCC should map to them. The plugin is the
single point of truth at runtime; the packaged YAML under
src/fcc/data/objectmodel/ is the persistent, reviewable contract.
```python
# Abbreviated plugin contract -- see src/fcc/plugins/base.py
from abc import ABC, abstractmethod


class VocabularyProviderPlugin(ABC):
    @abstractmethod
    def get_class_map(self) -> dict[str, type]:
        """Return {local_id: class} for every entity this project owns."""
```
Fornax registers a concrete implementation -- DistillerVocabProvider
-- that returns roughly 175 entries covering NanoCube, Slice,
Dimension, Measure, and a number of derived aggregation classes.
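As an illustration of the contract, a concrete provider could look like the sketch below. The real DistillerVocabProvider is not shown in this addendum, so `ExampleVocabProvider`, the placeholder `NanoCube` and `Slice` classes, and the `nano_cube` / `nano_slice` IDs are all hypothetical stand-ins.

```python
from abc import ABC, abstractmethod


class VocabularyProviderPlugin(ABC):
    """Abbreviated copy of the plugin contract."""

    @abstractmethod
    def get_class_map(self) -> dict[str, type]:
        """Return {local_id: class} for every entity this project owns."""


# Hypothetical stand-ins for sister-project model classes.
class NanoCube:
    pass


class Slice:
    pass


class ExampleVocabProvider(VocabularyProviderPlugin):
    """Hypothetical provider; not the real DistillerVocabProvider."""

    def get_class_map(self) -> dict[str, type]:
        # Each key is the string ID FCC maps to the class on the right.
        return {"nano_cube": NanoCube, "nano_slice": Slice}


class_map = ExampleVocabProvider().get_class_map()
print(sorted(class_map))
```

Because the base class is abstract, a provider that forgets to implement `get_class_map` fails at instantiation rather than at verification time.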
Phase 1 -- Detect¶
Loading the Current Mapping¶
The VocabularyMappingLoader reads the packaged YAML and then diffs
the live plugin class map against it. The diff is a simple frozen
dataclass that lists additions (IDs in the plugin but not in the YAML)
and removals (IDs in the YAML but not in the plugin).
```python
from fcc.objectmodel.vocabulary_loader import VocabularyMappingLoader
from fcc.plugins import load_plugin

loader = VocabularyMappingLoader()
store = loader.load_project("distiller")  # reads distiller_vocabulary_mappings.yaml
plugin = load_plugin("distiller_ex.DistillerVocabProvider")
result = loader.verify_against_plugin(store, plugin)

print("Missing from YAML:", result.additions)  # Plugin has it, YAML doesn't
print("Stale in YAML:", result.removals)       # YAML has it, plugin doesn't
print("Coverage:", result.coverage_ratio)      # 0.0 -- 1.0
```
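The additions/removals diff described above comes down to two set differences. The sketch below is a minimal stand-alone version; `VocabularyDiff` and `diff_ids` are hypothetical names, and the loader's real result type may carry more fields.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class VocabularyDiff:  # hypothetical name for the frozen diff record
    additions: frozenset[str]  # IDs in the plugin but not in the YAML
    removals: frozenset[str]   # IDs in the YAML but not in the plugin


def diff_ids(plugin_ids: set[str], yaml_ids: set[str]) -> VocabularyDiff:
    """Compare the live plugin IDs against the IDs recorded in the YAML."""
    return VocabularyDiff(
        additions=frozenset(plugin_ids - yaml_ids),
        removals=frozenset(yaml_ids - plugin_ids),
    )


diff = diff_ids(
    plugin_ids={"nano_cube", "nano_slice_sparse"},
    yaml_ids={"nano_cube", "nano_legacy_cube"},
)
print(sorted(diff.additions), sorted(diff.removals))
```

Freezing the dataclass means a diff cannot be mutated after it is computed, which matters once it is handed to an audit trail.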
A Typical Fornax Update¶
A realistic NanoCube schema update might touch 3-12 IDs per release. The v1.3.5.2 reference run uses a synthetic update that adds two new classes and removes one deprecated class:
| Change | Local ID | Python Class | Reason |
|---|---|---|---|
| Addition | nano_slice_sparse | distiller_ex.models.SparseSlice | New sparse representation |
| Addition | nano_measure_percentile | distiller_ex.models.PercentileMeasure | New aggregation type |
| Removal | nano_legacy_cube | (removed from plugin) | Deprecated since Fornax v0.9 |
Detect-Time Coverage¶
Before reconciliation, the loader reports:
| Metric | Value |
|---|---|
| Plugin class map size | 176 |
| YAML mapping entries | 175 |
| Additions detected | 2 |
| Removals detected | 1 |
| Coverage ratio | 0.983 |
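This addendum does not define how the coverage ratio is computed. One definition that reproduces the reported 0.983 is the number of IDs present on both sides divided by the number of distinct IDs overall. The arithmetic below works through the table under that assumption, which is not confirmed loader behaviour.

```python
plugin_size = 176  # entries in the plugin class map
yaml_size = 175    # entries in the packaged YAML
additions = 2      # IDs only in the plugin
removals = 1       # IDs only in the YAML

shared = plugin_size - additions       # 174 IDs present on both sides
assert shared == yaml_size - removals  # both counts agree on 174

union = shared + additions + removals  # 177 distinct IDs overall
coverage_ratio = shared / union        # assumed definition: shared / union
print(round(coverage_ratio, 3))        # 0.983
```

Dividing the shared count by the plugin size alone gives 0.989, which does not match the table, so the union-based reading looks more likely.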
Phase 2 -- Reconcile¶
Editing the YAML¶
Reconciliation is a direct edit of the packaged YAML. Each entry in the file is a small block with five fields; additions are appended to the relevant section and removals are deleted.
```yaml
# src/fcc/data/objectmodel/distiller_vocabulary_mappings.yaml
mappings:
  - local_id: nano_slice_sparse
    class_path: distiller_ex.models.SparseSlice
    category: slice
    confidence: 1.00
    source: distiller_ex@1.4.0
  - local_id: nano_measure_percentile
    class_path: distiller_ex.models.PercentileMeasure
    category: measure
    confidence: 1.00
    source: distiller_ex@1.4.0
  # (the nano_legacy_cube entry is deleted)
```
Re-verifying¶
Running verify_against_plugin after the edit should produce an empty
diff and a coverage ratio of 1.0. If it does not, the most common
cause is a typo in class_path -- the loader imports the class lazily
so a typo is only caught on verification, not on YAML parse.
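To catch class_path typos before a full verification run, a small pre-check can try to resolve each dotted path. The sketch below uses only the standard library; `resolve_class_path` and `find_bad_paths` are hypothetical helpers, not part of the FCC loader, and standard-library classes stand in for the distiller_ex models.

```python
import importlib


def resolve_class_path(class_path: str) -> type:
    """Import 'package.module.ClassName' and return the named attribute."""
    module_name, _, class_name = class_path.rpartition(".")
    if not module_name:
        raise ValueError(f"not a dotted path: {class_path!r}")
    try:
        module = importlib.import_module(module_name)
        return getattr(module, class_name)
    except (ImportError, AttributeError) as exc:
        raise LookupError(f"cannot resolve {class_path!r}") from exc


def find_bad_paths(class_paths: list[str]) -> list[str]:
    """Return every path that fails to resolve, in input order."""
    bad = []
    for path in class_paths:
        try:
            resolve_class_path(path)
        except (LookupError, ValueError):
            bad.append(path)
    return bad


# The second path contains a deliberate typo.
bad = find_bad_paths(["collections.OrderedDict", "collections.OrderedDcit"])
print(bad)
```

A missing module and a misspelled class name both end up in the same list, so the operator sees every broken path in one pass.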
Reconciliation Checklist¶
| Step | Check |
|---|---|
| 1 | New IDs appended with confidence: 1.00 |
| 2 | source tag updated to reflect new sister-project version |
| 3 | Removed IDs fully deleted (no commented-out entries) |
| 4 | Categories match the existing taxonomy |
| 5 | make test green, make lint green |
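Checklist steps 1, 2, and 4 can be checked mechanically once the YAML has been parsed into a list of dictionaries. The sketch below assumes that parsed shape; `check_new_entries` is a hypothetical helper, and the category set passed in is an assumed subset of the real taxonomy. Steps 3 and 5 are not covered.

```python
REQUIRED_FIELDS = {"local_id", "class_path", "category", "confidence", "source"}


def check_new_entries(entries, new_ids, known_categories, expected_source):
    """Return a list of problems; an empty list means the checks pass."""
    problems = []
    by_id = {entry.get("local_id"): entry for entry in entries}
    for local_id in sorted(new_ids):
        entry = by_id.get(local_id)
        if entry is None:
            problems.append(f"{local_id}: not found in mappings")
            continue
        missing = REQUIRED_FIELDS - entry.keys()
        if missing:
            problems.append(f"{local_id}: missing fields {sorted(missing)}")
        if entry.get("confidence") != 1.0:       # step 1
            problems.append(f"{local_id}: confidence is not 1.00")
        if entry.get("source") != expected_source:  # step 2
            problems.append(f"{local_id}: source is not {expected_source}")
        if entry.get("category") not in known_categories:  # step 4
            problems.append(f"{local_id}: unknown category")
    return problems


entries = [
    {"local_id": "nano_slice_sparse",
     "class_path": "distiller_ex.models.SparseSlice",
     "category": "slice", "confidence": 1.0, "source": "distiller_ex@1.4.0"},
    {"local_id": "nano_measure_percentile",
     "class_path": "distiller_ex.models.PercentileMeasure",
     "category": "measure", "confidence": 0.8,  # deliberately wrong
     "source": "distiller_ex@1.4.0"},
]
problems = check_new_entries(
    entries,
    new_ids={"nano_slice_sparse", "nano_measure_percentile"},
    known_categories={"slice", "measure"},  # assumed subset of the taxonomy
    expected_source="distiller_ex@1.4.0",
)
print(problems)
```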
Phase 3 -- Emit¶
Mismatch Events¶
Every diff surfaced in Phase 1 produces a vocabulary.mismatch event
on the bus. The event carries enough context for a downstream auditor
to reconstruct what changed without re-reading the YAML.
| Event Field | Example |
|---|---|
| event_type | vocabulary.mismatch |
| source | vocabulary_loader |
| payload.project | distiller |
| payload.additions | ["nano_slice_sparse", "nano_measure_percentile"] |
| payload.removals | ["nano_legacy_cube"] |
| payload.coverage_before | 0.983 |
| payload.plugin_source | distiller_ex@1.4.0 |
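The fields in the table map naturally onto a nested record. The sketch below models the event with plain dataclasses and round-trips it through JSON; the `Event` and `MismatchPayload` classes are hypothetical and say nothing about how FCC's own event types are defined.

```python
import json
from dataclasses import asdict, dataclass


@dataclass(frozen=True)
class MismatchPayload:  # hypothetical shape taken from the table above
    project: str
    additions: list[str]
    removals: list[str]
    coverage_before: float
    plugin_source: str


@dataclass(frozen=True)
class Event:  # hypothetical envelope
    event_type: str
    source: str
    payload: MismatchPayload


event = Event(
    event_type="vocabulary.mismatch",
    source="vocabulary_loader",
    payload=MismatchPayload(
        project="distiller",
        additions=["nano_slice_sparse", "nano_measure_percentile"],
        removals=["nano_legacy_cube"],
        coverage_before=0.983,
        plugin_source="distiller_ex@1.4.0",
    ),
)

# asdict() recurses into the nested payload, so the result is JSON-ready.
encoded = json.dumps(asdict(event), sort_keys=True)
decoded = json.loads(encoded)
print(decoded["payload"]["removals"])
```

Since the payload lists the affected IDs itself, an auditor reading the JSON needs no access to the YAML file.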
ComplianceSubscriber Handling¶
The compliance subscriber (src/fcc/compliance/subscriber.py) listens
for vocabulary.mismatch and performs two actions:
- Record an audit-log entry with the full payload and a UTC timestamp.
- Schedule a re-audit of any compliance requirement that references the affected entity classes.
The audit log is appended to the evidence graph so that a subsequent compliance report can reference the vocabulary change as provenance for why a re-audit was triggered.
```python
from fcc.compliance.subscriber import ComplianceSubscriber
from fcc.messaging.bus import EventBus

bus = EventBus.default()
subscriber = ComplianceSubscriber()
bus.subscribe("vocabulary.mismatch", subscriber.handle)
```
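To make the two subscriber actions concrete, the sketch below reimplements them against a toy in-memory bus. `ToyBus`, `ToyAuditSubscriber`, the `requirement_refs` mapping, and the `REQ-*` identifiers are all hypothetical; the real ComplianceSubscriber and EventBus may behave differently.

```python
from collections import defaultdict
from datetime import datetime, timezone


class ToyBus:
    """Minimal publish/subscribe bus keyed by event type."""

    def __init__(self):
        self._handlers = defaultdict(list)

    def subscribe(self, event_type, handler):
        self._handlers[event_type].append(handler)

    def publish(self, event):
        for handler in self._handlers[event["event_type"]]:
            handler(event)


class ToyAuditSubscriber:
    def __init__(self, requirement_refs):
        # requirement_refs: {requirement_id: set of entity IDs it references}
        self.requirement_refs = requirement_refs
        self.audit_log = []
        self.reaudit_queue = []

    def handle(self, event):
        payload = event["payload"]
        # Action 1: record the full payload with a UTC timestamp.
        self.audit_log.append({
            "recorded_at": datetime.now(timezone.utc).isoformat(),
            "payload": payload,
        })
        # Action 2: queue every requirement that references an affected ID.
        affected = set(payload["additions"]) | set(payload["removals"])
        for requirement, refs in sorted(self.requirement_refs.items()):
            if refs & affected:
                self.reaudit_queue.append(requirement)


toy_bus = ToyBus()
toy_subscriber = ToyAuditSubscriber(
    {"REQ-1": {"nano_legacy_cube"}, "REQ-2": {"nano_cube"}}
)
toy_bus.subscribe("vocabulary.mismatch", toy_subscriber.handle)
toy_bus.publish({
    "event_type": "vocabulary.mismatch",
    "payload": {"additions": ["nano_slice_sparse"], "removals": ["nano_legacy_cube"]},
})
print(toy_subscriber.reaudit_queue)
```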
Replay and Traceability¶
Because events are captured by the standard EventSerializer, the
entire reconciliation session can be replayed from a JSON log. This
is the recommended way to test the subscriber in isolation:
```python
from fcc.messaging.serialization import EventReplay

replay = EventReplay.from_json_file("./tests/fixtures/vocab_mismatch_run.json")
replay.into_bus(bus)  # subscribers receive the same sequence
```
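The idea behind replay can be shown without the FCC classes: read a recorded list of events and hand each one to the registered handlers in the original order. The log format here (a JSON array of objects with an `event_type` key) and the `replay_log` helper are assumptions for illustration, not the EventSerializer format.

```python
import json

# Assumed log format: a JSON array of events in the order they were published.
log_text = json.dumps([
    {"event_type": "vocabulary.mismatch", "payload": {"additions": ["nano_slice_sparse"]}},
    {"event_type": "other.event", "payload": {}},
    {"event_type": "vocabulary.mismatch", "payload": {"removals": ["nano_legacy_cube"]}},
])


def replay_log(text, handlers):
    """Dispatch each logged event to the handlers registered for its type."""
    delivered = 0
    for event in json.loads(text):
        for handler in handlers.get(event["event_type"], []):
            handler(event)
            delivered += 1
    return delivered


received = []
count = replay_log(log_text, {"vocabulary.mismatch": [received.append]})
print(count, [list(e["payload"]) for e in received])
```

Because the handler sees the same events in the same order, a subscriber test can assert on its audit log without running the loader at all.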
Phase 4 -- Verify¶
Before / After Coverage¶
After applying the YAML edits from Phase 2 and re-running the loader, coverage returns to 100%.
| Metric | Before | After |
|---|---|---|
| Plugin class map size | 176 | 176 |
| YAML mapping entries | 175 | 176 |
| Additions detected | 2 | 0 |
| Removals detected | 1 | 0 |
| Coverage ratio | 0.983 | 1.000 |
| vocabulary.mismatch events emitted | 3 | 0 |
Audit Evidence¶
The compliance evidence graph now contains a node for each recorded mismatch event with edges back to:
- The persona categories that reference the affected entity classes.
- The compliance requirements scheduled for re-audit.
- The YAML commit SHA (if FCC_EVIDENCE_GIT=1 is set) that closed the mismatch.
Operational Guidance¶
When to Run the Loader¶
The recommended cadence is:
| Trigger | Action |
|---|---|
| Sister-project release tag | Full verify + reconcile cycle |
| FCC pre-merge CI | make verify-vocabularies target (fail on diff) |
| Pre-release gate | Regenerate evidence graph |
| Manual audit | fcc vocabulary verify --project distiller |
CI Integration¶
A dedicated target runs the verification headlessly in CI. It fails the build with a non-zero exit code whenever any project reports a diff, so a mismatch cannot be merged unnoticed.
```shell
make verify-vocabularies
# Runs loader.verify_against_plugin across all 12 sister projects.
# Exits non-zero if any project reports a non-empty diff.
```
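The fail-on-diff behaviour can be sketched as a function that turns per-project diffs into a process exit code. `exit_code_for` and the input shape (project name mapped to an additions/removals pair) are hypothetical; the actual make target's implementation is not shown here.

```python
import sys


def exit_code_for(diffs: dict[str, tuple[list[str], list[str]]]) -> int:
    """Return 0 when every project's diff is empty, otherwise 1."""
    failed = False
    for project, (additions, removals) in sorted(diffs.items()):
        if additions or removals:
            failed = True
            # Report each failing project so the CI log shows what drifted.
            print(
                f"{project}: {len(additions)} addition(s), {len(removals)} removal(s)",
                file=sys.stderr,
            )
    return 1 if failed else 0


clean = exit_code_for({"distiller": ([], [])})
dirty = exit_code_for({
    "distiller": (["nano_slice_sparse", "nano_measure_percentile"], ["nano_legacy_cube"]),
})
print(clean, dirty)
```

A real entry point would pass the returned value to `sys.exit` so that the CI runner marks the job as failed.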
Troubleshooting¶
| Symptom | Probable Cause | Next Step |
|---|---|---|
| Loader import error on class_path | Typo or sister project not installed | pip install -e path/to/distiller_ex and re-run |
| Coverage ratio below 1.0 after edit | YAML still missing a new ID | Compare result.additions against the YAML edit and append the missing block |
| Compliance subscriber not triggered | Subscriber not registered on the bus | Ensure ComplianceSubscriber is subscribed to vocabulary.mismatch |
| Evidence graph stale | Re-audit not scheduled | Explicitly call CompliancePipeline.run_affected() |
| Recurring removals across releases | Sister project deprecated without a migration note | Coordinate with sister-project owner before editing YAML |
Tips¶
- Treat the YAML as a reviewable contract. A pull request that edits a vocabulary mapping should link to the upstream sister-project commit that drove the change.
- Prefer atomic commits: one PR per sister-project version bump. Mixed reconciliations are difficult to audit later.
- Use confidence: 1.00 only for direct class-name matches. Any inferred or fuzzy mapping should start below 0.90 and be manually reviewed.
See also¶
- Distiller Bridge Demo -- Base walkthrough
- Distiller Bridge Phase 15 Addendum -- Unified vocabulary distillation
- Federated vs Individual Architecture -- Federation topology
- Vocabulary Provider Load Sequence Diagram -- Load-time sequence
- Full-Stack Ecosystem v1.3.5.2 Addendum -- End-to-end wiring