Distiller Bridge -- Phase 15 Addendum¶
This addendum extends the Distiller Bridge Demo and the Phase 14 Addendum with Phase 15 unified vocabulary distillation, which builds on the object model abstraction layer and federated knowledge graphs.
Unified Vocabulary Distillation¶
Overview¶
Phase 15 introduces unified vocabulary distillation -- a process that
takes raw terminology from all ecosystem projects and produces a single,
normalized vocabulary using the VocabularyMapping infrastructure. The
Distiller bridge now supports bi-directional mapping between project-local
terms and the unified FCC vocabulary.
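To make the bi-directional mapping concrete, the following is a minimal sketch of a two-way index between project-local terms and unified terms. The class `BidirectionalVocabulary` and its methods are hypothetical names introduced for illustration; they are not the VocabularyMapping API, whose interface is not shown in this addendum.

```python
from typing import Dict, List, Optional, Set, Tuple


class BidirectionalVocabulary:
    """Toy two-way index (hypothetical; not the VocabularyMapping API)."""

    def __init__(self) -> None:
        # (project, local_term) -> unified term
        self._to_unified: Dict[Tuple[str, str], str] = {}
        # unified term -> set of (project, local_term) pairs mapped to it
        self._to_local: Dict[str, Set[Tuple[str, str]]] = {}

    def add(self, project: str, local_term: str, unified_term: str) -> None:
        key = (project, local_term)
        previous = self._to_unified.get(key)
        if previous is not None:
            # Remapping a term: drop the stale reverse entry first.
            self._to_local[previous].discard(key)
        self._to_unified[key] = unified_term
        self._to_local.setdefault(unified_term, set()).add(key)

    def to_unified(self, project: str, local_term: str) -> Optional[str]:
        """Forward lookup: project-local term -> unified term."""
        return self._to_unified.get((project, local_term))

    def to_local(self, unified_term: str, project: str) -> List[str]:
        """Reverse lookup: unified term -> local terms used by one project."""
        pairs = self._to_local.get(unified_term, set())
        return sorted(term for proj, term in pairs if proj == project)


vocab = BidirectionalVocabulary()
vocab.add("distiller", "ML", "fcc:machine_learning")
vocab.add("constel", "machine-learning", "fcc:machine_learning")
print(vocab.to_unified("distiller", "ML"))
print(vocab.to_local("fcc:machine_learning", "constel"))
```

Keeping both directions in sync inside `add` means a reverse lookup never returns a pair whose forward mapping has since changed.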
Architecture¶
Project A Terms ──┐
Project B Terms ──┼── VocabularyMapping ── Unified FCC Vocabulary
Project C Terms ──┘           │
                              ├── ModelFacade cross-model search
                              ├── FederatedKG edge normalization
                              └── Compliance report term alignment
Distiller Integration Points¶
Vocabulary Extraction¶
The Distiller bridge extracts raw vocabulary from each project's
domain model using the ModelFacade.stats() API:
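from fcc.objectmodel.facade import ModelFacade
# get_project_facade is not defined in this snippet; it is assumed to be a
# helper from the surrounding demo that returns the project's ModelFacade.
facade: ModelFacade = get_project_facade("distiller")
# stats() is read here as a dict of term counts.
stats = facade.stats()
print(f"Terms: {stats['total_terms']}")
print(f"Mapped: {stats['mapped_terms']}")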
Term Normalization Pipeline¶
The normalization pipeline applies three stages:
| Stage | Description | Example |
|---|---|---|
| Tokenization | Split compound terms | data_engineer -> data, engineer |
| Synonym Resolution | Map synonyms to canonical form | ML -> machine_learning |
| Namespace Qualification | Add project prefix | architect -> fcc:architect |
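
The three stages above can be sketched as small composable functions. This is an illustration under stated assumptions, not the Distiller bridge's implementation: the function names, the lower-casing step, the separator set, and the `SYNONYMS` table are hypothetical (only the `ML` entry and the `fcc` prefix come from the table).

```python
import re
from typing import Dict, List

# Hypothetical synonym table; only the ML entry is taken from the table above.
SYNONYMS: Dict[str, str] = {"ml": "machine_learning"}


def tokenize(term: str) -> List[str]:
    """Stage 1: split a compound term on underscores, hyphens, or whitespace.

    Lower-casing is an assumption made so synonym lookup is case-insensitive.
    """
    return [part.lower() for part in re.split(r"[_\-\s]+", term) if part]


def resolve_synonym(token: str, synonyms: Dict[str, str] = SYNONYMS) -> str:
    """Stage 2: map a token to its canonical form, or return it unchanged."""
    return synonyms.get(token, token)


def qualify(token: str, prefix: str = "fcc") -> str:
    """Stage 3: add the namespace prefix."""
    return f"{prefix}:{token}"


def normalize(term: str, prefix: str = "fcc") -> List[str]:
    """Run the three stages in order and return the qualified tokens."""
    return [qualify(resolve_synonym(tok), prefix) for tok in tokenize(term)]


print(normalize("data_engineer"))  # ['fcc:data', 'fcc:engineer']
print(normalize("ML"))             # ['fcc:machine_learning']
print(normalize("architect"))      # ['fcc:architect']
```

Tokenizing before synonym resolution keeps a canonical form such as `machine_learning` intact, since it is never split again after being substituted.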
Mapping Confidence¶
Each vocabulary mapping carries a confidence score:
| Range | Meaning | Action |
|---|---|---|
| 0.9 -- 1.0 | Exact match | Auto-apply |
| 0.7 -- 0.9 | High confidence | Review recommended |
| 0.5 -- 0.7 | Partial match | Manual review required |
| < 0.5 | Low confidence | Flagged for human review |
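
A minimal sketch of how a score could be routed to the actions in the table. The function name `action_for` is hypothetical, and because the table's ranges share their boundary values (0.9 and 0.7 each appear in two rows), the sketch assumes each lower bound is inclusive; the actual bridge may draw the boundaries differently.

```python
def action_for(confidence: float) -> str:
    """Return the table's action for a mapping confidence score.

    Assumption: lower bounds are inclusive, so 0.9 counts as an exact
    match and 0.7 as high confidence.
    """
    if not 0.0 <= confidence <= 1.0:
        raise ValueError(f"confidence must be in [0, 1], got {confidence}")
    if confidence >= 0.9:
        return "auto-apply"
    if confidence >= 0.7:
        return "review recommended"
    if confidence >= 0.5:
        return "manual review required"
    return "flagged for human review"


for score in (0.95, 0.9, 0.75, 0.5, 0.2):
    print(score, "->", action_for(score))
```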
Cross-Project Distillation¶
Federated Term Graph¶
The Distiller bridge now builds a federated term graph that connects vocabulary terms across project namespaces:
from fcc.knowledge.federation import FederatedKnowledgeGraph
fkg = FederatedKnowledgeGraph()
# distiller_graph and constel_graph are not defined in this snippet; they are
# assumed to be per-project knowledge graphs built earlier in the demo.
fkg.add_namespace("distiller", distiller_graph)
fkg.add_namespace("constel", constel_graph)
# Cross-namespace term edges are automatically resolved
cross_edges = fkg.cross_namespace_edges()
print(f"Cross-namespace term links: {len(cross_edges)}")
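How FederatedKnowledgeGraph resolves cross-namespace edges is not described here. As a rough mental model only, the sketch below links terms from different namespaces when they share a canonical form. The function `cross_namespace_links` and the input layout are hypothetical and use plain dictionaries rather than the `fcc` API.

```python
from itertools import combinations
from typing import Dict, List, Tuple


def cross_namespace_links(
    namespaces: Dict[str, Dict[str, str]]
) -> List[Tuple[str, str]]:
    """Link terms across namespaces that share a canonical form.

    `namespaces` maps namespace -> {local_term: canonical_term}.
    """
    # Group every (namespace, local_term) pair under its canonical term.
    by_canonical: Dict[str, List[Tuple[str, str]]] = {}
    for namespace, terms in namespaces.items():
        for local_term, canonical in terms.items():
            by_canonical.setdefault(canonical, []).append((namespace, local_term))

    links: List[Tuple[str, str]] = []
    for members in by_canonical.values():
        for (ns_a, term_a), (ns_b, term_b) in combinations(sorted(members), 2):
            if ns_a != ns_b:  # only edges that cross a namespace boundary
                links.append((f"{ns_a}:{term_a}", f"{ns_b}:{term_b}"))
    return sorted(links)


links = cross_namespace_links({
    "distiller": {"ML": "machine_learning", "architect": "architect"},
    "constel": {"machine-learning": "machine_learning"},
})
print(links)  # [('constel:machine-learning', 'distiller:ML')]
```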
Vocabulary Coverage Report¶
A summary report shows distillation coverage:
| Metric | Value |
|---|---|
| Total unique terms | 2,400+ |
| Mapped to unified vocabulary | 2,100+ |
| Unmapped (pending review) | 300 |
| Cross-project synonyms resolved | 180 |
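
The metrics in such a report can be derived from a term list and a mapping table. The sketch below is illustrative only: `coverage_report` and its input shapes are hypothetical, and the sample data is made up rather than taken from the ecosystem.

```python
from typing import Dict, Iterable


def coverage_report(terms: Iterable[str], mappings: Dict[str, str]) -> Dict[str, float]:
    """Count unique, mapped, and unmapped terms and the mapped fraction."""
    unique = set(terms)
    mapped = {term for term in unique if term in mappings}
    return {
        "total_unique_terms": len(unique),
        "mapped": len(mapped),
        "unmapped": len(unique - mapped),
        # Guard against division by zero for an empty vocabulary.
        "coverage": len(mapped) / len(unique) if unique else 0.0,
    }


report = coverage_report(
    ["a", "b", "b", "c", "d"],                 # raw terms, with one duplicate
    {"a": "fcc:a", "b": "fcc:b", "c": "fcc:c"},  # term -> unified term
)
print(report)
```

Taking the table's rounded figures at face value, 2,100 mapped out of 2,400 unique terms corresponds to a coverage of roughly 87.5%.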
Event Integration¶
Vocabulary distillation emits events through the EventBus:
| Event Type | Payload |
|---|---|
| vocabulary.extraction.started | project, term_count |
| vocabulary.mapping.created | source_term, target_term, confidence |
| vocabulary.distillation.completed | total_mapped, total_unmapped |
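
To show how a consumer might react to these events, here is a minimal sketch using a stand-in publish/subscribe class. `ToyEventBus`, its `subscribe` and `emit` methods, and the second sample payload are hypothetical; the real EventBus interface is not shown in this addendum and may differ. Only the event type names and payload fields come from the table.

```python
from typing import Any, Callable, Dict, List

Handler = Callable[[Dict[str, Any]], None]


class ToyEventBus:
    """Hypothetical stand-in for the project's EventBus."""

    def __init__(self) -> None:
        self._handlers: Dict[str, List[Handler]] = {}

    def subscribe(self, event_type: str, handler: Handler) -> None:
        self._handlers.setdefault(event_type, []).append(handler)

    def emit(self, event_type: str, payload: Dict[str, Any]) -> None:
        # Copy the list so handlers may subscribe during dispatch.
        for handler in list(self._handlers.get(event_type, [])):
            handler(payload)


bus = ToyEventBus()
low_confidence: List[Dict[str, Any]] = []


def on_mapping_created(payload: Dict[str, Any]) -> None:
    """Collect mappings that fall in the lowest confidence band (< 0.5)."""
    if payload["confidence"] < 0.5:
        low_confidence.append(payload)


bus.subscribe("vocabulary.mapping.created", on_mapping_created)
bus.emit("vocabulary.mapping.created",
         {"source_term": "ML", "target_term": "fcc:machine_learning", "confidence": 0.95})
bus.emit("vocabulary.mapping.created",  # made-up low-confidence example
         {"source_term": "architect", "target_term": "fcc:designer", "confidence": 0.4})
print(len(low_confidence))  # 1
```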
Tips¶
- Run vocabulary distillation after any project model update
- Use the confidence threshold to control auto-apply behaviour
- Review the federated term graph for synonym clusters that may indicate vocabulary drift across projects
Related¶
- Distiller Bridge Demo -- Base demo
- Distiller Bridge Phase 14 Addendum -- Evaluation
- Unified Object Model Demo -- Facade operations
- Ecosystem Co-Evolution Demo -- Federation
- Federation Demo -- Cross-project resolution