Vocabulary Provider Load
This diagram traces how a sister project's vocabulary (for example athenium or mnemosyne) is discovered, validated, and registered against FCC's canonical object model. The entry point is VocabularyMappingLoader.load(namespace) in src/fcc/objectmodel/vocabulary_loader.py, which is invoked at startup or on demand when a federated query needs a namespace that has not yet been indexed. Developers read this trace to understand the VocabularyProviderPlugin contract introduced in v1.2.1, the pattern by which sister repositories contribute their class mappings without creating runtime import cycles. The 175 packaged YAML mappings under src/fcc/data/objectmodel/ are the canonical source; plugins either wrap those or contribute their own.
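The provider contract can be pictured with a short sketch. It assumes only the two methods that appear in the diagram, get_namespace() and get_class_map(); the real base class lives in src/fcc/plugins/base.py and may differ. The Protocol form, AtheniumProvider, and Concept are hypothetical, chosen to show one way a sister repository could satisfy the contract without importing FCC at all.

```python
from typing import Protocol, runtime_checkable


@runtime_checkable
class VocabularyProviderPlugin(Protocol):
    """Assumed shape of the contract: only the two calls shown in the diagram."""

    def get_namespace(self) -> str: ...

    def get_class_map(self) -> dict[str, type]: ...


class Concept:
    """Hypothetical class contributed by a sister project."""


class AtheniumProvider:
    """Hypothetical provider; it satisfies the protocol structurally, with no FCC import."""

    def get_namespace(self) -> str:
        return "athenium"

    def get_class_map(self) -> dict[str, type]:
        # Keys are canonical FCC class names, values are the provider's classes.
        return {"Concept": Concept}


provider = AtheniumProvider()
# runtime_checkable protocols only verify that the methods exist.
assert isinstance(provider, VocabularyProviderPlugin)
```

Structural typing is only one way to avoid an import cycle; an abstract base class registered through entry points would serve the same purpose.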
The sequence below shows discovery of provider plugins, validation of each mapping, and the success and mismatch branches.
sequenceDiagram
    participant Caller
    participant VocabularyMappingLoader
    participant PluginRegistry
    participant Plugin as VocabularyProviderPlugin
    participant VocabularyMapping
    participant MappingStore
    participant Logger
    participant EventBus
    Caller->>VocabularyMappingLoader: load(namespace)
    VocabularyMappingLoader->>PluginRegistry: discover(VOCABULARY_PROVIDERS)
    PluginRegistry-->>VocabularyMappingLoader: list[VocabularyProviderPlugin]
    loop for each plugin
        VocabularyMappingLoader->>Plugin: get_namespace()
        Plugin-->>VocabularyMappingLoader: namespace_id
        alt namespace matches
            VocabularyMappingLoader->>Plugin: get_class_map()
            Plugin-->>VocabularyMappingLoader: dict[str, type]
            loop for each entry
                VocabularyMappingLoader->>VocabularyMapping: validate(entry)
                alt valid
                    VocabularyMapping-->>VocabularyMappingLoader: ok
                    VocabularyMappingLoader->>MappingStore: add(mapping)
                else missing source_id
                    VocabularyMappingLoader->>Logger: info(missing_source_ids)
                    Note over EventBus: emits vocabulary.mismatch
                    VocabularyMappingLoader->>EventBus: publish(vocabulary.mismatch)
                end
            end
            Note over EventBus: emits vocabulary.loaded
            VocabularyMappingLoader->>EventBus: publish(vocabulary.loaded)
        end
    end
    VocabularyMappingLoader-->>Caller: dict[str, type]
Failure modes are designed to be soft. An exception raised by a plugin's get_class_map() is caught by the loader and logged, and that plugin is skipped; other plugins still load. Individual rows that are missing a source_id or target_id, or whose similarity_score falls below the configured floor, are collected into a missing_source_ids report written at INFO level and emitted as vocabulary.mismatch events; callers typically aggregate these for a weekly data-quality digest. The loader never raises on partial failure, which matters because federation queries depend on best-effort availability of cross-project vocabularies. Instrumentation usually subscribes to both events: vocabulary.loaded for successful namespace registration metrics and vocabulary.mismatch for provenance alerts.
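A minimal sketch of the soft-failure behaviour described above. The function load_soft, the is_valid and publish callables, and the event payload keys are all assumptions made for illustration; the actual loader in src/fcc/objectmodel/vocabulary_loader.py may be structured differently.

```python
import logging

logger = logging.getLogger("fcc.vocabulary")  # logger name is an assumption


def load_soft(namespace, plugins, is_valid, publish):
    """Return the merged class map for `namespace`; never raise on partial failure."""
    class_map = {}
    for plugin in plugins:
        if plugin.get_namespace() != namespace:
            continue
        try:
            entries = plugin.get_class_map()
        except Exception:
            # A failing plugin is logged and skipped; the remaining plugins still load.
            logger.exception("vocabulary provider %r failed", plugin)
            continue

        rejected = []
        for name, cls in entries.items():
            if is_valid(name, cls):
                class_map[name] = cls
            else:
                rejected.append(name)
                # One mismatch event per failed row.
                publish("vocabulary.mismatch", {"namespace": namespace, "entry": name})
        if rejected:
            logger.info("missing_source_ids: %s", rejected)
        # One summary event per fully processed plugin.
        publish("vocabulary.loaded",
                {"namespace": namespace, "count": len(entries) - len(rejected)})
    return class_map


class BrokenProvider:
    def get_namespace(self):
        return "athenium"

    def get_class_map(self):
        raise RuntimeError("boom")


class GoodProvider:
    def get_namespace(self):
        return "athenium"

    def get_class_map(self):
        return {"Concept": dict, "": list}  # the empty key stands in for an invalid row


events = []
result = load_soft(
    "athenium",
    [BrokenProvider(), GoodProvider()],
    is_valid=lambda name, cls: bool(name),  # stand-in for the real validation
    publish=lambda event_type, payload: events.append((event_type, payload)),
)
```

In this run the broken provider is skipped without interrupting the load, and the good provider contributes one valid entry plus one mismatch event followed by a single loaded event.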
The returned dict[str, type] is the live class map used by ModelFacade implementations to construct DomainEntity instances for the namespace; it is keyed by the canonical FCC class name, not the source namespace's name, so that federated queries resolve uniformly across ecosystems. In practice the map is small (tens of entries per provider) and eagerly constructed.
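As an illustration of how a caller might consume the returned map, the sketch below looks up a class by its canonical FCC name and instantiates it. AtheniumConcept, build, and the map contents are hypothetical; how ModelFacade implementations actually construct DomainEntity instances is not shown in this trace.

```python
from dataclasses import dataclass


@dataclass
class AtheniumConcept:
    """Hypothetical provider-side class registered under a canonical name."""
    label: str


# Assumed result of load("athenium"): canonical FCC class name -> provider class.
class_map: dict[str, type] = {"Concept": AtheniumConcept}


def build(class_map, canonical_name, **fields):
    """Instantiate the class registered under a canonical FCC class name."""
    try:
        cls = class_map[canonical_name]
    except KeyError:
        raise LookupError(f"no class registered for {canonical_name!r}") from None
    return cls(**fields)


entity = build(class_map, "Concept", label="memory")
```

Because the key is the canonical name, the same build call works regardless of which sister project supplied the class.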
Steps in detail
- Caller to VocabularyMappingLoader: load — The caller passes a target namespace identifier.
- Loader to PluginRegistry: discover — The registry returns every plugin registered under PluginType.VOCABULARY_PROVIDERS.
- Loader to Plugin: get_namespace (loop) — Each plugin is asked which namespace it contributes; only matches proceed.
- Loader to Plugin: get_class_map — Matching plugins return their full mapping as a dict of class-name to target type.
- Loader to VocabularyMapping: validate (loop) — Each entry is validated against the VocabularyMapping frozen dataclass contract.
- Loader to MappingStore: add — Valid entries are appended to the in-memory mapping store under the namespace key.
- Loader to Logger: info(missing_source_ids) — Entries that fail validation are logged at info level with a structured missing_source_ids payload.
- Loader to EventBus: publish(vocabulary.mismatch) — A mismatch event fires per failed row so subscribers can aggregate or alert.
- Loader to EventBus: publish(vocabulary.loaded) — Once a plugin's map has been fully processed, a single vocabulary.loaded event summarises the namespace.
- Loader to Caller: dict[str, type] — The merged class map for the requested namespace is returned.
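The validation step in the list above can be sketched as follows. The field names source_id, target_id, and similarity_score come from the failure-mode description; the dict-shaped input row, the validate function signature, and the 0.5 default floor are assumptions made for this example, not the actual VocabularyMapping API.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class VocabularyMapping:
    """Assumed minimal form of the frozen mapping record."""
    source_id: str
    target_id: str
    similarity_score: float


def validate(entry: dict, score_floor: float = 0.5) -> Optional[VocabularyMapping]:
    """Return a frozen mapping, or None when the row should be reported as a mismatch."""
    source_id = entry.get("source_id")
    target_id = entry.get("target_id")
    score = entry.get("similarity_score")
    if not source_id or not target_id:
        return None  # missing identifier
    if not isinstance(score, (int, float)) or score < score_floor:
        return None  # absent or below the configured floor
    return VocabularyMapping(source_id, target_id, float(score))


good = validate({"source_id": "ath:Concept", "target_id": "fcc:Concept", "similarity_score": 0.92})
missing = validate({"target_id": "fcc:Concept", "similarity_score": 0.92})
weak = validate({"source_id": "ath:Concept", "target_id": "fcc:Concept", "similarity_score": 0.1})
```

Returning None instead of raising keeps the per-row check compatible with the loader's best-effort policy: the caller decides whether to log, publish a mismatch event, or both.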
See also
- Entry point: src/fcc/objectmodel/vocabulary_loader.py
- Plugin contract: src/fcc/plugins/base.py (VocabularyProviderPlugin)
- Related class diagram: ../class-diagrams/object-model.md
- Related event types: src/fcc/messaging/events.py (EventType.VOCABULARY_LOADED, EventType.VOCABULARY_MISMATCH)