Chapter 4: Federated Knowledge¶
Learning Objectives¶
By the end of this chapter you will be able to:
- Explain why knowledge federation is necessary for multi-project ecosystems.
- Describe the namespace IRI scheme and how it prevents identifier collisions.
- Use the NamespaceRegistry to register and resolve project namespaces.
- Federate knowledge graphs across FCC, AOME, CONSTEL, and other projects.
- Query federated knowledge graphs using cross-project identifiers.
The figure below shows how per-project namespaces and per-project knowledge graphs are stitched together by an EntityResolver into a single FederatedKnowledgeGraph that supports cross-project queries.
flowchart TB
subgraph NS["Namespace Registry"]
FCC_NS["fcc: https://fcc.example.org/ontology/"]
AOME_NS["aome: https://aome.example.org/ontology/"]
CONSTEL_NS["constel: https://constel.example.org/ontology/"]
end
subgraph Graphs["Project Knowledge Graphs"]
FCC_KG[(FCC KG)]
AOME_KG[(AOME KG)]
CONSTEL_KG[(CONSTEL KG)]
end
ER[EntityResolver] --> FCC_KG
ER --> AOME_KG
ER --> CONSTEL_KG
FCC_KG --> FED[(FederatedKnowledgeGraph)]
AOME_KG --> FED
CONSTEL_KG --> FED
FED --> QUERY[Cross-Project Queries]
FED --> XEDGE[Cross-Namespace Edges]
style FED fill:#2196F3,color:#fff
style QUERY fill:#4CAF50,color:#fff
Because namespaces are resolved at query time rather than ingest time, new projects can join the federation without rematerialising existing knowledge graphs.
The Federation Problem¶
In a single-project FCC deployment, the knowledge graph (Chapter 2) and search index (Chapter 1) are self-contained. All entities use the same namespace, all identifiers are unique, and all queries are local.
In a multi-project ecosystem, this breaks down. AOME has its own knowledge graph with privacy-related entities. CONSTEL has its own metadata graph with cross-project relationships. CTO has its own object model with canonical entity definitions. When a query spans multiple projects ("find all artifacts from FCC sessions that reference AOME-classified personal data"), these knowledge graphs need to be queryable as a single federated graph.
Federation solves this by establishing shared conventions for naming, referencing, and querying entities across project boundaries, without requiring all projects to use the same storage or schema.
Namespace Design¶
The foundation of federation is the namespace IRI scheme (see ADR-005: Federated KG Namespace). Each project in the ecosystem owns a unique namespace IRI:
| Project | Namespace IRI |
|---|---|
| FCC | https://fcc.example.org/ontology/ |
| AOME | https://aome.example.org/ontology/ |
| CONSTEL | https://constel.example.org/ontology/ |
| CTO | https://cto.example.org/ontology/ |
| Sky-Parlour | https://sky-parlour.example.org/ontology/ |
Every entity in a project's knowledge graph is prefixed with that project's namespace:
fcc:research_analyst -- a persona in FCC
aome:privacy_classifier -- a classifier in AOME
constel:metadata_index -- a metadata index in CONSTEL
Why IRIs?¶
IRIs (Internationalized Resource Identifiers) are the standard naming convention for RDF and OWL. They provide:
- Global uniqueness. Because each project mints identifiers only under its own namespace, two projects cannot accidentally produce the same identifier.
- Dereferenceability. In a web-enabled deployment, the IRI can resolve to the entity's description.
- Standardization. Tools and libraries across the RDF ecosystem understand IRIs natively.
The NamespaceRegistry¶
The NamespaceRegistry manages namespace registration, prefix resolution, and cross-project identifier mapping:
from fcc.knowledge.federation import NamespaceRegistry
registry = NamespaceRegistry()
# Register project namespaces
registry.register("fcc", "https://fcc.example.org/ontology/")
registry.register("aome", "https://aome.example.org/ontology/")
registry.register("constel", "https://constel.example.org/ontology/")
# Resolve a prefixed identifier to a full IRI
full_iri = registry.resolve("fcc:research_analyst")
# "https://fcc.example.org/ontology/research_analyst"
# Compact a full IRI to a prefixed identifier
prefix = registry.compact("https://aome.example.org/ontology/privacy_classifier")
# "aome:privacy_classifier"
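The internals of NamespaceRegistry are not shown in this chapter. As a rough mental model only, the stand-in below shows one way register, resolve, and compact could behave. The class name SketchRegistry, the conflict check, and the longest-match rule are assumptions for illustration, not the FCC API.

```python
class SketchRegistry:
    """Hypothetical stand-in for NamespaceRegistry; not the FCC implementation."""

    def __init__(self) -> None:
        self._prefix_to_iri: dict[str, str] = {}

    def register(self, prefix: str, iri: str) -> None:
        # Assumption: rebinding a prefix to a different IRI is treated as an error.
        existing = self._prefix_to_iri.get(prefix)
        if existing is not None and existing != iri:
            raise ValueError(f"prefix {prefix!r} already bound to {existing!r}")
        self._prefix_to_iri[prefix] = iri

    def resolve(self, prefixed: str) -> str:
        # Split "fcc:research_analyst" into its prefix and local name.
        prefix, sep, local = prefixed.partition(":")
        if not sep or prefix not in self._prefix_to_iri:
            raise KeyError(f"unknown prefix in {prefixed!r}")
        return self._prefix_to_iri[prefix] + local

    def compact(self, iri: str) -> str:
        # Prefer the longest matching namespace so nested namespaces compact correctly.
        matches = [(ns, p) for p, ns in self._prefix_to_iri.items() if iri.startswith(ns)]
        if not matches:
            raise KeyError(f"no registered namespace for {iri!r}")
        ns, prefix = max(matches, key=lambda m: len(m[0]))
        return f"{prefix}:{iri[len(ns):]}"


sketch = SketchRegistry()
sketch.register("fcc", "https://fcc.example.org/ontology/")
sketch.register("aome", "https://aome.example.org/ontology/")
print(sketch.resolve("fcc:research_analyst"))
# https://fcc.example.org/ontology/research_analyst
print(sketch.compact("https://aome.example.org/ontology/privacy_classifier"))
# aome:privacy_classifier
```

Resolution is plain string concatenation and compaction is a prefix match, which is why a consistent trailing slash on every namespace IRI matters.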
The registry is shared across all federation operations. It is typically initialized at startup from a configuration file:
# config/namespaces.yaml
namespaces:
  fcc: "https://fcc.example.org/ontology/"
  aome: "https://aome.example.org/ontology/"
  constel: "https://constel.example.org/ontology/"
  cto: "https://cto.example.org/ontology/"
  sky_parlour: "https://sky-parlour.example.org/ontology/"
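A configuration file like this is worth validating before it is handed to the registry. The sketch below works on the already-parsed mapping (the YAML parsing step is left out) and applies two checks that are assumptions for illustration rather than rules FCC is known to enforce: every IRI ends in "/" or "#", and no two prefixes share an IRI. The helper name validate_namespaces is hypothetical.

```python
def validate_namespaces(config: dict) -> dict[str, str]:
    """Check a parsed namespaces config and return a prefix -> IRI mapping.

    Hypothetical helper: the checks are illustrative assumptions,
    not rules enforced by the FCC NamespaceRegistry.
    """
    namespaces = config.get("namespaces", {})
    seen_iris: dict[str, str] = {}
    for prefix, iri in namespaces.items():
        if not iri.endswith(("/", "#")):
            raise ValueError(f"{prefix}: IRI should end with '/' or '#': {iri}")
        if iri in seen_iris:
            raise ValueError(f"{prefix} and {seen_iris[iri]} share the IRI {iri}")
        seen_iris[iri] = prefix
    return dict(namespaces)


# Stand-in for the parsed contents of config/namespaces.yaml (abridged).
parsed = {
    "namespaces": {
        "fcc": "https://fcc.example.org/ontology/",
        "aome": "https://aome.example.org/ontology/",
        "sky_parlour": "https://sky-parlour.example.org/ontology/",
    }
}
namespaces = validate_namespaces(parsed)
print(sorted(namespaces))
```

Each validated pair can then be passed to registry.register(prefix, iri) at startup.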
Federating Knowledge Graphs¶
Federation operates at three levels:
Level 1: Cross-Project References¶
The simplest form of federation. One project's knowledge graph references entities from another project by their namespaced identifiers (written here in prefixed form, each of which expands to a full IRI):
fcc:artifact_001 fcc:classifiedBy aome:privacy_classifier .
fcc:session_42 constel:indexedIn constel:metadata_index .
These cross-references are created during simulation when FCC interacts with other projects via protocol integration (Chapter 5). No special federation infrastructure is required -- just consistent namespace usage.
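To make the idea concrete, the sketch below represents triples as plain tuples and picks out the edges that cross a namespace boundary. The tuple representation and both helper names are assumptions for illustration; the third triple is an invented same-namespace example added for contrast.

```python
Triple = tuple[str, str, str]

triples: list[Triple] = [
    ("fcc:artifact_001", "fcc:classifiedBy", "aome:privacy_classifier"),
    ("fcc:session_42", "constel:indexedIn", "constel:metadata_index"),
    # Illustrative same-namespace edge, not taken from a real graph.
    ("fcc:artifact_001", "fcc:createdBy", "fcc:research_analyst"),
]


def prefix_of(term: str) -> str:
    """Return the namespace prefix of a prefixed identifier."""
    return term.split(":", 1)[0]


def cross_namespace_edges(graph: list[Triple]) -> list[Triple]:
    """Keep triples whose subject and object belong to different namespaces."""
    return [t for t in graph if prefix_of(t[0]) != prefix_of(t[2])]


for edge in cross_namespace_edges(triples):
    print(edge)
```

Only the first two triples are printed, since the third stays inside the fcc namespace.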
Level 2: Federated Query¶
A federated query spans multiple knowledge graphs. The query planner routes sub-queries to the appropriate project's graph and combines the results:
from fcc.knowledge.federation import FederatedQuery
# namespace_registry is assumed to be the shared NamespaceRegistry instance configured at startup
query = FederatedQuery(registry=namespace_registry)
results = query.execute("""
    SELECT ?artifact ?persona ?classification
    WHERE {
        ?artifact fcc:createdBy ?persona .
        ?artifact aome:classifiedAs ?classification .
        FILTER (?classification = aome:PersonalData)
    }
""")
The query planner:
- Parses the query to identify which namespaces are referenced.
- Routes the fcc: portions to FCC's knowledge graph.
- Routes the aome: portions to AOME's knowledge graph.
- Joins the results on shared identifiers.
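The planner steps above can be sketched with in-memory stores. This is a toy model under stated assumptions, not the FederatedQuery implementation: patterns are routed by the namespace of their predicate, the FILTER is folded into a constant object, and the STORES data (including aome:PublicData) is invented for illustration.

```python
from typing import Optional

Triple = tuple[str, str, str]
Binding = dict[str, str]

# Hypothetical per-project stores, keyed by namespace prefix.
STORES: dict[str, list[Triple]] = {
    "fcc": [
        ("fcc:artifact_001", "fcc:createdBy", "fcc:research_analyst"),
        ("fcc:artifact_002", "fcc:createdBy", "fcc:research_analyst"),
    ],
    "aome": [
        ("fcc:artifact_001", "aome:classifiedAs", "aome:PersonalData"),
        ("fcc:artifact_002", "aome:classifiedAs", "aome:PublicData"),
    ],
}


def match(pattern: Triple, triple: Triple) -> Optional[Binding]:
    """Bind the variables (terms starting with '?') of one pattern to one triple."""
    binding: Binding = {}
    for term, value in zip(pattern, triple):
        if term.startswith("?"):
            if binding.setdefault(term, value) != value:
                return None
        elif term != value:
            return None
    return binding


def execute(patterns: list[Triple]) -> list[Binding]:
    results: list[Binding] = [{}]
    for pattern in patterns:
        # Route the pattern to the store named by its predicate's namespace.
        store = STORES[pattern[1].split(":", 1)[0]]
        matches = [b for t in store if (b := match(pattern, t)) is not None]
        # Join with earlier results on variables the two sides share.
        results = [
            {**left, **right}
            for left in results
            for right in matches
            if all(left.get(k, v) == v for k, v in right.items())
        ]
    return results


rows = execute([
    ("?artifact", "fcc:createdBy", "?persona"),
    ("?artifact", "aome:classifiedAs", "aome:PersonalData"),
])
print(rows)
# [{'?artifact': 'fcc:artifact_001', '?persona': 'fcc:research_analyst'}]
```

The join works only because both stores refer to the artifact by the same identifier, fcc:artifact_001, which is the point of the shared namespace scheme.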
Level 3: Merged Graph¶
For complex analytics, merge multiple knowledge graphs into a single unified graph:
from fcc.knowledge.federation import GraphMerger
merger = GraphMerger(registry=namespace_registry)
merged = merger.merge([
    fcc_ontology,
    aome_ontology,
    constel_ontology,
])
# Query the merged graph as a single entity
results = merged.query(subject_type="Artifact", predicate="classifiedAs")
Merging preserves all namespace prefixes, so entities from different projects remain distinguishable. Conflicts (two projects defining the same relationship with different semantics) are detected and reported.
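How GraphMerger detects conflicts is not described here. One plausible reading, sketched below, is that each project declares a signature for the relationships it uses and the merge reports any predicate declared with two different signatures. ProjectGraph, the (domain, range) signature, and the sample identifiers are all hypothetical.

```python
from dataclasses import dataclass, field

Triple = tuple[str, str, str]


@dataclass
class ProjectGraph:
    """Hypothetical container: triples plus the relationship definitions a project declares."""
    name: str
    triples: set[Triple] = field(default_factory=set)
    # predicate -> (domain, range), standing in for the relationship's semantics
    relations: dict[str, tuple[str, str]] = field(default_factory=dict)


def merge(graphs: list[ProjectGraph]) -> tuple[set[Triple], list[str]]:
    """Union all triples and report predicates defined differently by two projects."""
    merged: set[Triple] = set()
    defined: dict[str, tuple[str, tuple[str, str]]] = {}
    conflicts: list[str] = []
    for graph in graphs:
        merged |= graph.triples
        for predicate, signature in graph.relations.items():
            owner, known = defined.setdefault(predicate, (graph.name, signature))
            if known != signature:
                conflicts.append(
                    f"{predicate}: {owner} says {known}, {graph.name} says {signature}"
                )
    return merged, conflicts


fcc_graph = ProjectGraph(
    "fcc",
    {("fcc:artifact_001", "shared:owner", "fcc:project")},
    {"shared:owner": ("Entity", "Project")},
)
aome_graph = ProjectGraph(
    "aome",
    {("aome:privacy_classifier", "shared:owner", "aome:team")},
    {"shared:owner": ("Entity", "Team")},
)
merged_triples, conflicts = merge([fcc_graph, aome_graph])
print(len(merged_triples), conflicts)
```

Both triples survive the merge with their prefixes intact, and the differing definitions of shared:owner are reported rather than silently reconciled.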
Shared Vocabulary¶
Federation works best when projects share vocabulary for common concepts. The FCC ecosystem defines shared vocabulary terms in a dedicated namespace, abbreviated with the shared: prefix.
Shared terms include:
- shared:createdAt -- ISO 8601 timestamp
- shared:version -- semantic version string
- shared:status -- lifecycle status (draft, active, archived)
- shared:owner -- the project that owns the entity
- shared:license -- the license under which the entity is published
Using shared vocabulary for cross-cutting concepts ensures that federated queries can join on common fields without project-specific translation.
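As a small illustration, the sketch below filters entities from two projects on shared:status with a single comparison and no per-project field mapping. The dictionaries and the with_status helper are hypothetical; only the shared: term names come from the list above.

```python
# Hypothetical entity descriptions from two projects, both using shared: terms.
fcc_entities = {
    "fcc:artifact_001": {"shared:status": "active", "shared:version": "1.2.0"},
    "fcc:artifact_002": {"shared:status": "archived", "shared:version": "0.9.1"},
}
aome_entities = {
    "aome:privacy_classifier": {"shared:status": "active", "shared:version": "2.0.0"},
}


def with_status(status: str, *graphs: dict[str, dict[str, str]]) -> list[str]:
    """Return identifiers from any project whose shared:status equals `status`."""
    return sorted(
        entity
        for graph in graphs
        for entity, props in graph.items()
        if props.get("shared:status") == status
    )


active = with_status("active", fcc_entities, aome_entities)
print(active)
# ['aome:privacy_classifier', 'fcc:artifact_001']
```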
Federation Patterns¶
Pattern 1: Hub-and-Spoke¶
FCC acts as the hub. All projects publish their knowledge graphs to FCC, which merges and indexes them. Queries go through FCC.
Pros: Simple query routing, single point of coordination. Cons: FCC becomes a bottleneck, single point of failure.
Pattern 2: Peer-to-Peer¶
Each project maintains its own knowledge graph and responds to federated queries directly. A query planner routes sub-queries to the appropriate project.
Pros: No single point of failure, each project controls its own data. Cons: More complex query routing, potential consistency issues.
Pattern 3: CONSTEL as Mediator¶
CONSTEL acts as the federation mediator. Projects publish metadata summaries to CONSTEL, which maintains a global index of what exists where. Full queries are routed to the owning projects.
Pros: Lightweight metadata sharing, projects retain data ownership. Cons: Requires CONSTEL infrastructure, metadata may be stale.
The FCC ecosystem uses Pattern 3 as the default. CONSTEL's metadata indexing is already built for cross-project coordination, making it a natural choice for federation mediation.
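The mediator idea can be sketched in a few lines: projects publish lightweight summaries to an index, and the index is consulted only to decide where the full query should go. MetadataIndex, route, and the callable backends are hypothetical stand-ins, not CONSTEL's actual interface.

```python
from collections import defaultdict


class MetadataIndex:
    """Hypothetical stand-in for CONSTEL's global index of what exists where."""

    def __init__(self) -> None:
        self._owners: dict[str, set[str]] = defaultdict(set)

    def publish(self, project: str, entity_types: list[str]) -> None:
        # Projects publish summaries (here: entity type names), not full graphs.
        for entity_type in entity_types:
            self._owners[entity_type].add(project)

    def owners_of(self, entity_type: str) -> list[str]:
        return sorted(self._owners.get(entity_type, set()))


def route(index: MetadataIndex, entity_type: str, backends: dict) -> dict:
    """Send the full query only to the projects the index lists as owners."""
    return {p: backends[p](entity_type) for p in index.owners_of(entity_type)}


index = MetadataIndex()
index.publish("fcc", ["Artifact", "Session"])
index.publish("aome", ["Classifier"])

# Each backend is a callable that answers a query against one project's graph.
backends = {
    "fcc": lambda t: [f"fcc:{t.lower()}_001"],
    "aome": lambda t: [f"aome:{t.lower()}_001"],
}
routed = route(index, "Artifact", backends)
print(routed)
# {'fcc': ['fcc:artifact_001']}
```

The staleness risk noted above shows up here directly: if a project adds a new entity type without calling publish again, route never reaches it.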
Security and Privacy¶
Federated queries can inadvertently expose sensitive information across project boundaries. The federation layer includes access control:
query = FederatedQuery(
    registry=namespace_registry,
    access_policy={
        "aome": ["read_public"],  # Only access public AOME data
        "fcc": ["read_all"],      # Full access to FCC data
    },
)
AOME's privacy classifications are particularly important here. If an artifact is classified as containing personal data, federated queries from other projects may be restricted to metadata-only access (the artifact exists and has these properties, but you cannot read its content).
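One way to picture metadata-only access is a read function that checks the policy for the entity's owning project before deciding what to return. The sketch below is an assumption about how such enforcement might look, not the federation layer's implementation; RECORDS, its "public" flag (standing in for AOME's classification), and the read helper are invented for illustration.

```python
# Hypothetical AOME records; "public" and "content" are illustrative fields.
RECORDS = {
    "aome:report_7": {"public": True, "content": "aggregate statistics"},
    "aome:case_12": {"public": False, "content": "personal details"},
}


def read(entity: str, policy: dict[str, list[str]]) -> dict:
    """Return full content or metadata only, depending on the policy for the owning project."""
    project = entity.split(":", 1)[0]
    permissions = policy.get(project, [])
    if not permissions:
        raise PermissionError(f"no access to {project} data")
    record = RECORDS[entity]
    if "read_all" in permissions or record["public"]:
        return {"id": entity, "content": record["content"]}
    # Metadata-only: the caller learns the entity exists, but its content is withheld.
    return {"id": entity, "content": None}


policy = {"aome": ["read_public"], "fcc": ["read_all"]}
print(read("aome:report_7", policy))
print(read("aome:case_12", policy))
```

With the policy shown, the public report is returned in full while the restricted case comes back with its content withheld.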
Key Takeaways¶
- Federation enables cross-project knowledge graph queries without centralized storage.
- Namespace IRIs prevent identifier collisions and enable global uniqueness.
- The NamespaceRegistry manages registration, resolution, and compaction of namespaces.
- Three federation levels: cross-project references, federated query, and merged graph.
- The CONSTEL-mediated pattern (Pattern 3, with CONSTEL as metadata mediator) is the default for the FCC ecosystem.
- Access control ensures sensitive data is not inadvertently exposed across project boundaries.
Cross-References¶
- Chapter 5: Protocol Integration -- the transport layer for federation
- Chapter 2: Knowledge Graphs -- single-project knowledge graphs
- ADR-005: Federated KG Namespace -- design rationale
- FCC Guidebook, Chapter 17 -- federation reference
- Book 1, Chapter 6: Ecosystem Overview -- ecosystem projects