Cross-Project Federation¶
Duration: 75 minutes
Difficulty: Advanced
Pattern: Federated Team + Cross-Domain Bridge
This scenario demonstrates constructing a federated knowledge graph across multiple projects, with entity resolution, namespace registration, and cross-project collaboration.
Scenario Overview¶
Problem: Three projects in the FCC ecosystem need to share knowledge about data standards, but each uses different terminology and schema conventions. A federated knowledge graph must bridge these differences.
Goal: Register project namespaces, resolve entities across vocabularies, construct individual and federated knowledge graphs, and link entities with cross-namespace edges.
Persona Team¶
| Persona | ID | Role | Category |
|---|---|---|---|
| Standards Compliance | STC | Defines standards and compliance mappings | governance |
| Integration Specialist | ILS | Bridges cross-project integration | integration |
| Open Access Advocate | OAA | Ensures open access and interoperability | open_science |
| Schema Design Expert | SDE | Designs cross-project schema alignment | protocol_engineering |
Setup¶
from fcc.personas.registry import PersonaRegistry
from fcc.simulation.engine import SimulationEngine
from fcc.simulation.messages import SimulationMessage
from fcc.messaging.bus import EventBus
from fcc.federation.registry import FederationRegistry
from fcc.federation.namespaces import NamespaceConfig
from fcc.federation.entities import EntityResolver
from fcc.federation.change_tracking import ChangeTracker, ModelChange
from fcc.objectmodel.mapping import VocabularyMapping
from fcc.knowledge.graph import KnowledgeGraph
from fcc.knowledge.models import KnowledgeNode, KnowledgeEdge, NodeType, EdgeType
from fcc.knowledge.federation import FederatedKnowledgeGraph
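# Load persona definitions from the packaged YAML directory
registry = PersonaRegistry.from_yaml_directory("src/fcc/data/personas")
# Event bus for federation events
bus = EventBus()
# Deterministic mode keeps simulation runs reproducible
engine = SimulationEngine(registry=registry, mode="deterministic")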
Phase 1: Namespace Registration¶
The Standards Compliance persona defines project namespaces:
# Define namespace configurations for three projects
namespaces = {
    "fcc": NamespaceConfig(
        namespace="fcc", prefix="fcc",
        base_uri="https://fcc.example.org/ontology/",
        version="0.8.0",
        description="FCC Agent Team Framework",
    ),
    "research_hub": NamespaceConfig(
        namespace="research_hub", prefix="rh",
        base_uri="https://research-hub.example.org/ontology/",
        version="2.1.0",
        description="Research collaboration platform",
    ),
    "data_catalog": NamespaceConfig(
        namespace="data_catalog", prefix="dc",
        base_uri="https://data-catalog.example.org/ontology/",
        version="1.5.0",
        description="Enterprise data catalog",
    ),
}
# Register all projects in the federation
federation = FederationRegistry()
for ns_id, ns_config in namespaces.items():
    federation.add_project(ns_id, namespace_config=ns_config)
print(f"Registered projects: {federation.all_projects()}")
print(f"Namespace count: {federation.namespace_registry.count()}")
# STC defines the standards alignment scope
stc_result = engine.step(SimulationMessage(
    sender="orchestrator", receiver="STC",
    content=(
        "Define the standards alignment scope for three projects: "
        "FCC, Research Hub, and Data Catalog. "
        "Identify shared concepts: personas, workflows, data assets, "
        "and governance policies that need cross-project alignment."
    ),
    phase="find",
))
print(f"STC alignment scope: {len(stc_result.content)} chars")
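To make the role of `prefix` and `base_uri` concrete, the standalone sketch below shows how a prefixed identifier such as `rh:researcher` could be expanded into a full URI. It deliberately does not use the FCC API: the `Namespace` class and the `expand_curie` function are hypothetical names introduced for illustration, and whether `NamespaceConfig` performs this kind of expansion is an assumption, not something this scenario confirms.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Namespace:
    """Hypothetical stand-in for a namespace entry (not the FCC NamespaceConfig)."""
    prefix: str
    base_uri: str


def expand_curie(curie: str, namespaces: dict) -> str:
    """Expand 'prefix:local_id' to a full URI using the registered base URI."""
    prefix, sep, local_id = curie.partition(":")
    if not sep or not local_id:
        raise ValueError(f"not a prefixed identifier: {curie!r}")
    if prefix not in namespaces:
        raise KeyError(f"unregistered prefix: {prefix!r}")
    return namespaces[prefix].base_uri + local_id


# Prefixes and base URIs copied from the namespace configurations above.
by_prefix = {
    "fcc": Namespace("fcc", "https://fcc.example.org/ontology/"),
    "rh": Namespace("rh", "https://research-hub.example.org/ontology/"),
    "dc": Namespace("dc", "https://data-catalog.example.org/ontology/"),
}

print(expand_curie("rh:researcher", by_prefix))
```

Keying the lookup by prefix rather than by project name keeps identifiers short while still making them globally unique once expanded.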
Phase 2: Cross-Project Entity Resolution¶
The Integration Specialist maps entities across vocabularies:
# Define vocabulary mappings between projects
mappings = [
    VocabularyMapping(
        source_id="RC", source_name="Research Crafter",
        source_vocabulary="fcc",
        target_id="researcher", target_name="Research Analyst",
        target_vocabulary="research_hub", similarity_score=0.87,
    ),
    VocabularyMapping(
        source_id="DGS", source_name="Data Governance Steward",
        source_vocabulary="fcc",
        target_id="data_steward", target_name="Data Steward",
        target_vocabulary="data_catalog", similarity_score=0.92,
    ),
    VocabularyMapping(
        source_id="CIA", source_name="Catalog Indexer Analyst",
        source_vocabulary="fcc",
        target_id="cataloger", target_name="Data Cataloger",
        target_vocabulary="data_catalog", similarity_score=0.85,
    ),
    VocabularyMapping(
        source_id="FDS", source_name="FAIR Data Steward",
        source_vocabulary="fcc",
        target_id="fair_manager", target_name="FAIR Compliance Manager",
        target_vocabulary="research_hub", similarity_score=0.80,
    ),
]
# Register mappings with the federation resolver
for mapping in mappings:
    federation.entity_resolver.add_mapping(mapping)
print(f"Registered mappings: {federation.entity_resolver.mapping_count()}")
# ILS resolves entities across all projects
ils_result = engine.step(SimulationMessage(
    sender="STC", receiver="ILS",
    content=(
        "Map the following FCC entities to their equivalents in "
        "Research Hub and Data Catalog: RC, DGS, CIA, FDS. "
        "Document confidence scores and any terminology gaps."
    ),
    phase="create",
))
# Demonstrate resolution
for entity_id in ["RC", "DGS", "CIA", "FDS"]:
    resolved = federation.resolve_across_projects(entity_id, "fcc")
    for r in resolved:
        print(f" {entity_id} -> {r.canonical_id} "
              f"(confidence={r.confidence:.0%})")
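The resolution step can be pictured as a filtered lookup over the registered mappings. The sketch below is a minimal, library-independent illustration of that idea; `Mapping`, `resolve`, and the `min_score` threshold are hypothetical names, and the actual behaviour of `resolve_across_projects` may differ. The IDs and scores are the ones defined in the mappings above.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Mapping:
    """Hypothetical, simplified stand-in for a vocabulary mapping."""
    source_vocabulary: str
    source_id: str
    target_vocabulary: str
    target_id: str
    similarity_score: float


def resolve(entity_id, vocabulary, mappings, min_score=0.0):
    """Return mappings for (vocabulary, entity_id), highest score first."""
    hits = [
        m for m in mappings
        if m.source_vocabulary == vocabulary
        and m.source_id == entity_id
        and m.similarity_score >= min_score
    ]
    return sorted(hits, key=lambda m: m.similarity_score, reverse=True)


demo_mappings = [
    Mapping("fcc", "RC", "research_hub", "researcher", 0.87),
    Mapping("fcc", "DGS", "data_catalog", "data_steward", 0.92),
    Mapping("fcc", "CIA", "data_catalog", "cataloger", 0.85),
    Mapping("fcc", "FDS", "research_hub", "fair_manager", 0.80),
]

# With a 0.85 threshold the FDS mapping (0.80) is filtered out.
for entity_id in ["RC", "DGS", "CIA", "FDS"]:
    for m in resolve(entity_id, "fcc", demo_mappings, min_score=0.85):
        print(f"{entity_id} -> {m.target_vocabulary}:{m.target_id} "
              f"({m.similarity_score:.0%})")
```

A threshold like this is one way to keep low-confidence matches out of downstream steps; the right cut-off depends on how the similarity scores were produced.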
Phase 3: Knowledge Graph Construction¶
Build individual knowledge graphs for each project:
# FCC knowledge graph (from persona registry)
from fcc.knowledge.builders import build_persona_graph
fcc_graph = build_persona_graph(registry)
print(f"FCC graph: {fcc_graph.node_count} nodes, "
      f"{fcc_graph.edge_count} edges")
# Research Hub knowledge graph (simulated)
rh_graph = KnowledgeGraph()
rh_entities = [
    ("researcher", "Research Analyst"),
    ("fair_manager", "FAIR Compliance Manager"),
    ("pub_coordinator", "Publication Coordinator"),
]
for eid, label in rh_entities:
    rh_graph.add_node(KnowledgeNode(
        node_id=eid, node_type=NodeType.PERSONA,
        label=label, namespace="research_hub",
    ))
rh_graph.add_edge(KnowledgeEdge(
    source_id="researcher", target_id="fair_manager",
    edge_type=EdgeType.INTERACTS_WITH,
))
print(f"Research Hub graph: {rh_graph.node_count} nodes, "
      f"{rh_graph.edge_count} edges")
# Data Catalog knowledge graph (simulated)
dc_graph = KnowledgeGraph()
dc_entities = [
    ("data_steward", "Data Steward"),
    ("cataloger", "Data Cataloger"),
    ("schema_admin", "Schema Administrator"),
]
for eid, label in dc_entities:
    dc_graph.add_node(KnowledgeNode(
        node_id=eid, node_type=NodeType.PERSONA,
        label=label, namespace="data_catalog",
    ))
dc_graph.add_edge(KnowledgeEdge(
    source_id="data_steward", target_id="cataloger",
    edge_type=EdgeType.ORCHESTRATES,
))
print(f"Data Catalog graph: {dc_graph.node_count} nodes, "
      f"{dc_graph.edge_count} edges")
Phase 4: Federated Knowledge Graph¶
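Register the three graphs in a federated graph, link them with cross-namespace edges derived from the Phase 2 mappings, then merge them into one unified graph: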
# Create the federated knowledge graph
fkg = FederatedKnowledgeGraph(local_namespace="fcc")
fkg.register_local(fcc_graph)
fkg.register_remote("research_hub", rh_graph)
fkg.register_remote("data_catalog", dc_graph)
# Add cross-namespace edges based on entity resolution
for mapping in mappings:
    fkg.add_cross_edge(
        source_namespace=mapping.source_vocabulary,
        source_id=mapping.source_id,
        target_namespace=mapping.target_vocabulary,
        target_id=mapping.target_id,
        edge_type=EdgeType.FEDERATION_LINK,
    )
print(f"\nFederated graph namespaces: {fkg.namespaces()}")
print(f"Cross-namespace edges: {len(fkg.cross_project_edges())}")
# Merge into a single unified graph
unified = fkg.merge_federated()
print(f"Unified graph: {unified.node_count} nodes, "
      f"{unified.edge_count} edges")
# Get federation statistics
stats = fkg.stats()
for key, value in sorted(stats.items()):
    print(f" {key}: {value}")
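One common way to merge graphs from different projects without ID collisions is to qualify every node ID with its namespace before combining them. The sketch below illustrates that approach with plain dictionaries; `qualify` and `merge_graphs` are hypothetical helpers, and whether `merge_federated()` uses this exact scheme is an assumption, not something stated in this scenario.

```python
def qualify(namespace: str, node_id: str) -> str:
    """Build a collision-free node key such as 'research_hub:researcher'."""
    return f"{namespace}:{node_id}"


def merge_graphs(graphs, cross_edges):
    """Merge per-namespace graphs into one node dict and one edge list.

    graphs: {namespace: {"nodes": {node_id: label}, "edges": [(src, dst)]}}
    cross_edges: [((src_ns, src_id), (dst_ns, dst_id))]
    """
    nodes, edges = {}, []
    for ns, graph in graphs.items():
        for node_id, label in graph["nodes"].items():
            nodes[qualify(ns, node_id)] = label
        for src, dst in graph["edges"]:
            edges.append((qualify(ns, src), qualify(ns, dst)))
    for (src_ns, src_id), (dst_ns, dst_id) in cross_edges:
        src, dst = qualify(src_ns, src_id), qualify(dst_ns, dst_id)
        # Skip links with a missing endpoint instead of creating dangling edges.
        if src in nodes and dst in nodes:
            edges.append((src, dst))
    return nodes, edges


graphs = {
    "fcc": {
        "nodes": {"RC": "Research Crafter", "FDS": "FAIR Data Steward"},
        "edges": [],
    },
    "research_hub": {
        "nodes": {
            "researcher": "Research Analyst",
            "fair_manager": "FAIR Compliance Manager",
        },
        "edges": [("researcher", "fair_manager")],
    },
}
cross_edges = [
    (("fcc", "RC"), ("research_hub", "researcher")),
    (("fcc", "FDS"), ("research_hub", "fair_manager")),
    (("fcc", "DGS"), ("data_catalog", "data_steward")),  # endpoints absent in this demo
]

nodes, edges = merge_graphs(graphs, cross_edges)
print(len(nodes), "nodes,", len(edges), "edges")
```

Dropping cross edges with a missing endpoint is a design choice made for this sketch; a stricter implementation might raise an error so that stale mappings are noticed early.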
Phase 5: Schema Alignment¶
The Schema Design Expert validates the alignment:
sde_result = engine.step(SimulationMessage(
    sender="ILS", receiver="SDE",
    content=(
        f"Validate the cross-project schema alignment:\n"
        f"- FCC: {fcc_graph.node_count} nodes, {fcc_graph.edge_count} edges\n"
        f"- Research Hub: {rh_graph.node_count} nodes\n"
        f"- Data Catalog: {dc_graph.node_count} nodes\n"
        f"- Cross edges: {len(fkg.cross_project_edges())}\n\n"
        "Check for: schema conflicts, unmapped entities, "
        "type mismatches, and missing relationships. "
        "Produce a validation report."
    ),
    phase="critique",
))
print(f"Schema validation: {len(sde_result.content)} chars")
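One of the checks requested from SDE, finding unmapped entities, can also be done mechanically. The following sketch lists the remote nodes that no cross-namespace edge touches, using the node IDs and mappings from the earlier phases. `find_unmapped` is a hypothetical helper written for this illustration and is not part of the FCC API.

```python
def find_unmapped(graph_nodes, cross_edges):
    """Return, per namespace, the node IDs that no cross-namespace edge touches."""
    linked = set()
    for source, target in cross_edges:
        linked.add(source)
        linked.add(target)
    return {
        ns: sorted(node_id for node_id in node_ids if (ns, node_id) not in linked)
        for ns, node_ids in graph_nodes.items()
    }


# Node IDs from the simulated Research Hub and Data Catalog graphs.
graph_nodes = {
    "research_hub": ["researcher", "fair_manager", "pub_coordinator"],
    "data_catalog": ["data_steward", "cataloger", "schema_admin"],
}
# The four Phase 2 mappings, written as (namespace, id) pairs.
cross_edges = [
    (("fcc", "RC"), ("research_hub", "researcher")),
    (("fcc", "DGS"), ("data_catalog", "data_steward")),
    (("fcc", "CIA"), ("data_catalog", "cataloger")),
    (("fcc", "FDS"), ("research_hub", "fair_manager")),
]

unmapped = find_unmapped(graph_nodes, cross_edges)
print(unmapped)
```

For the data in this scenario, the entities left without an FCC counterpart are `pub_coordinator` and `schema_admin`, which is the kind of gap a validation report would be expected to flag.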
Phase 6: Change Tracking¶
Record the federation activities for audit:
tracker = federation.change_tracker
for mapping in mappings:
    tracker.track_change(ModelChange(
        entity_id=mapping.source_id,
        change_type="added",
        namespace="federation",
        new_value={
            "mapping": f"{mapping.source_vocabulary}:{mapping.source_id} -> "
                       f"{mapping.target_vocabulary}:{mapping.target_id}",
            "confidence": mapping.similarity_score,
        },
    ))
changeset = tracker.create_changeset(
    description="Initial cross-project federation mapping",
    source_namespace="fcc",
)
print(f"\nChange set: {len(changeset.changes)} changes")
print(f" Added: {changeset.added_count}")
# Health check
health = federation.health_check()
print("\nFederation health:")
print(f" Projects: {health['project_count']}")
print(f" Mappings: {health['mapping_count']}")
print(f" Change history: {health['change_history_count']}")
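The track-then-changeset pattern used above amounts to buffering individual changes and summarising them when a changeset is cut. The sketch below shows that idea in isolation; `Change`, `ChangeLog`, `track`, and `flush` are hypothetical names, and the real `ChangeTracker` may store or report changes differently.

```python
from collections import Counter
from dataclasses import dataclass, field


@dataclass(frozen=True)
class Change:
    """Hypothetical, simplified change record."""
    entity_id: str
    change_type: str  # for example "added", "modified", or "removed"


@dataclass
class ChangeLog:
    """Buffers changes until they are summarised into a changeset."""
    pending: list = field(default_factory=list)

    def track(self, change: Change) -> None:
        self.pending.append(change)

    def flush(self) -> dict:
        """Count pending changes by type, then clear the buffer."""
        summary = dict(Counter(c.change_type for c in self.pending))
        self.pending.clear()
        return summary


log = ChangeLog()
for entity_id in ["RC", "DGS", "CIA", "FDS"]:
    log.track(Change(entity_id, "added"))

summary = log.flush()
print(summary)
```

Clearing the buffer on `flush` means each change belongs to exactly one changeset, which keeps the audit trail free of duplicates.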
Exercises¶
- Export federated graph: Serialize the unified graph to JSON-LD and Turtle formats for sharing with external systems.
- Bidirectional mapping: Create reverse mappings (research_hub -> fcc) and verify bidirectional resolution.
- Conflict resolution: Add conflicting mappings and implement a resolution strategy based on confidence scores.
- RAG over federation: Build a RAG pipeline that indexes all three project graphs and answers cross-project questions.
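As a possible starting point for the bidirectional mapping and conflict resolution exercises, the sketch below reverses a list of mappings and keeps only the highest-confidence target per source. It uses plain tuples rather than the FCC classes; `pick_best` and `reverse` are hypothetical helpers, and the 0.55 mapping is an invented conflict added purely for illustration.

```python
def pick_best(mappings):
    """Keep the highest-confidence target for each source entity.

    On a tie, the mapping seen first is kept.
    """
    best = {}
    for source, target, score in mappings:
        if source not in best or score > best[source][1]:
            best[source] = (target, score)
    return best


def reverse(mappings):
    """Swap source and target to obtain the reverse direction."""
    return [(target, source, score) for source, target, score in mappings]


candidates = [
    ("fcc:RC", "research_hub:researcher", 0.87),
    ("fcc:RC", "research_hub:pub_coordinator", 0.55),  # invented conflicting mapping
    ("fcc:FDS", "research_hub:fair_manager", 0.80),
]

best = pick_best(candidates)
print(best["fcc:RC"])
print(pick_best(reverse(candidates))["research_hub:researcher"])
```

Reusing the forward score for the reverse direction assumes similarity is symmetric, which may not hold for every vocabulary pair.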
Summary¶
In this scenario you executed a cross-project federation:
- STC defined the standards alignment scope across three projects
- ILS mapped entities across vocabularies with confidence scores
- Individual knowledge graphs were constructed for each project
- A federated knowledge graph merged all graphs with cross-namespace edges
- SDE validated schema alignment
- Change tracking provided an audit trail for all federation activities
Next Steps¶
- Federation Tutorial -- Deep dive into the federation module
- Knowledge Graphs Tutorial -- Advanced graph operations