Cross-Project Federation

Duration: 75 minutes
Difficulty: Advanced
Pattern: Federated Team + Cross-Domain Bridge

This scenario walks through building a federated knowledge graph across multiple projects, covering namespace registration, entity resolution, and cross-project collaboration.

Scenario Overview

Problem: Three projects in the FCC ecosystem need to share knowledge about data standards, but each uses different terminology and schema conventions. A federated knowledge graph must bridge these differences.

Goal: Register project namespaces, resolve entities across vocabularies, construct individual and federated knowledge graphs, and link entities with cross-namespace edges.

Persona Team

Persona                 ID   Role                                       Category
Standards Compliance    STC  Defines standards and compliance mappings  governance
Integration Specialist  ILS  Bridges cross-project integration          integration
Open Access Advocate    OAA  Ensures open access and interoperability   open_science
Schema Design Expert    SDE  Designs cross-project schema alignment     protocol_engineering

Setup

from fcc.personas.registry import PersonaRegistry
from fcc.simulation.engine import SimulationEngine
from fcc.simulation.messages import SimulationMessage
from fcc.messaging.bus import EventBus
from fcc.federation.registry import FederationRegistry
from fcc.federation.namespaces import NamespaceConfig
from fcc.federation.entities import EntityResolver
from fcc.federation.change_tracking import ChangeTracker, ModelChange
from fcc.objectmodel.mapping import VocabularyMapping
from fcc.knowledge.graph import KnowledgeGraph
from fcc.knowledge.models import KnowledgeNode, KnowledgeEdge, NodeType, EdgeType
from fcc.knowledge.federation import FederatedKnowledgeGraph

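# Load persona definitions from the YAML directory
registry = PersonaRegistry.from_yaml_directory("src/fcc/data/personas")
# Event bus instance (not referenced again in this scenario)
bus = EventBus()
# Simulation engine that routes messages to the personas in the registry
engine = SimulationEngine(registry=registry, mode="deterministic")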

Phase 1: Namespace Registration

The Standards Compliance persona defines project namespaces:

# Define namespace configurations for three projects
namespaces = {
    "fcc": NamespaceConfig(
        namespace="fcc", prefix="fcc",
        base_uri="https://fcc.example.org/ontology/",
        version="0.8.0",
        description="FCC Agent Team Framework",
    ),
    "research_hub": NamespaceConfig(
        namespace="research_hub", prefix="rh",
        base_uri="https://research-hub.example.org/ontology/",
        version="2.1.0",
        description="Research collaboration platform",
    ),
    "data_catalog": NamespaceConfig(
        namespace="data_catalog", prefix="dc",
        base_uri="https://data-catalog.example.org/ontology/",
        version="1.5.0",
        description="Enterprise data catalog",
    ),
}

# Register all projects in the federation
federation = FederationRegistry()
for ns_id, ns_config in namespaces.items():
    federation.add_project(ns_id, namespace_config=ns_config)

print(f"Registered projects: {federation.all_projects()}")
print(f"Namespace count: {federation.namespace_registry.count()}")

# STC defines the standards alignment scope
stc_result = engine.step(SimulationMessage(
    sender="orchestrator", receiver="STC",
    content=(
        "Define the standards alignment scope for three projects: "
        "FCC, Research Hub, and Data Catalog. "
        "Identify shared concepts: personas, workflows, data assets, "
        "and governance policies that need cross-project alignment."
    ),
    phase="find",
))
print(f"STC alignment scope: {len(stc_result.content)} chars")
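
To make the role of prefix and base_uri concrete, here is a minimal standalone sketch of how a prefixed identifier such as rh:researcher could expand to a full URI. It does not use the fcc package: the Namespace dataclass and the expand helper are hypothetical names introduced for illustration, and the real NamespaceConfig may behave differently.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Namespace:
    """Hypothetical stand-in for a namespace entry (not the fcc API)."""
    prefix: str
    base_uri: str


# Same prefixes and base URIs as the three configurations above.
NAMESPACES = {
    ns.prefix: ns
    for ns in (
        Namespace("fcc", "https://fcc.example.org/ontology/"),
        Namespace("rh", "https://research-hub.example.org/ontology/"),
        Namespace("dc", "https://data-catalog.example.org/ontology/"),
    )
}


def expand(curie: str) -> str:
    """Expand a 'prefix:local_id' identifier into a full URI."""
    prefix, sep, local_id = curie.partition(":")
    if not sep or prefix not in NAMESPACES:
        raise ValueError(f"Unknown or missing prefix in {curie!r}")
    return NAMESPACES[prefix].base_uri + local_id


print(expand("rh:researcher"))
print(expand("dc:data_steward"))
```

Keeping the prefix short and the base URI versioned separately means identifiers stay stable in messages while the full URI remains globally unique.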

Phase 2: Cross-Project Entity Resolution

The Integration Specialist maps entities across vocabularies:

# Define vocabulary mappings between projects
mappings = [
    VocabularyMapping(
        source_id="RC", source_name="Research Crafter",
        source_vocabulary="fcc",
        target_id="researcher", target_name="Research Analyst",
        target_vocabulary="research_hub", similarity_score=0.87,
    ),
    VocabularyMapping(
        source_id="DGS", source_name="Data Governance Steward",
        source_vocabulary="fcc",
        target_id="data_steward", target_name="Data Steward",
        target_vocabulary="data_catalog", similarity_score=0.92,
    ),
    VocabularyMapping(
        source_id="CIA", source_name="Catalog Indexer Analyst",
        source_vocabulary="fcc",
        target_id="cataloger", target_name="Data Cataloger",
        target_vocabulary="data_catalog", similarity_score=0.85,
    ),
    VocabularyMapping(
        source_id="FDS", source_name="FAIR Data Steward",
        source_vocabulary="fcc",
        target_id="fair_manager", target_name="FAIR Compliance Manager",
        target_vocabulary="research_hub", similarity_score=0.80,
    ),
]

# Register mappings with the federation resolver
for mapping in mappings:
    federation.entity_resolver.add_mapping(mapping)

print(f"Registered mappings: {federation.entity_resolver.mapping_count()}")

# ILS resolves entities across all projects
ils_result = engine.step(SimulationMessage(
    sender="STC", receiver="ILS",
    content=(
        "Map the following FCC entities to their equivalents in "
        "Research Hub and Data Catalog: RC, DGS, CIA, FDS. "
        "Document confidence scores and any terminology gaps."
    ),
    phase="create",
))
print(f"ILS entity mapping: {len(ils_result.content)} chars")

# Demonstrate resolution
for entity_id in ["RC", "DGS", "CIA", "FDS"]:
    resolved = federation.resolve_across_projects(entity_id, "fcc")
    for r in resolved:
        print(f"  {entity_id} -> {r.canonical_id} "
              f"(confidence={r.confidence:.0%})")
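
The resolution step can be pictured as a lookup over the registered mappings. The sketch below is a simplified, standalone model of that idea, not the fcc resolver: the Mapping dataclass, the index, and the resolve function (including its min_confidence parameter) are assumptions made for illustration.

```python
from collections import defaultdict
from dataclasses import dataclass


@dataclass(frozen=True)
class Mapping:
    """Hypothetical simplified record (not fcc's VocabularyMapping)."""
    source_vocabulary: str
    source_id: str
    target_vocabulary: str
    target_id: str
    similarity_score: float


MAPPINGS = [
    Mapping("fcc", "RC", "research_hub", "researcher", 0.87),
    Mapping("fcc", "DGS", "data_catalog", "data_steward", 0.92),
    Mapping("fcc", "CIA", "data_catalog", "cataloger", 0.85),
    Mapping("fcc", "FDS", "research_hub", "fair_manager", 0.80),
]

# Index mappings by (vocabulary, id) so resolution is a dictionary lookup.
index = defaultdict(list)
for m in MAPPINGS:
    index[(m.source_vocabulary, m.source_id)].append(m)


def resolve(entity_id: str, vocabulary: str, min_confidence: float = 0.0):
    """Return matching mappings, highest similarity first."""
    hits = [
        m for m in index.get((vocabulary, entity_id), [])
        if m.similarity_score >= min_confidence
    ]
    return sorted(hits, key=lambda m: m.similarity_score, reverse=True)


for entity_id in ["RC", "DGS", "CIA", "FDS"]:
    for m in resolve(entity_id, "fcc"):
        print(f"{entity_id} -> {m.target_vocabulary}:{m.target_id} "
              f"(confidence={m.similarity_score:.0%})")
```

A confidence threshold such as min_confidence is one way to keep weak matches out of downstream cross-namespace edges.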

Phase 3: Knowledge Graph Construction

Build individual knowledge graphs for each project:

# FCC knowledge graph (from persona registry)
from fcc.knowledge.builders import build_persona_graph

fcc_graph = build_persona_graph(registry)
print(f"FCC graph: {fcc_graph.node_count} nodes, "
      f"{fcc_graph.edge_count} edges")

# Research Hub knowledge graph (simulated)
rh_graph = KnowledgeGraph()
rh_entities = [
    ("researcher", "Research Analyst"),
    ("fair_manager", "FAIR Compliance Manager"),
    ("pub_coordinator", "Publication Coordinator"),
]
for eid, label in rh_entities:
    rh_graph.add_node(KnowledgeNode(
        node_id=eid, node_type=NodeType.PERSONA,
        label=label, namespace="research_hub",
    ))
rh_graph.add_edge(KnowledgeEdge(
    source_id="researcher", target_id="fair_manager",
    edge_type=EdgeType.INTERACTS_WITH,
))
print(f"Research Hub graph: {rh_graph.node_count} nodes, "
      f"{rh_graph.edge_count} edges")

# Data Catalog knowledge graph (simulated)
dc_graph = KnowledgeGraph()
dc_entities = [
    ("data_steward", "Data Steward"),
    ("cataloger", "Data Cataloger"),
    ("schema_admin", "Schema Administrator"),
]
for eid, label in dc_entities:
    dc_graph.add_node(KnowledgeNode(
        node_id=eid, node_type=NodeType.PERSONA,
        label=label, namespace="data_catalog",
    ))
dc_graph.add_edge(KnowledgeEdge(
    source_id="data_steward", target_id="cataloger",
    edge_type=EdgeType.ORCHESTRATES,
))
print(f"Data Catalog graph: {dc_graph.node_count} nodes, "
      f"{dc_graph.edge_count} edges")

Phase 4: Federated Knowledge Graph

Merge all graphs with cross-namespace edges:

# Create the federated knowledge graph
fkg = FederatedKnowledgeGraph(local_namespace="fcc")
fkg.register_local(fcc_graph)
fkg.register_remote("research_hub", rh_graph)
fkg.register_remote("data_catalog", dc_graph)

# Add cross-namespace edges based on entity resolution
for mapping in mappings:
    fkg.add_cross_edge(
        source_namespace=mapping.source_vocabulary,
        source_id=mapping.source_id,
        target_namespace=mapping.target_vocabulary,
        target_id=mapping.target_id,
        edge_type=EdgeType.FEDERATION_LINK,
    )

print(f"\nFederated graph namespaces: {fkg.namespaces()}")
print(f"Cross-namespace edges: {len(fkg.cross_project_edges())}")

# Merge into a single unified graph
unified = fkg.merge_federated()
print(f"Unified graph: {unified.node_count} nodes, "
      f"{unified.edge_count} edges")

# Get federation statistics
stats = fkg.stats()
for key, value in sorted(stats.items()):
    print(f"  {key}: {value}")
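
One way to think about the merge is that every node ID is qualified with its namespace, so identically named nodes from different projects cannot collide. The following standalone sketch models that with plain sets. It is an assumption about the general technique, not a description of how merge_federated() is implemented, and the FCC graph is reduced to the four mapped personas for brevity.

```python
def qualify(namespace: str, node_id: str) -> str:
    """Build a namespace-qualified node key, e.g. 'fcc:RC'."""
    return f"{namespace}:{node_id}"


# Each project graph: namespace -> (node ids, edges between local ids).
graphs = {
    "fcc": ({"RC", "DGS", "CIA", "FDS"}, set()),
    "research_hub": (
        {"researcher", "fair_manager", "pub_coordinator"},
        {("researcher", "fair_manager")},
    ),
    "data_catalog": (
        {"data_steward", "cataloger", "schema_admin"},
        {("data_steward", "cataloger")},
    ),
}

# Cross-namespace links taken from the Phase 2 vocabulary mappings.
cross_edges = [
    ("fcc", "RC", "research_hub", "researcher"),
    ("fcc", "DGS", "data_catalog", "data_steward"),
    ("fcc", "CIA", "data_catalog", "cataloger"),
    ("fcc", "FDS", "research_hub", "fair_manager"),
]

nodes, edges = set(), set()
for ns, (local_nodes, local_edges) in graphs.items():
    nodes |= {qualify(ns, n) for n in local_nodes}
    edges |= {(qualify(ns, s), qualify(ns, t)) for s, t in local_edges}

for src_ns, src_id, tgt_ns, tgt_id in cross_edges:
    src, tgt = qualify(src_ns, src_id), qualify(tgt_ns, tgt_id)
    # Skip links whose endpoints are missing from the merged node set.
    if src in nodes and tgt in nodes:
        edges.add((src, tgt))

print(f"Unified sketch: {len(nodes)} nodes, {len(edges)} edges")
```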

Phase 5: Schema Alignment

The Schema Design Expert validates the alignment:

sde_result = engine.step(SimulationMessage(
    sender="ILS", receiver="SDE",
    content=(
        "Validate the cross-project schema alignment:\n"
        f"- FCC: {fcc_graph.node_count} nodes, {fcc_graph.edge_count} edges\n"
        f"- Research Hub: {rh_graph.node_count} nodes, {rh_graph.edge_count} edges\n"
        f"- Data Catalog: {dc_graph.node_count} nodes, {dc_graph.edge_count} edges\n"
        f"- Cross edges: {len(fkg.cross_project_edges())}\n\n"
        "Check for: schema conflicts, unmapped entities, "
        "type mismatches, and missing relationships. "
        "Produce a validation report."
    ),
    phase="critique",
))
print(f"Schema validation: {len(sde_result.content)} chars")
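
Some of the checks requested from SDE can also be computed directly. As an illustration, the standalone sketch below finds remote entities that no mapping points to, using the entity lists from Phase 3 and the mapping targets from Phase 2. The variable names are hypothetical and the check is independent of the fcc package.

```python
# Entities per remote project (from the simulated graphs in Phase 3).
remote_entities = {
    "research_hub": {"researcher", "fair_manager", "pub_coordinator"},
    "data_catalog": {"data_steward", "cataloger", "schema_admin"},
}

# (target_vocabulary, target_id) pairs covered by the Phase 2 mappings.
mapped_targets = {
    ("research_hub", "researcher"),
    ("data_catalog", "data_steward"),
    ("data_catalog", "cataloger"),
    ("research_hub", "fair_manager"),
}

# For each namespace, list entities that no mapping targets.
unmapped = {
    ns: sorted(e for e in entities if (ns, e) not in mapped_targets)
    for ns, entities in remote_entities.items()
}

for ns, entities in unmapped.items():
    print(f"{ns}: unmapped = {entities}")
# research_hub: unmapped = ['pub_coordinator']
# data_catalog: unmapped = ['schema_admin']
```

Here pub_coordinator and schema_admin have no FCC counterpart, which is the kind of gap a validation report would be expected to flag.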

Phase 6: Change Tracking

Record the federation activities for audit:

tracker = federation.change_tracker

for mapping in mappings:
    tracker.track_change(ModelChange(
        entity_id=mapping.source_id,
        change_type="added",
        namespace="federation",
        new_value={
            "mapping": f"{mapping.source_vocabulary}:{mapping.source_id} -> "
                       f"{mapping.target_vocabulary}:{mapping.target_id}",
            "confidence": mapping.similarity_score,
        },
    ))

changeset = tracker.create_changeset(
    description="Initial cross-project federation mapping",
    source_namespace="fcc",
)
print(f"\nChange set: {len(changeset.changes)} changes")
print(f"  Added: {changeset.added_count}")

# Health check
health = federation.health_check()
print("\nFederation health:")
print(f"  Projects: {health['project_count']}")
print(f"  Mappings: {health['mapping_count']}")
print(f"  Change history: {health['change_history_count']}")
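
Conceptually, a change set is an append-only list of records summarized by change type. The sketch below shows that idea with the standard library only. The Change dataclass and its fields are hypothetical and merely mirror the arguments passed to ModelChange above, and the extra "modified" entry is invented to show a second category.

```python
from collections import Counter
from dataclasses import dataclass, field
from datetime import datetime, timezone


@dataclass(frozen=True)
class Change:
    """Hypothetical audit record (not fcc's ModelChange)."""
    entity_id: str
    change_type: str  # e.g. "added", "modified", "removed"
    namespace: str
    timestamp: datetime = field(
        default_factory=lambda: datetime.now(timezone.utc)
    )


# One "added" record per mapped FCC entity, as in the loop above.
log = [Change(eid, "added", "federation") for eid in ("RC", "DGS", "CIA", "FDS")]
# Illustrative follow-up edit, not part of the original scenario.
log.append(Change("RC", "modified", "federation"))

counts = Counter(change.change_type for change in log)
print(f"Total changes: {len(log)}")
print(f"Added: {counts['added']}, modified: {counts['modified']}")
```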

Exercises

  1. Export federated graph: Serialize the unified graph to JSON-LD and Turtle formats for sharing with external systems.
  2. Bidirectional mapping: Create reverse mappings (research_hub -> fcc) and verify bidirectional resolution.
  3. Conflict resolution: Add conflicting mappings and implement a resolution strategy based on confidence scores.
  4. RAG over federation: Build a RAG pipeline that indexes all three project graphs and answers cross-project questions.
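
As a starting point for exercises 2 and 3, the standalone sketch below reverses a mapping and resolves conflicts by keeping the highest-scoring candidate. It uses a hypothetical Mapping dataclass rather than VocabularyMapping, assumes similarity scores are symmetric, and includes an invented conflicting candidate (RC to pub_coordinator at 0.55) purely for illustration.

```python
from dataclasses import dataclass, replace


@dataclass(frozen=True)
class Mapping:
    """Hypothetical simplified record (not fcc's VocabularyMapping)."""
    source_vocabulary: str
    source_id: str
    target_vocabulary: str
    target_id: str
    similarity_score: float


def reverse(m: Mapping) -> Mapping:
    """Swap source and target; the score is assumed to be symmetric."""
    return replace(
        m,
        source_vocabulary=m.target_vocabulary, source_id=m.target_id,
        target_vocabulary=m.source_vocabulary, target_id=m.source_id,
    )


def resolve_conflicts(mappings):
    """Keep the highest-scoring mapping per (source, target vocabulary)."""
    best = {}
    for m in mappings:
        key = (m.source_vocabulary, m.source_id, m.target_vocabulary)
        if key not in best or m.similarity_score > best[key].similarity_score:
            best[key] = m
    return list(best.values())


candidates = [
    Mapping("fcc", "RC", "research_hub", "researcher", 0.87),
    # Hypothetical conflicting candidate for the same source entity.
    Mapping("fcc", "RC", "research_hub", "pub_coordinator", 0.55),
]
winners = resolve_conflicts(candidates)
print(winners[0].target_id)
print(reverse(winners[0]))
```

Ties keep the first candidate seen. A production strategy might also consider mapping provenance or require manual review below a threshold.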

Summary

In this scenario, you built a cross-project federation:

  • STC defined standards alignment scope across three projects
  • ILS mapped entities across vocabularies with confidence scores
  • Individual knowledge graphs were constructed for each project
  • A federated knowledge graph merged all graphs with cross-namespace edges
  • SDE validated schema alignment
  • Change tracking provided an audit trail for all federation activities

Next Steps