Benchmark Assessment Demo

Demo Overview

Duration: 35-45 minutes | Level: Advanced | Category: Assessment

Evaluate knowledge graph completeness and object model maturity using the FCC benchmark assessment system.

Prerequisites

FCC framework installed (run pip install -e . from the repository root)
Completed the Knowledge Graph Demo

What You'll Learn

How to build a complete knowledge graph with all 9 builders
How to run KG benchmark assessments and interpret reports
How to serialize reports for CI integration
How to evaluate object model maturity
How to compare assessments against baselines

Step 1: Build the Full Knowledge Graph

Construct a comprehensive knowledge graph using all 9 builders.

Figure: Knowledge graph benchmark dashboard

from fcc._resources import get_personas_dir
from fcc.personas.registry import PersonaRegistry
from fcc.knowledge.builders import build_full_fcc_graph

registry = PersonaRegistry.from_yaml_directory(get_personas_dir())
graph = build_full_fcc_graph(persona_registry=registry)

print(f"Total nodes: {graph.node_count}")
print(f"Total edges: {graph.edge_count}")

stats = graph.stats()
for key, value in sorted(stats.items()):
    print(f"  {key}: {value}")

Expected output: Knowledge graph with hundreds of nodes across all 9 types.

Try it yourself

Pass additional registries to build_full_fcc_graph() for richer graphs.

Step 2: Run KG Benchmark Assessment

Evaluate the knowledge graph's completeness and connectivity.

from fcc.knowledge.benchmark import assess_knowledge_graph

report = assess_knowledge_graph(graph)
print(f"Overall score: {report.overall_score:.1%}")
print(f"Total nodes: {report.total_nodes}")
print(f"Total edges: {report.total_edges}")
print(f"Connectivity: {report.connectivity_score:.2f} edges/node")
print(f"Orphan nodes: {report.orphan_count}")

print("\nNode type coverage:")
for nt, count in sorted(report.node_counts_by_type.items()):
    status = "covered" if count > 0 else "MISSING"
    print(f"  {nt}: {count} ({status})")

print("\nEdge type coverage:")
for et, count in sorted(report.edge_counts_by_type.items()):
    status = "covered" if count > 0 else "MISSING"
    print(f"  {et}: {count} ({status})")

Expected output: Report with an overall score, per-type node and edge counts, and connectivity metrics.
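To make the reported metrics more concrete, here is a minimal sketch of how type coverage, edges per node, and orphan detection could be computed on a toy graph. It does not use the FCC API: the summarize_graph helper, the node type names, and the plain dict/list graph representation are all hypothetical, and the real scoring in assess_knowledge_graph may differ.

```python
from collections import Counter


def summarize_graph(nodes, edges, expected_types):
    """Summarize a toy graph.

    nodes: dict mapping node id -> node type (hypothetical representation)
    edges: list of (source_id, target_id) pairs
    expected_types: node types the graph is expected to contain
    """
    counts = Counter(nodes.values())
    counts_by_type = {t: counts.get(t, 0) for t in expected_types}
    covered = sum(1 for c in counts_by_type.values() if c > 0)

    # A node is an orphan if it appears in no edge at all.
    connected = {node_id for edge in edges for node_id in edge}
    orphans = sorted(n for n in nodes if n not in connected)

    return {
        "type_coverage": covered / len(expected_types) if expected_types else 0.0,
        "edges_per_node": len(edges) / len(nodes) if nodes else 0.0,
        "orphans": orphans,
        "counts_by_type": counts_by_type,
    }


# Illustrative data only; these type names are not taken from FCC.
nodes = {"p1": "persona", "p2": "persona", "s1": "skill", "w1": "workflow"}
edges = [("p1", "s1"), ("p2", "s1")]
summary = summarize_graph(nodes, edges, ["persona", "skill", "workflow", "tool"])
print(summary)
```

In this toy graph the "tool" type has no nodes, so it would be reported as MISSING, and "w1" is an orphan because no edge touches it.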

Step 3: Review Recommendations

Check the benchmark report's recommendations.

print("Recommendations:")
for rec in report.recommendations:
    print(f"  - {rec}")

Expected output: Actionable recommendations for improving graph coverage.

Troubleshooting

If you see "Add builders for missing node types", you may need to pass additional registries when building the graph.

Step 4: Serialize Benchmark Report

Save the report for CI integration and historical comparison.

from fcc.knowledge.benchmark import serialize_report, load_report

# Save as YAML
serialize_report(report, "kg_benchmark_report.yaml", fmt="yaml")

# Save as JSON
serialize_report(report, "kg_benchmark_report.json", fmt="json")

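# Load back and verify the round trip; the small tolerance guards
# against float formatting differences in the serialized file
loaded = load_report("kg_benchmark_report.yaml")
assert abs(loaded.overall_score - report.overall_score) < 1e-9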
print("Report serialization verified.")

Expected output: Report files written and round-trip verified.
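For CI integration, one possible approach is a small gate script that reads the JSON report and fails the build when the score drops below a threshold. The sketch below uses only the standard library. It assumes the JSON file exposes a top-level "overall_score" key, which is a guess based on the report attribute name and should be checked against a real report; check_report and the 0.75 threshold are hypothetical.

```python
import json
import tempfile
from pathlib import Path


def check_report(path, min_score):
    """Return (ok, message) for a JSON benchmark report file."""
    data = json.loads(Path(path).read_text(encoding="utf-8"))
    score = float(data["overall_score"])  # assumed key name
    ok = score >= min_score
    return ok, f"overall_score={score:.1%} (minimum {min_score:.1%})"


# Demonstrate against a stand-in report written to a temporary directory.
with tempfile.TemporaryDirectory() as tmp:
    report_path = Path(tmp) / "kg_benchmark_report.json"
    report_path.write_text(json.dumps({"overall_score": 0.82}), encoding="utf-8")
    ok, message = check_report(report_path, min_score=0.75)
    print("PASS" if ok else "FAIL", message)
```

In a real pipeline, the script would point at the file written by serialize_report and exit with a non-zero status (for example via sys.exit(1)) when ok is False.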

Step 5: Render Benchmark Dashboard

Display the benchmark results as a terminal dashboard.

from fcc.dashboard.benchmark import render_kg_benchmark_dashboard

output = render_kg_benchmark_dashboard(report)
print(output)

Expected output: ASCII dashboard with coverage tables and score visualization.

Step 6: Run Object Model Assessment

Evaluate the FCC object model maturity across 7 dimensions.

from fcc.objectmodel.assessment_runner import AssessmentRunner

runner = AssessmentRunner()
assessment = runner.assess_fcc_model()

print(f"Overall score: {assessment.aggregate_score:.1%}")
print(f"Evolution stage: {assessment.stage.value}")

print("\nDimension scores:")
for dim in assessment.dimensions:
    print(f"  {dim.dimension}: {dim.score:.1%}")

print("\nRecommendations:")
for rec in assessment.recommendations:
    print(f"  - {rec}")

Expected output: Assessment with per-dimension scores and evolution stage classification.
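Once per-dimension scores are available, it can help to surface the weakest dimensions first. The sketch below works on a plain dict of scores rather than the assessment object. The weakest_dimensions helper, the 0.6 threshold, and the dimension names are all hypothetical and chosen only for illustration.

```python
def weakest_dimensions(scores, threshold=0.6):
    """Return (name, score) pairs below the threshold, lowest score first."""
    below = [(name, score) for name, score in scores.items() if score < threshold]
    return sorted(below, key=lambda item: item[1])


# Illustrative scores; real dimension names come from the assessment itself.
scores = {
    "naming": 0.9,
    "documentation": 0.45,
    "validation": 0.55,
    "versioning": 0.7,
}

for name, score in weakest_dimensions(scores):
    print(f"{name}: {score:.1%}")
```

With an actual assessment, the dict could be built from the dimension and score attributes printed in the loop above, for example {dim.dimension: dim.score for dim in assessment.dimensions}.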

Step 7: Compare Against Baseline

Load the stored baseline report and compare its overall score with the current report from Step 2.

from fcc.knowledge.benchmark import load_report

# Load baseline (relative path, so run this from the repository root)
baseline = load_report("src/fcc/data/knowledge/kg_benchmark_baseline.yaml")

print(f"Baseline score: {baseline.overall_score:.1%}")
print(f"Current score:  {report.overall_score:.1%}")

delta = report.overall_score - baseline.overall_score
direction = "improved" if delta > 0 else "regressed" if delta < 0 else "unchanged"
print(f"Delta: {delta:+.1%} ({direction})")

Expected output: Comparison showing improvement or regression since baseline.
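A single score delta can hide which parts of the graph changed, and very small deltas may just be noise. As a sketch, the comparison can be extended with a tolerance band and a per-type count diff. The helpers below operate on plain numbers and dicts; classify, compare_counts, the 0.005 tolerance, and the type names are hypothetical and not part of FCC.

```python
def classify(delta, tolerance=0.005):
    """Label a score delta, treating changes within the tolerance as unchanged."""
    if delta > tolerance:
        return "improved"
    if delta < -tolerance:
        return "regressed"
    return "unchanged"


def compare_counts(baseline, current):
    """Return {type: current - baseline} for every type whose count changed."""
    types = set(baseline) | set(current)
    deltas = {t: current.get(t, 0) - baseline.get(t, 0) for t in types}
    return {t: d for t, d in sorted(deltas.items()) if d != 0}


# Illustrative values only.
baseline_counts = {"persona": 10, "skill": 4}
current_counts = {"persona": 12, "skill": 4, "workflow": 3}

print(classify(0.81 - 0.80))
print(classify(0.802 - 0.80))
print(compare_counts(baseline_counts, current_counts))
```

With real reports, the count dicts could come from node_counts_by_type on the baseline and current reports, assuming the baseline file stores that field.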

Summary

In this demo, you explored the FCC benchmark assessment system:

Built a complete knowledge graph with 9 builders
Ran KG benchmark assessment and reviewed coverage
Serialized reports for CI integration
Evaluated object model maturity across 7 dimensions
Compared the current assessment against a stored baseline

Next Steps

Explore the Benchmark Interpretation Guide
Read about Knowledge Graph Export Recipes
Try the Compliance Dashboard Demo