Benchmark Assessment Demo

Demo Overview

Duration: 35-45 minutes | Level: Advanced | Category: Assessment

Evaluate knowledge graph completeness and object model maturity using the FCC benchmark assessment system.

Prerequisites

What You'll Learn

  • How to build a complete knowledge graph with all 9 builders
  • How to run KG benchmark assessments and interpret reports
  • How to serialize reports for CI integration
  • How to evaluate object model maturity
  • How to compare assessments against baselines

Step 1: Build the Full Knowledge Graph

Construct a comprehensive knowledge graph using all 9 builders.

[Figure: Knowledge graph benchmark dashboard]

from fcc._resources import get_personas_dir
from fcc.personas.registry import PersonaRegistry
from fcc.knowledge.builders import build_full_fcc_graph

# Load the bundled persona definitions, then build the graph from them
registry = PersonaRegistry.from_yaml_directory(get_personas_dir())
graph = build_full_fcc_graph(persona_registry=registry)

print(f"Total nodes: {graph.node_count}")
print(f"Total edges: {graph.edge_count}")

stats = graph.stats()
for key, value in sorted(stats.items()):
    print(f"  {key}: {value}")

Expected output: Knowledge graph with hundreds of nodes across all 9 types.

Try it yourself

Pass additional registries to build_full_fcc_graph() for richer graphs.


Step 2: Run KG Benchmark Assessment

Evaluate the knowledge graph's completeness and connectivity.

from fcc.knowledge.benchmark import assess_knowledge_graph

report = assess_knowledge_graph(graph)
print(f"Overall score: {report.overall_score:.1%}")
print(f"Total nodes: {report.total_nodes}")
print(f"Total edges: {report.total_edges}")
print(f"Connectivity: {report.connectivity_score:.2f} edges/node")
print(f"Orphan nodes: {report.orphan_count}")

print("\nNode type coverage:")
for nt, count in sorted(report.node_counts_by_type.items()):
    status = "covered" if count > 0 else "MISSING"
    print(f"  {nt}: {count} ({status})")

print("\nEdge type coverage:")
for et, count in sorted(report.edge_counts_by_type.items()):
    status = "covered" if count > 0 else "MISSING"
    print(f"  {et}: {count} ({status})")

Expected output: Report with an overall score, per-type node and edge counts, and connectivity metrics.
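
To make the connectivity and orphan metrics concrete, here is a minimal standalone sketch that computes them for a toy graph. It assumes connectivity means edges divided by nodes (as the `edges/node` label suggests) and that an orphan is a node with no incident edge; `assess_knowledge_graph` may define either differently. The function `toy_graph_metrics`, the node and edge type names, and the sample data are all hypothetical and are not part of the FCC API.

```python
from collections import Counter


def toy_graph_metrics(nodes, edges):
    """Compute simple coverage metrics for a toy graph.

    nodes: dict mapping node id -> node type
    edges: list of (source_id, target_id, edge_type) tuples
    """
    # Collect every node that appears at either end of an edge
    connected = set()
    for source, target, _edge_type in edges:
        connected.add(source)
        connected.add(target)

    # Assumption: an orphan is a node with no incident edges
    orphans = sorted(n for n in nodes if n not in connected)

    return {
        "node_counts_by_type": dict(Counter(nodes.values())),
        "edge_counts_by_type": dict(Counter(e[2] for e in edges)),
        # Assumption: connectivity is the edges-per-node ratio
        "connectivity": len(edges) / len(nodes) if nodes else 0.0,
        "orphans": orphans,
    }


# Hypothetical sample data: four nodes, two edges, one unconnected node
nodes = {"p1": "persona", "p2": "persona", "s1": "skill", "d1": "document"}
edges = [("p1", "s1", "has_skill"), ("p2", "s1", "has_skill")]

metrics = toy_graph_metrics(nodes, edges)
print(f"Connectivity: {metrics['connectivity']:.2f} edges/node")
print(f"Orphan nodes: {len(metrics['orphans'])}")
```

A low edges-per-node ratio combined with a nonzero orphan count usually points to builders that create nodes without linking them to the rest of the graph.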


Step 3: Review Recommendations

Check the benchmark report's recommendations.

print("Recommendations:")
for rec in report.recommendations:
    print(f"  - {rec}")

Expected output: Actionable recommendations for improving graph coverage.

Troubleshooting

If you see "Add builders for missing node types", you may need to pass additional registries when building the graph.


Step 4: Serialize Benchmark Report

Save the report for CI integration and historical comparison.

from fcc.knowledge.benchmark import serialize_report, load_report

# Save as YAML
serialize_report(report, "kg_benchmark_report.yaml", fmt="yaml")

# Save as JSON
serialize_report(report, "kg_benchmark_report.json", fmt="json")

# Load back
loaded = load_report("kg_benchmark_report.yaml")
assert loaded.overall_score == report.overall_score
print("Report serialization verified.")

Expected output: Report files written and round-trip verified.
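
One way a CI job might consume the serialized JSON report is a small gate script that fails the build when the score drops below a threshold. The sketch below is standalone and makes one important assumption: that the JSON file has a top-level `overall_score` key. Inspect the file `serialize_report` actually writes before relying on that. The helper `check_score` and the threshold value are hypothetical.

```python
import json
import tempfile
from pathlib import Path


def check_score(report_path, min_score):
    """Return (passed, score) for a JSON benchmark report file."""
    data = json.loads(Path(report_path).read_text(encoding="utf-8"))
    # Assumption: the report stores its score under "overall_score"
    score = float(data["overall_score"])
    return score >= min_score, score


# Demonstrate with a stand-in report written to a temporary directory
with tempfile.TemporaryDirectory() as tmp:
    path = Path(tmp) / "kg_benchmark_report.json"
    path.write_text(json.dumps({"overall_score": 0.82}), encoding="utf-8")

    passed, score = check_score(path, min_score=0.75)
    verdict = "PASS" if passed else "FAIL"
    print(f"{verdict}: overall score {score:.1%} (minimum 75.0%)")
```

In a real pipeline, the result could drive the process exit code, for example `sys.exit(0 if passed else 1)`.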


Step 5: Render Benchmark Dashboard

Display the benchmark results as a terminal dashboard.

from fcc.dashboard.benchmark import render_kg_benchmark_dashboard

output = render_kg_benchmark_dashboard(report)
print(output)

Expected output: ASCII dashboard with coverage tables and score visualization.
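
For readers curious how a score visualization can be drawn in a terminal, the following sketch renders a score as a fixed-width text bar. It is an illustration only and does not reproduce the layout that `render_kg_benchmark_dashboard` produces; the function `score_bar` and the labels are hypothetical.

```python
def score_bar(score, width=20):
    """Render a score in [0, 1] as a text bar such as [#####.....] 50%."""
    # Clamp out-of-range values so the bar never overflows its width
    clamped = max(0.0, min(1.0, score))
    filled = round(clamped * width)
    return f"[{'#' * filled}{'.' * (width - filled)}] {clamped:.0%}"


# Hypothetical labels and scores
for label, value in [("Node coverage", 1.0), ("Edge coverage", 0.5)]:
    print(f"{label:<14} {score_bar(value, width=10)}")
```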


Step 6: Run Object Model Assessment

Evaluate the FCC object model maturity across 7 dimensions.

from fcc.objectmodel.assessment_runner import AssessmentRunner

runner = AssessmentRunner()
assessment = runner.assess_fcc_model()

print(f"Overall score: {assessment.aggregate_score:.1%}")
print(f"Evolution stage: {assessment.stage.value}")

print("\nDimension scores:")
for dim in assessment.dimensions:
    print(f"  {dim.dimension}: {dim.score:.1%}")

print("\nRecommendations:")
for rec in assessment.recommendations:
    print(f"  - {rec}")

Expected output: Assessment with per-dimension scores and evolution stage classification.
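
When an assessment has several dimensions, it can help to surface the lowest-scoring ones first. The sketch below does this with a stand-in data class that mirrors only the `dimension` and `score` attributes used in the loop above. `DimensionScore`, `weakest_dimensions`, and the dimension names are hypothetical and do not reflect the actual FCC object model.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class DimensionScore:
    """Stand-in for one entry of assessment.dimensions (hypothetical)."""
    dimension: str
    score: float


def weakest_dimensions(dimensions, limit=3):
    """Return up to `limit` dimensions, lowest score first."""
    return sorted(dimensions, key=lambda d: d.score)[:limit]


# Hypothetical dimension names and scores
sample = [
    DimensionScore("coverage", 0.90),
    DimensionScore("consistency", 0.55),
    DimensionScore("documentation", 0.70),
    DimensionScore("extensibility", 0.40),
]

weakest = weakest_dimensions(sample, limit=2)
print("Lowest-scoring dimensions:")
for dim in weakest:
    print(f"  {dim.dimension}: {dim.score:.1%}")
```

Against a real assessment, the same function should work on `assessment.dimensions` as long as each entry exposes a numeric `score`.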


Step 7: Compare Against Baseline

Load the stored baseline report and compare its overall score with the current one.

from fcc.knowledge.benchmark import load_report

# Load baseline
baseline = load_report("src/fcc/data/knowledge/kg_benchmark_baseline.yaml")

print(f"Baseline score: {baseline.overall_score:.1%}")
print(f"Current score:  {report.overall_score:.1%}")

delta = report.overall_score - baseline.overall_score
direction = "improved" if delta > 0 else "regressed" if delta < 0 else "unchanged"
print(f"Delta: {delta:+.1%} ({direction})")

Expected output: Comparison showing improvement or regression since baseline.
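
The exact comparison above can report a tiny floating-point difference as an improvement or a regression. A hedged alternative is to treat scores within a small tolerance as unchanged and to break the comparison down per type. This standalone sketch uses plain floats and dicts in place of report objects; `classify_delta`, `count_deltas`, the tolerance, and the sample counts are hypothetical.

```python
import math


def classify_delta(current, baseline, tolerance=1e-9):
    """Label a score change, ignoring differences within `tolerance`."""
    if math.isclose(current, baseline, rel_tol=0.0, abs_tol=tolerance):
        return "unchanged"
    return "improved" if current > baseline else "regressed"


def count_deltas(current_counts, baseline_counts):
    """Per-type count change; a type missing on one side counts as 0."""
    keys = sorted(set(current_counts) | set(baseline_counts))
    return {k: current_counts.get(k, 0) - baseline_counts.get(k, 0) for k in keys}


# 0.1 + 0.2 differs from 0.3 only by floating-point rounding
print(classify_delta(0.1 + 0.2, 0.3))
print(classify_delta(0.85, 0.80))

# Hypothetical per-type node counts for the current and baseline reports
changes = count_deltas(
    {"persona": 12, "skill": 30},
    {"persona": 10, "workflow": 4},
)
for node_type, change in changes.items():
    print(f"  {node_type}: {change:+d}")
```

With real reports, the dicts could come from `report.node_counts_by_type` and `baseline.node_counts_by_type`, assuming the loaded baseline exposes the same attribute.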


Summary

In this demo, you explored the FCC benchmark assessment system:

  • Built a complete knowledge graph with 9 builders
  • Ran KG benchmark assessment and reviewed coverage
  • Serialized reports for CI integration
  • Evaluated object model maturity across 7 dimensions
  • Compared the current assessment against the stored baseline

Next Steps