ADR-003: Knowledge Graph Export

Date: 2026-03-29 Status: Accepted

Context

FCC workflows produce structured artifacts with rich metadata: which persona created the artifact, in which session, at which workflow node, with which quality gate results. To enable provenance queries, compliance auditing, and cross-project knowledge sharing, we need a knowledge representation that captures entities and their relationships as a queryable graph.

We evaluated four approaches:

  1. rdflib (Python RDF library). A full-featured RDF library that supports OWL, RDFS, SPARQL, and multiple serialization formats.
  2. NetworkX. A general-purpose graph library. Not RDF-specific but widely used in Python.
  3. Neo4j / property graphs. A graph database with its own query language (Cypher).
  4. Pure-Python serializers. Custom, lightweight serializers that output standard RDF formats (Turtle, N-Triples, JSON-LD) without depending on rdflib.

Key requirements:

  • Must represent the FCC domain model faithfully (personas, workflows, artifacts, sessions, quality gates).
  • Must support both full ontology (OWL) and lightweight taxonomy (SKOS) representations.
  • Must serialize to standard formats for interoperability with other tools and projects.
  • Must not add heavy dependencies (rdflib requires C extensions on some platforms).
  • Must be testable with the existing mock infrastructure.

Decision

We will use OWL for the full ontology, SKOS for the lightweight taxonomy, and pure-Python serializers for knowledge graph export.

The implementation consists of:

  1. FCCOntology -- a Python class that builds a knowledge graph from FCC runtime data (personas, traces, sessions). Internally, it represents triples as a list of (subject, predicate, object) tuples.
  2. Two-layer design:
     • OWL layer: defines the class hierarchy (Persona, Workflow, Artifact, Session, QualityGate, Constitution), object properties (createdBy, reviewedBy, satisfies, governedBy), and data properties (hasPhase, hasCategory, hasScore, hasTimestamp).
     • SKOS layer: defines controlled vocabularies for persona categories, FCC phases, action types, tag hierarchies, and archetype vocabularies.
  3. Pure-Python serializers: custom serializers for the Turtle (.ttl), N-Triples (.nt), and JSON-LD (.jsonld) formats. These implement the subset of the RDF serialization specifications needed for FCC's ontology, without depending on rdflib.
  4. Query API: a Python interface that supports subject/predicate/object pattern matching, provenance chain traversal, and coverage analysis.
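The triple-list representation and pattern-matching query can be sketched as follows. This is a minimal illustration, not the actual FCC interface: the method names, the namespace IRI, and the example identifiers are all assumptions.

```python
FCC = "https://fcc.example.org/ontology#"  # illustrative namespace, not the real IRI


class FCCOntology:
    """Sketch: triples held as a plain list of (subject, predicate, object) tuples."""

    def __init__(self):
        self.triples = []

    def add(self, s, p, o):
        self.triples.append((s, p, o))

    def query(self, s=None, p=None, o=None):
        """Return triples matching the pattern; None acts as a wildcard."""
        return [t for t in self.triples
                if (s is None or t[0] == s)
                and (p is None or t[1] == p)
                and (o is None or t[2] == o)]


onto = FCCOntology()
onto.add(FCC + "artifact-42", FCC + "createdBy", FCC + "persona-architect")
onto.add(FCC + "artifact-42", FCC + "governedBy", FCC + "constitution-1")

# All facts about artifact-42:
assert len(onto.query(s=FCC + "artifact-42")) == 2
# Who created it:
assert onto.query(p=FCC + "createdBy")[0][2] == FCC + "persona-architect"
```

Because every query is a linear scan with wildcards, the same primitive composes into provenance traversal and coverage analysis without a dedicated query engine.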

Consequences

Positive

  • Standard formats. Turtle, N-Triples, and JSON-LD are W3C-standard formats that any RDF tool can import. The knowledge graph is interoperable with Protege, Apache Jena, Stardog, and other RDF ecosystems.
  • Two-layer flexibility. OWL handles complex reasoning (compliance, provenance inference). SKOS handles lightweight classification (persona browsing, tag navigation). Users choose the layer appropriate for their query.
  • Minimal dependencies. Pure-Python serializers avoid rdflib's C extension compilation issues. The knowledge graph module has zero additional dependencies beyond Python's standard library.
  • Testability. The mock infrastructure can generate test triples without external services. All serialization formats can be round-tripped (serialize then parse) for validation.
  • Federation-ready. Standard RDF formats with namespaced IRIs are the natural foundation for the federated knowledge graphs in ADR-005.
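The round-trip property mentioned above can be illustrated with N-Triples, the simplest of the three formats. This sketch assumes all terms are IRIs (literals and blank nodes are outside the supported subset); the function names are hypothetical.

```python
def to_ntriples(triples):
    """Serialize (s, p, o) IRI triples as N-Triples lines: '<s> <p> <o> .'"""
    return "\n".join(f"<{s}> <{p}> <{o}> ." for s, p, o in triples) + "\n"


def from_ntriples(text):
    """Parse IRI-only N-Triples back into (s, p, o) tuples."""
    triples = []
    for line in text.strip().splitlines():
        s, p, o = line.rstrip(" .").split(" ", 2)
        triples.append(tuple(term.strip("<>") for term in (s, p, o)))
    return triples


facts = [("https://fcc.example.org/artifact-1",
          "https://fcc.example.org/createdBy",
          "https://fcc.example.org/persona-architect")]

# Round-trip validation: serialize then parse must reproduce the input.
assert from_ntriples(to_ntriples(facts)) == facts
```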

Negative

  • Limited serializer coverage. The pure-Python serializers handle the subset of RDF needed for FCC, not the full specification. RDF constructs outside that subset (blank nodes, reification, named graphs) are not supported.
  • No SPARQL engine. The query API uses pattern matching, not SPARQL. Users who need full SPARQL must export the graph and load it into an external triple store.
  • Maintenance burden. Maintaining custom serializers means fixing format compliance issues ourselves rather than relying on rdflib's well-tested implementations.

Mitigations

  • The limited serializer coverage is documented, and users who need full RDF support can export to Turtle and load into rdflib or an external triple store.
  • The query API covers the most common use cases (provenance chain, coverage analysis, compliance checking). SPARQL is available via export.
  • Serializer compliance is verified by round-trip tests and validation against W3C test suites for the supported subset.
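Provenance chain traversal, one of the common use cases the query API covers, reduces to following a predicate link until it runs out. A hedged sketch over plain (s, p, o) tuples; the predicate name and identifiers are assumptions, not the actual FCC property IRIs.

```python
def provenance_chain(triples, start, predicate="derivedFrom"):
    """Follow `predicate` links from `start` until the chain ends."""
    index = {(s, p): o for s, p, o in triples}  # assumes one link per subject
    chain = [start]
    while (chain[-1], predicate) in index:
        chain.append(index[(chain[-1], predicate)])
    return chain


facts = [
    ("artifact-3", "derivedFrom", "artifact-2"),
    ("artifact-2", "derivedFrom", "artifact-1"),
    ("artifact-1", "createdBy", "persona-architect"),
]

assert provenance_chain(facts, "artifact-3") == [
    "artifact-3", "artifact-2", "artifact-1",
]
```

Anything beyond this, such as joins or optional patterns, is exactly where exporting to Turtle and loading an external SPARQL-capable triple store takes over.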