ADR-003: Knowledge Graph Export
Date: 2026-03-29
Status: Accepted
Context
FCC workflows produce structured artifacts with rich metadata: which persona created the artifact, in which session, at which workflow node, with which quality gate results. To enable provenance queries, compliance auditing, and cross-project knowledge sharing, we need a knowledge representation that captures entities and their relationships as a queryable graph.
We evaluated four approaches:
- rdflib (Python RDF library). A full-featured RDF library that supports OWL, RDFS, SPARQL, and multiple serialization formats.
- NetworkX. A general-purpose graph library. Not RDF-specific but widely used in Python.
- Neo4j / property graphs. A graph database with its own query language (Cypher).
- Pure-Python serializers. Custom, lightweight serializers that output standard RDF formats (Turtle, N-Triples, JSON-LD) without depending on rdflib.
Key requirements:
- Must represent the FCC domain model faithfully (personas, workflows, artifacts, sessions, quality gates).
- Must support both full ontology (OWL) and lightweight taxonomy (SKOS) representations.
- Must serialize to standard formats for interoperability with other tools and projects.
- Must not add heavy dependencies (rdflib requires C extensions on some platforms).
- Must be testable with the existing mock infrastructure.
Decision
We will use OWL for full ontology, SKOS for taxonomy, and pure-Python serializers for knowledge graph export.
The implementation consists of:
- FCCOntology: a Python class that builds a knowledge graph from FCC runtime data (personas, traces, sessions). Internally, it represents triples as a list of (subject, predicate, object) tuples.
- Two-layer design:
- OWL layer: Defines the class hierarchy (Persona, Workflow, Artifact, Session, QualityGate, Constitution), object properties (createdBy, reviewedBy, satisfies, governedBy), and data properties (hasPhase, hasCategory, hasScore, hasTimestamp).
- SKOS layer: Defines controlled vocabularies for persona categories, FCC phases, action types, tag hierarchies, and archetype vocabularies.
- Pure-Python serializers: Custom serializers for Turtle (.ttl), N-Triples (.nt), and JSON-LD (.jsonld) formats. These serializers implement the subset of the RDF serialization specifications needed for FCC's ontology, without depending on rdflib.
- Query API: A Python query interface that supports subject/predicate/object pattern matching, provenance chain traversal, and coverage analysis.
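The core of this design, a tuple-based triple store with subject/predicate/object pattern matching, can be sketched in a few lines. The names below (`TripleStore`, `add`, `match`, the `example.org` namespace) are illustrative assumptions, not the actual FCCOntology API:

```python
# Illustrative sketch of a tuple-based triple store with S/P/O pattern
# matching, assuming a hypothetical namespace and class name.

FCC = "https://example.org/fcc#"  # assumed namespace IRI, not the real one

class TripleStore:
    def __init__(self):
        # Triples stored as a plain list of (subject, predicate, object) tuples
        self.triples = []

    def add(self, s, p, o):
        self.triples.append((s, p, o))

    def match(self, s=None, p=None, o=None):
        """Return triples matching the pattern; None acts as a wildcard."""
        return [t for t in self.triples
                if (s is None or t[0] == s)
                and (p is None or t[1] == p)
                and (o is None or t[2] == o)]

store = TripleStore()
store.add(FCC + "artifact-42", FCC + "createdBy", FCC + "persona-architect")
store.add(FCC + "artifact-42", FCC + "governedBy", FCC + "constitution-1")

# Provenance query: who created artifact-42?
creators = store.match(s=FCC + "artifact-42", p=FCC + "createdBy")
```

Provenance chain traversal falls out of the same primitive: repeatedly `match` on the object of the previous result until no more triples are found.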
Consequences
Positive
- Standard formats. Turtle, N-Triples, and JSON-LD are W3C-standard formats that any RDF tool can import. The knowledge graph is interoperable with Protege, Apache Jena, Stardog, and other RDF ecosystems.
- Two-layer flexibility. OWL handles complex reasoning (compliance, provenance inference). SKOS handles lightweight classification (persona browsing, tag navigation). Users choose the layer appropriate for their query.
- Minimal dependencies. Pure-Python serializers avoid rdflib's C extension compilation issues. The knowledge graph module has zero additional dependencies beyond Python's standard library.
- Testability. The mock infrastructure can generate test triples without external services. All serialization formats can be round-tripped (serialize then parse) for validation.
- Federation-ready. Standard RDF formats with namespaced IRIs are the natural foundation for the federated knowledge graphs in ADR-005.
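The round-trip property is easy to check for the simplest format, N-Triples. The sketch below assumes IRI-only triples (no blank nodes, literals, or datatypes), matching the limited subset this ADR commits to; the function names are illustrative:

```python
# Round-trip sketch for a pure-Python N-Triples serializer, restricted
# to IRI-only triples as described above. Function names are hypothetical.

def to_ntriples(triples):
    """Serialize (s, p, o) IRI tuples to N-Triples text."""
    return "".join(f"<{s}> <{p}> <{o}> .\n" for s, p, o in triples)

def from_ntriples(text):
    """Parse the same IRI-only subset back into (s, p, o) tuples."""
    triples = []
    for line in text.splitlines():
        line = line.strip()
        if not line or line.startswith("#"):
            continue  # skip blanks and comments
        s, p, o = line.rstrip(" .").split(" ", 2)
        triples.append(tuple(part.strip("<>") for part in (s, p, o)))
    return triples

original = [("https://example.org/fcc#artifact-42",
             "https://example.org/fcc#createdBy",
             "https://example.org/fcc#persona-architect")]
assert from_ntriples(to_ntriples(original)) == original
```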
Negative
- Limited serializer coverage. The pure-Python serializers handle the subset of RDF needed for FCC, not the full specification. Constructs outside that subset (blank nodes, reification, named graphs) are not supported.
- No SPARQL engine. The query API uses pattern matching, not SPARQL. Users who need full SPARQL must export the graph and load it into an external triple store.
- Maintenance burden. Maintaining custom serializers means fixing format compliance issues ourselves rather than relying on rdflib's well-tested implementations.
Mitigations
- The limited serializer coverage is documented, and users who need full RDF support can export to Turtle and load into rdflib or an external triple store.
- The query API covers the most common use cases (provenance chain, coverage analysis, compliance checking). SPARQL is available via export.
- Serializer compliance is verified by round-trip tests and validation against W3C test suites for the supported subset.