Open Science Demo

This demo walks through the FCC Open Science subsystem -- a FAIR-compliant quality gate and template registry that supports reproducibility, data management, and compliance in agent-driven research workflows.


Table of Contents

  1. Introduction
  2. FAIR Principles to FCC Gates Mapping
  3. OpenScienceRegistry API
  4. Checklist Generation
  5. Compliance Scoring
  6. Reproducibility Workflow
  7. OSC Persona Audit

Introduction

The FCC Open Science module (src/fcc/governance/open_science.py) implements the FAIR principles (Findable, Accessible, Interoperable, Reusable) as executable quality gates. It provides:

  • OpenScienceTemplate -- Compliance checklists for data management, reproducibility, and licensing.
  • OpenScienceGate -- Quality gates with mandatory or preferred severity levels and enumerated criteria.
  • OpenScienceRegistry -- A registry that loads templates and gates from YAML, supports filtering, generates markdown checklists, and evaluates FAIR compliance scores.

The data files live in src/fcc/data/governance/:

  • open_science_templates.yaml -- 12 compliance templates
  • open_science_gates.yaml -- 6 quality gates

FAIR Principles to FCC Gates Mapping

Each FAIR principle maps to a specific gate in the FCC open science gate registry. Two additional gates cover code and data reproducibility.

| Principle | FCC Gate ID | Gate Name | Category | Severity |
|---|---|---|---|---|
| Findable | FAIR-FIND | FAIR Findability Gate | fair | mandatory |
| Accessible | FAIR-ACCESS | FAIR Accessibility Gate | fair | mandatory |
| Interoperable | FAIR-INTEROP | FAIR Interoperability Gate | fair | mandatory |
| Reusable | FAIR-REUSE | FAIR Reusability Gate | fair | mandatory |
| Reproducibility (code) | REPRO-CODE | Code Reproducibility Gate | reproducibility | mandatory |
| Reproducibility (data) | REPRO-DATA | Data Reproducibility Gate | reproducibility | preferred |

Gate Criteria

Each gate has specific criteria that must be satisfied. For example, the FAIR Findability Gate requires:

  1. Persistent identifier assigned (DOI, URI, etc.)
  2. Metadata registered in searchable resource
  3. Identifier resolves to metadata

The Code Reproducibility Gate requires:

  1. Version-controlled source code
  2. Dependency specification (requirements.txt, lock file)
  3. Automated test suite
  4. Build/run instructions

The Data Reproducibility Gate (preferred, not mandatory) requires:

  1. Raw data archived
  2. Processing pipeline scripted
  3. Intermediate results cached and versioned
  4. Random seeds documented

OpenScienceRegistry API

Loading the Registry

The registry loads from the FCC data directory with a single call:

from fcc.governance.open_science import OpenScienceRegistry

registry = OpenScienceRegistry.from_data_dir()

This calls from_yaml() internally, pointing to the governance data files:

# Equivalent explicit loading:
from fcc._resources import get_governance_dir

gov_dir = get_governance_dir()
registry = OpenScienceRegistry.from_yaml(
    templates_path=gov_dir / "open_science_templates.yaml",
    gates_path=gov_dir / "open_science_gates.yaml",
)

Listing Templates

templates = registry.all_templates()
print(f"Total templates: {len(templates)}")
# Total templates: 12

for t in templates:
    print(f"  [{t.id}] {t.name} (category: {t.category})")

Listing Gates

gates = registry.all_gates()
print(f"Total gates: {len(gates)}")
# Total gates: 6

for g in gates:
    print(f"  [{g.id}] {g.name} (severity: {g.severity})")

Filtering by Category

# Templates in a specific category
dm_templates = registry.templates_by_category("data_management")
print(f"Data management templates: {len(dm_templates)}")

# Gates in a specific category
fair_gates = registry.gates_by_category("fair")
print(f"FAIR gates: {len(fair_gates)}")
# 4 gates (FIND, ACCESS, INTEROP, REUSE)

repro_gates = registry.gates_by_category("reproducibility")
print(f"Reproducibility gates: {len(repro_gates)}")
# 2 gates (CODE, DATA)

Filtering by Phase

# Templates applicable to a specific FCC phase
find_templates = registry.templates_for_phase("find")
print(f"Templates for Find phase: {len(find_templates)}")

Looking Up by ID

template = registry.get_template("OPEN-SCI-001")
if template:
    print(f"Template: {template.name}")
    print(f"Description: {template.description}")
    print(f"Checklist items: {len(template.checklist)}")

gate = registry.get_gate("FAIR-FIND")
if gate:
    print(f"Gate: {gate.name}")
    print(f"Criteria: {gate.criteria}")

Mandatory Gates Only

mandatory = registry.mandatory_gates()
print(f"Mandatory gates: {len(mandatory)}")
for g in mandatory:
    print(f"  [{g.id}] {g.name}")
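
Going by the mapping table earlier in this demo, five of the six gates carry mandatory severity. The sketch below reproduces that filter on plain tuples so the expected result can be checked without the package installed. `GATE_ROWS` and `ids_with_severity` are hypothetical names for illustration; they stand in for the registry and do not reflect how `mandatory_gates()` is implemented.

```python
# (gate_id, category, severity) rows copied from the mapping table.
GATE_ROWS = [
    ("FAIR-FIND", "fair", "mandatory"),
    ("FAIR-ACCESS", "fair", "mandatory"),
    ("FAIR-INTEROP", "fair", "mandatory"),
    ("FAIR-REUSE", "fair", "mandatory"),
    ("REPRO-CODE", "reproducibility", "mandatory"),
    ("REPRO-DATA", "reproducibility", "preferred"),
]


def ids_with_severity(rows, severity):
    """Return the IDs of all rows whose severity matches, in table order."""
    return [gate_id for gate_id, _category, sev in rows if sev == severity]


mandatory_ids = ids_with_severity(GATE_ROWS, "mandatory")
print(f"Mandatory gates: {len(mandatory_ids)}")
for gate_id in mandatory_ids:
    print(f"  [{gate_id}]")
```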

Counts and Containment

print(f"Template count: {registry.template_count()}")
print(f"Gate count: {registry.gate_count()}")
print(f"Total entries: {len(registry)}")

print(f"FAIR-FIND in registry: {'FAIR-FIND' in registry}")
# True
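
Support for `len()` and `in` suggests the registry implements Python's container protocol. The sketch below shows one way such behavior could be written. `RegistrySketch` is a hypothetical miniature, and two points are assumptions rather than documented facts: that "total entries" means templates plus gates, and that membership checks both template and gate IDs.

```python
class RegistrySketch:
    """Hypothetical miniature registry; not the real OpenScienceRegistry."""

    def __init__(self, templates: dict, gates: dict) -> None:
        self._templates = dict(templates)
        self._gates = dict(gates)

    def __len__(self) -> int:
        # Assumption: total entries = templates + gates.
        return len(self._templates) + len(self._gates)

    def __contains__(self, entry_id: object) -> bool:
        # Assumption: an ID matches if it names a template or a gate.
        return entry_id in self._templates or entry_id in self._gates


sketch = RegistrySketch(
    templates={"OPEN-SCI-001": "template placeholder"},
    gates={"FAIR-FIND": "gate placeholder", "REPRO-CODE": "gate placeholder"},
)

print(len(sketch))              # 3
print("FAIR-FIND" in sketch)    # True
print("NONEXISTENT" in sketch)  # False
```

If the same assumption holds for the real registry, the bundled data would yield 12 + 6 = 18 total entries.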

Checklist Generation

The registry can generate markdown checklists from any template.

Basic Checklist

checklist = registry.generate_checklist("OPEN-SCI-001")
print(checklist)

Output:

# Template Name

**Category:** data_management
**Applicable Phases:** find, create

Description of the template.

## Checklist

- [ ] First checklist item
- [ ] Second checklist item
- [ ] Third checklist item

Checklist with Project Name

checklist = registry.generate_checklist(
    "OPEN-SCI-001",
    project_name="FCC Phase 12",
)
print(checklist)

Output includes the project name in the header:

# Template Name -- FCC Phase 12
...

Handling Missing Templates

checklist = registry.generate_checklist("NONEXISTENT")
print(repr(checklist))
# ''  (empty string)
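
Because a missing template yields an empty string rather than an exception, a caller that writes checklists to disk may want to fail loudly instead of producing empty files. The wrapper below is a minimal sketch of that guard. `checklist_or_raise` and `fake_generate` are hypothetical names; the stub only imitates the documented empty-string behavior so the example runs on its own.

```python
from typing import Callable


def checklist_or_raise(generate: Callable[[str], str], template_id: str) -> str:
    """Call a checklist generator and raise if it returns an empty string."""
    checklist = generate(template_id)
    if not checklist:
        raise KeyError(f"No template with ID {template_id!r}")
    return checklist


def fake_generate(template_id: str) -> str:
    """Stub that mimics generate_checklist(): '' for unknown template IDs."""
    return "# Template Name\n" if template_id == "OPEN-SCI-001" else ""


print(checklist_or_raise(fake_generate, "OPEN-SCI-001"))

try:
    checklist_or_raise(fake_generate, "NONEXISTENT")
except KeyError as exc:
    print(f"Missing template: {exc}")
```

With the real registry, the same wrapper would be called as `checklist_or_raise(registry.generate_checklist, "OPEN-SCI-001")`.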

Using Checklists in Workflows

Generate checklists as part of a persona-driven workflow:

from pathlib import Path

# OSC persona generates checklists for all applicable templates
output_dir = Path("docs/checklists")
output_dir.mkdir(parents=True, exist_ok=True)  # create the target directory if needed

for template in registry.all_templates():
    checklist = registry.generate_checklist(
        template.id,
        project_name="My Research Project",
    )
    # Write to docs/ or attach to workflow deliverable
    (output_dir / f"{template.id}.md").write_text(checklist, encoding="utf-8")

Compliance Scoring

The evaluate_fair_compliance() method scores how well a project meets FAIR principles.

Basic Evaluation

Pass a dictionary mapping gate IDs to pass/fail booleans:

results = {
    "FAIR-FIND": True,
    "FAIR-ACCESS": True,
    "FAIR-INTEROP": False,
    "FAIR-REUSE": True,
    "REPRO-CODE": True,
    "REPRO-DATA": False,
}

evaluation = registry.evaluate_fair_compliance(results)

print(f"Score: {evaluation['score']:.2f}")
print(f"Compliance level: {evaluation['compliance_level']}")
print(f"Passed gates: {evaluation['passed_gates']}")
print(f"Failed gates: {evaluation['failed_gates']}")
print(f"Total FAIR gates: {evaluation['total_fair_gates']}")

Output:

Score: 0.75
Compliance level: partial
Passed gates: ['FAIR-ACCESS', 'FAIR-FIND', 'FAIR-REUSE', 'REPRO-CODE']
Failed gates: ['FAIR-INTEROP', 'REPRO-DATA']
Total FAIR gates: 4

Compliance Levels

| Level | Condition |
|---|---|
| full | Score = 1.0 (all FAIR gates passed) |
| partial | 0.0 < Score < 1.0 (some FAIR gates passed) |
| non-compliant | Score = 0.0 (no FAIR gates passed) |
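
The thresholds above translate directly into a small function. This is a sketch of the mapping as documented here; `compliance_level` is a hypothetical standalone helper, and the registry may compute the level differently internally.

```python
def compliance_level(score: float) -> str:
    """Map a FAIR score in [0.0, 1.0] to the level names used in this demo."""
    if score >= 1.0:
        return "full"
    if score > 0.0:
        return "partial"
    return "non-compliant"


for score in (1.0, 0.75, 0.0):
    print(f"{score:.2f} -> {compliance_level(score)}")
```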

Score Calculation

The score is calculated from FAIR-category gates only (not reproducibility):

score = (number of passed FAIR gates) / (number of evaluated FAIR gates)

In the example above, 3 of the 4 FAIR gates passed, so the score is 3 / 4 = 0.75.

Non-FAIR gates (REPRO-CODE, REPRO-DATA) still appear in passed_gates and failed_gates, but they do not affect the FAIR score.
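
The calculation can be reproduced by hand with a few lines of Python. The sketch below assumes a plain mapping from gate ID to category; `fair_score` and `CATEGORIES` are hypothetical names, and returning 0.0 when no FAIR gate was evaluated is an assumption, since this demo does not say what the registry does in that case.

```python
from typing import Dict

# Gate categories copied from the mapping table in this demo.
CATEGORIES = {
    "FAIR-FIND": "fair",
    "FAIR-ACCESS": "fair",
    "FAIR-INTEROP": "fair",
    "FAIR-REUSE": "fair",
    "REPRO-CODE": "reproducibility",
    "REPRO-DATA": "reproducibility",
}


def fair_score(results: Dict[str, bool], categories: Dict[str, str]) -> float:
    """Fraction of evaluated FAIR-category gates that passed."""
    fair_results = [
        passed
        for gate_id, passed in results.items()
        if categories.get(gate_id) == "fair"
    ]
    if not fair_results:
        return 0.0  # assumption: no evaluated FAIR gates counts as zero
    return sum(fair_results) / len(fair_results)


results = {
    "FAIR-FIND": True,
    "FAIR-ACCESS": True,
    "FAIR-INTEROP": False,
    "FAIR-REUSE": True,
    "REPRO-CODE": True,
    "REPRO-DATA": False,
}

print(f"Score: {fair_score(results, CATEGORIES):.2f}")  # Score: 0.75
```

Flipping `REPRO-CODE` or `REPRO-DATA` leaves the result unchanged, which matches the statement that non-FAIR gates do not affect the score.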

Evaluation Details

The details key provides per-gate information:

for detail in evaluation["details"]:
    status = "PASS" if detail["passed"] else "FAIL"
    print(f"  [{status}] {detail['gate_id']}: {detail['gate_name']} ({detail['category']})")

Output:

  [PASS] FAIR-ACCESS: FAIR Accessibility Gate (fair)
  [PASS] FAIR-FIND: FAIR Findability Gate (fair)
  [FAIL] FAIR-INTEROP: FAIR Interoperability Gate (fair)
  [PASS] FAIR-REUSE: FAIR Reusability Gate (fair)
  [PASS] REPRO-CODE: Code Reproducibility Gate (reproducibility)
  [FAIL] REPRO-DATA: Data Reproducibility Gate (reproducibility)

Full Compliance Example

full_pass = {
    "FAIR-FIND": True,
    "FAIR-ACCESS": True,
    "FAIR-INTEROP": True,
    "FAIR-REUSE": True,
}

evaluation = registry.evaluate_fair_compliance(full_pass)
print(f"Score: {evaluation['score']}")         # 1.0
print(f"Level: {evaluation['compliance_level']}")  # full

Reproducibility Workflow

A step-by-step process for achieving reproducibility using the OSC (Open Science Compliance Officer) persona.

Step 1: Load the Registry

from fcc.governance.open_science import OpenScienceRegistry

registry = OpenScienceRegistry.from_data_dir()

Step 2: Identify Applicable Gates

# Get all mandatory gates
mandatory_gates = registry.mandatory_gates()
print(f"Mandatory gates to satisfy: {len(mandatory_gates)}")

for gate in mandatory_gates:
    print(f"\n  [{gate.id}] {gate.name}")
    print(f"  Severity: {gate.severity}")
    print("  Criteria:")
    for criterion in gate.criteria:
        print(f"    - {criterion}")

Step 3: Generate Checklists

# Generate checklists for data management templates
dm_templates = registry.templates_by_category("data_management")
for template in dm_templates:
    checklist = registry.generate_checklist(template.id, project_name="My Project")
    print(checklist)
    print("---")

Step 4: Conduct Assessment

# After completing checklist items, record gate results
gate_results = {}
for gate in registry.all_gates():
    # In a real workflow, this would be populated by the OSC persona
    # after reviewing evidence
    gate_results[gate.id] = True  # or False based on review

evaluation = registry.evaluate_fair_compliance(gate_results)

Step 5: Report Compliance

print("\nFAIR Compliance Report")
print("=" * 40)
print(f"Score: {evaluation['score']:.0%}")
print(f"Level: {evaluation['compliance_level']}")
# The score covers FAIR gates only; this tally counts every evaluated gate.
total_evaluated = len(evaluation["passed_gates"]) + len(evaluation["failed_gates"])
print(f"Passed: {len(evaluation['passed_gates'])} / {total_evaluated}")

if evaluation["failed_gates"]:
    print("\nAction required for:")
    for gate_id in evaluation["failed_gates"]:
        gate = registry.get_gate(gate_id)
        print(f"  - {gate.name if gate else gate_id}")
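
Since gates carry either mandatory or preferred severity, a report could also separate failures by severity. The sketch below shows one way to do that. Treating mandatory failures as "blocking" and preferred failures as "advisory" is an assumption of this example, not documented behavior, and `split_failures` and `SEVERITIES` are hypothetical names.

```python
from typing import Dict, List, Tuple

# Severities copied from the mapping table in this demo.
SEVERITIES = {
    "FAIR-FIND": "mandatory",
    "FAIR-ACCESS": "mandatory",
    "FAIR-INTEROP": "mandatory",
    "FAIR-REUSE": "mandatory",
    "REPRO-CODE": "mandatory",
    "REPRO-DATA": "preferred",
}


def split_failures(
    failed_gate_ids: List[str], severities: Dict[str, str]
) -> Tuple[List[str], List[str]]:
    """Partition failed gate IDs into (blocking, advisory) lists."""
    blocking, advisory = [], []
    for gate_id in failed_gate_ids:
        # Unknown gates are treated as mandatory to stay on the safe side.
        if severities.get(gate_id, "mandatory") == "mandatory":
            blocking.append(gate_id)
        else:
            advisory.append(gate_id)
    return blocking, advisory


blocking, advisory = split_failures(["FAIR-INTEROP", "REPRO-DATA"], SEVERITIES)
print(f"Blocking: {blocking}")  # Blocking: ['FAIR-INTEROP']
print(f"Advisory: {advisory}")  # Advisory: ['REPRO-DATA']
```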

[Figure: Compliance dashboard]


OSC Persona Audit

The Open Science Compliance Officer (OSC) persona is designed to drive FAIR compliance audits. In the FCC workflow, the OSC persona:

  1. Loads the OpenScienceRegistry at the start of the Critique phase.
  2. Generates checklists for all applicable templates.
  3. Evaluates each gate against project artifacts.
  4. Reports the FAIR compliance score and failed gates.
  5. Publishes a GOVERNANCE_GATE_PASSED or GOVERNANCE_GATE_FAILED event to the EventBus for each gate result.

Event Integration

from fcc.messaging.bus import EventBus
from fcc.messaging.events import Event, EventType

bus = EventBus()

# OSC persona publishes gate results
for gate_id, passed in gate_results.items():
    bus.publish(Event(
        event_type=(
            EventType.GOVERNANCE_GATE_PASSED if passed
            else EventType.GOVERNANCE_GATE_FAILED
        ),
        source="osc_persona",
        payload={
            "gate_id": gate_id,
            "passed": passed,
            "compliance_level": evaluation["compliance_level"],
        },
    ))

These events flow through the messaging pipeline (EventBus, WebSocket, SSE) and can be monitored in the React frontend's Protocol Explorer. See the Messaging Realtime Demo for details on how events are streamed to the browser.
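
When testing the audit logic in isolation, it can help to build the event payloads as plain dictionaries before handing them to a bus. The sketch below does that without importing `fcc.messaging`. `build_gate_events` is a hypothetical helper, and the event type names are plain strings matching the enum members used in the example above, not the real `EventType` values.

```python
from typing import Dict, List


def build_gate_events(
    gate_results: Dict[str, bool], compliance_level: str
) -> List[dict]:
    """Build one plain-dict event per gate result, in input order."""
    return [
        {
            "event_type": (
                "GOVERNANCE_GATE_PASSED" if passed else "GOVERNANCE_GATE_FAILED"
            ),
            "source": "osc_persona",
            "payload": {
                "gate_id": gate_id,
                "passed": passed,
                "compliance_level": compliance_level,
            },
        }
        for gate_id, passed in gate_results.items()
    ]


events = build_gate_events(
    {"FAIR-FIND": True, "FAIR-INTEROP": False},
    compliance_level="partial",
)
for event in events:
    print(event["event_type"], event["payload"]["gate_id"])
```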


See Also

  • Messaging Realtime Demo -- event streaming for gate results
  • Distiller Bridge Demo -- data analysis driving compliance checks
  • Web Frontend Guided Demo -- viewing compliance dashboards
  • src/fcc/governance/open_science.py -- full source
  • src/fcc/data/governance/open_science_gates.yaml -- gate definitions
  • src/fcc/data/governance/open_science_templates.yaml -- template definitions