Reproducibility Checklist
A reproducibility checklist is a structured, binary-scored instrument for verifying that an experiment, model, or analysis can be independently reproduced. Complete this checklist before submitting a work product for FCC Critique or before sharing results with any sibling ecosystem. Every item links to the open-science artifact (OPEN-SCI-001 through OPEN-SCI-010) where the supporting detail belongs; the checklist itself is a pointer-rich summary, not a substitute for the underlying documentation.
Template
Section 1: Checklist Metadata
Instructions: Assign a stable checklist ID, identify the artifact under review, and name the reviewer who will verify the claims. The related-templates table is mandatory — dangling checklists without linked cards are rejected at Critique.
| Field | Value |
|---|---|
| Checklist ID | [FILL — e.g. REPRO-2026-001] |
| Artifact under review | [FILL] |
| Author(s) | [FILL] |
| Reviewer | [FILL] |
| Date completed | [FILL] |
| Preregistration (OPEN-SCI-001) | [Link / N/A] |
| Model Card (OPEN-SCI-004a) | [Link / N/A] |
| Dataset Card (OPEN-SCI-004b) | [Link / N/A] |
Section 2: Method & Data
Instructions: Tick each item only when the corresponding evidence is committed and traceable. Use the Notes field for any partial compliance or deliberate exception.
- Pseudocode or algorithm description provided (not just prose)
- All assumptions stated formally
- Hypotheses distinguished from established facts (no HARKing)
- Dataset Card completed for every dataset used
- Data splits documented (percentages, stratification method)
- No data leakage between train and test verified
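The leakage item above is the one most often ticked without evidence. A minimal sketch of what "verified" can mean in practice, assuming rows are hashable tuples (the function names here are illustrative, not part of the FCC tooling):

```python
import hashlib

def row_hash(row: tuple) -> str:
    """Stable content hash of a data row, independent of which split it sits in."""
    return hashlib.sha256(repr(row).encode()).hexdigest()

def check_no_leakage(train_rows, test_rows) -> set:
    """Return the set of row hashes present in both splits (empty set = no leakage)."""
    train_hashes = {row_hash(r) for r in train_rows}
    test_hashes = {row_hash(r) for r in test_rows}
    return train_hashes & test_hashes

train = [(1, "a"), (2, "b"), (3, "c")]
test = [(4, "d"), (5, "e")]
assert not check_no_leakage(train, test)      # disjoint splits pass
assert check_no_leakage(train, [(3, "c")])    # a duplicated row is flagged
```

Committing the output of such a check alongside the split documentation gives the reviewer something traceable, as the Section 2 instructions require.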
Section 3: Experimental Setup
Instructions: Pin every stochastic or environment-sensitive input. "Pinned" means a committed file or hash, not an ambient shell environment.
- All hyperparameters listed with selection rationale
- Random seeds specified for every stochastic operation
- Hardware documented (GPU/CPU, memory, accelerator count)
- Software versions pinned (lockfile or pyproject.toml)
- Environment reproducible via Docker / conda / venv
- Configuration files committed (not just CLI flags)
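The seed and pinning items above can be made concrete with a small helper that seeds every stochastic source at startup and returns a record suitable for committing next to the configuration. This is a hedged sketch, assuming a pure-Python run; the function name is hypothetical and framework seeding (NumPy, PyTorch) is indicated only in comments:

```python
import os
import random

def pin_seeds(seed: int = 42) -> dict:
    """Seed every stochastic source used by the run and return the record
    that should be committed alongside the configuration files."""
    random.seed(seed)
    os.environ["PYTHONHASHSEED"] = str(seed)
    # If numpy or torch are in use, seed them here as well, e.g.:
    #   numpy.random.default_rng(seed); torch.manual_seed(seed)
    return {"seed": seed, "PYTHONHASHSEED": os.environ["PYTHONHASHSEED"]}

record = pin_seeds(2026)
assert record["seed"] == 2026
```

Writing the returned record into a committed config file (rather than relying on ambient shell state) is exactly what "pinned" means in the instructions above.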
Section 4: Results & Compute
Instructions: Confirm uncertainty quantification is present and that compute + carbon footprints are declared.
- Error bars or confidence intervals reported for key metrics
- Effect sizes reported alongside p-values
- Results disaggregated by relevant factors
- Negative results documented and not suppressed
- Total compute time documented
- Energy / carbon estimate provided
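For the error-bar item, a percentile bootstrap is one stdlib-only way to produce a confidence interval for a key metric when the sampling distribution is unknown. A minimal sketch (function name and parameters are illustrative):

```python
import random
import statistics

def bootstrap_ci(values, n_resamples=2000, alpha=0.05, seed=0):
    """Percentile bootstrap confidence interval for the mean of `values`."""
    rng = random.Random(seed)  # seeded, so the CI itself is reproducible
    means = sorted(
        statistics.mean(rng.choices(values, k=len(values)))
        for _ in range(n_resamples)
    )
    lo = means[int((alpha / 2) * n_resamples)]
    hi = means[int((1 - alpha / 2) * n_resamples) - 1]
    return lo, hi

scores = [0.81, 0.79, 0.84, 0.80, 0.83, 0.78, 0.82]
lo, hi = bootstrap_ci(scores)
assert lo <= statistics.mean(scores) <= hi
```

Note the seeded RNG: an uncertainty estimate that changes on every run would itself fail Section 3.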
Section 5: Ethics & Governance
Instructions: These gates must close before promotion — every item traces to an EU AI Act or NIST AI RMF requirement in src/fcc/data/compliance/.
- Bias assessment completed (link to Model Card §5)
- LLM usage declared (models, prompts, components)
- License compliance verified for all dependencies + data
- IP / patent implications reviewed
- Sensitive-data handling compliant with policy
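A starting point for the license-compliance item is to enumerate the declared license of every installed dependency via the stdlib; this is only a first pass (metadata can be missing or inaccurate), and the function name is illustrative:

```python
from importlib.metadata import distributions

def dependency_licenses() -> dict:
    """Map each installed distribution to its declared License metadata field."""
    return {
        dist.metadata["Name"]: dist.metadata.get("License", "UNKNOWN")
        for dist in distributions()
    }

licenses = dependency_licenses()
assert isinstance(licenses, dict)
```

The resulting mapping can be diffed against an approved-license allowlist; any "UNKNOWN" entry needs manual review before the item is ticked.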
Section 6: Readiness Assessment
Instructions: Compute the total fraction of ticked items and select one of the three readiness labels. Any Not ready result must list blocking items explicitly.
- Ready for FCC Critique (>90 % checked, all required items green)
- Ready with caveats (>75 % checked, exceptions documented)
- Not ready (<75 % checked, blockers listed below)
Blocking items: [FILL — or "None"]
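The three thresholds above reduce to a small, mechanical computation. A sketch (it covers only the percentage gate; the "all required items green" condition for the top label must still be checked separately):

```python
def readiness_label(checked: int, total: int) -> str:
    """Map the fraction of ticked checklist items to a readiness label."""
    frac = checked / total
    if frac > 0.90:
        return "Ready for FCC Critique"
    if frac > 0.75:
        return "Ready with caveats"
    return "Not ready"

assert readiness_label(19, 20) == "Ready for FCC Critique"
assert readiness_label(16, 20) == "Ready with caveats"
assert readiness_label(10, 20) == "Not ready"
```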
Adoption Checklist
- All required sections completed
- Artifact peer-reviewed by at least one R.I.S.C.E.A.R. peer
- Stored in the project's designated docs location
- Linked from README or equivalent index
- Versioned + date-stamped, with related-template links resolvable
References
- PHOENIX v4.0.0 — docs/resources/templates/open-science/reproducibility-checklist.md
- AAAI-26 Reproducibility Checklist (https://aaai.org/conference/aaai/aaai-26/reproducibility-checklist/)
- NeurIPS Paper Checklist — 16 categories (https://neurips.cc/public/guides/PaperChecklist)
- Semmelrock, H. et al. (2025) — Reproducibility in ML, AI Magazine
- Pineau, J. et al. (2021) — Improving Reproducibility in ML Research, JMLR 22
FCC Integration
This template is referenced from the Forensic Auditor persona
(src/fcc/data/personas/forensic_auditor.yaml) as part of the
Critique-phase evidence set. The checklist is the single most frequent
audit surface — Forensic Auditors treat any Critique submission without
a completed Reproducibility Checklist as a P1 finding. See also the
gates in src/fcc/data/governance/open_science_gates.yaml and
src/fcc/data/governance/quality_gates.yaml.