Reproducibility Checklist
A reproducibility checklist is a structured, binary-scored instrument for verifying that an experiment, model, or analysis can be independently reproduced. Complete this checklist before submitting a work product for FCC Critique or before sharing results with any sibling ecosystem. Every item links to the open-science artifact (OPEN-SCI-001 through OPEN-SCI-010) where the supporting detail belongs; the checklist itself is a pointer-rich summary, not a substitute for the underlying documentation.
Template
Section 1: Checklist Metadata
Instructions: Assign a stable checklist ID, identify the artifact under review, and name the reviewer who will verify the claims. The related-templates table is mandatory — dangling checklists without linked cards are rejected at Critique.
| Field | Value |
|---|---|
| Checklist ID | [FILL — e.g. REPRO-2026-001] |
| Artifact under review | [FILL] |
| Author(s) | [FILL] |
| Reviewer | [FILL] |
| Date completed | [FILL] |
| Preregistration (OPEN-SCI-001) | [Link / N/A] |
| Model Card (OPEN-SCI-004a) | [Link / N/A] |
| Dataset Card (OPEN-SCI-004b) | [Link / N/A] |
Section 2: Method & Data
Instructions: Tick each item only when the corresponding evidence is committed and traceable. Use the Notes field for any partial compliance or deliberate exception.
- Pseudocode or algorithm description provided (not just prose)
- All assumptions stated formally
- Hypotheses distinguished from established facts (no HARKing)
- Dataset Card completed for every dataset used
- Data splits documented (percentages, stratification method)
- No data leakage between train and test verified
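The leakage item above is the one most often ticked without evidence. A minimal sketch of what "verified" can mean in practice, assuming rows are hashable tuples (the function names here are illustrative, not part of the FCC tooling):

```python
import hashlib

def row_hash(row: tuple) -> str:
    """Stable content hash of a data row, independent of which split it sits in."""
    return hashlib.sha256(repr(row).encode()).hexdigest()

def check_no_leakage(train_rows, test_rows) -> set:
    """Return the set of row hashes present in both splits (empty set = no leakage)."""
    train_hashes = {row_hash(r) for r in train_rows}
    test_hashes = {row_hash(r) for r in test_rows}
    return train_hashes & test_hashes

train = [(1, "a"), (2, "b"), (3, "c")]
test = [(4, "d"), (5, "e")]
assert not check_no_leakage(train, test)      # disjoint splits pass
assert check_no_leakage(train, [(3, "c")])    # a duplicated row is flagged
```

Committing the output of such a check alongside the split documentation gives the reviewer something traceable, as the Section 2 instructions require.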
Section 3: Experimental Setup
Instructions: Pin every stochastic or environment-sensitive input. "Pinned" means a committed file or hash, not an ambient shell environment.
- All hyperparameters listed with selection rationale
- Random seeds specified for every stochastic operation
- Hardware documented (GPU/CPU, memory, accelerator count)
- Software versions pinned (lockfile or pyproject.toml)
- Environment reproducible via Docker / conda / venv
- Configuration files committed (not just CLI flags)
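The seed and pinning items above can be made concrete with a small helper that seeds every stochastic source at startup and returns a record suitable for committing next to the configuration. This is a hedged sketch, assuming a pure-Python run; the function name is hypothetical and framework seeding (NumPy, PyTorch) is indicated only in comments:

```python
import os
import random

def pin_seeds(seed: int = 42) -> dict:
    """Seed every stochastic source used by the run and return the record
    that should be committed alongside the configuration files."""
    random.seed(seed)
    os.environ["PYTHONHASHSEED"] = str(seed)
    # If numpy or torch are in use, seed them here as well, e.g.:
    #   numpy.random.default_rng(seed); torch.manual_seed(seed)
    return {"seed": seed, "PYTHONHASHSEED": os.environ["PYTHONHASHSEED"]}

record = pin_seeds(2026)
assert record["seed"] == 2026
```

Writing the returned record into a committed config file (rather than relying on ambient shell state) is exactly what "pinned" means in the instructions above.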
Section 4: Results & Compute
Instructions: Confirm uncertainty quantification is present and that compute + carbon footprints are declared.
- Error bars or confidence intervals reported for key metrics
- Effect sizes reported alongside p-values
- Results disaggregated by relevant factors
- Negative results documented and not suppressed
- Total compute time documented
- Energy / carbon estimate provided
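For the error-bar item, a percentile bootstrap is one stdlib-only way to produce a confidence interval for a key metric when the sampling distribution is unknown. A minimal sketch (function name and parameters are illustrative):

```python
import random
import statistics

def bootstrap_ci(values, n_resamples=2000, alpha=0.05, seed=0):
    """Percentile bootstrap confidence interval for the mean of `values`."""
    rng = random.Random(seed)  # seeded, so the CI itself is reproducible
    means = sorted(
        statistics.mean(rng.choices(values, k=len(values)))
        for _ in range(n_resamples)
    )
    lo = means[int((alpha / 2) * n_resamples)]
    hi = means[int((1 - alpha / 2) * n_resamples) - 1]
    return lo, hi

scores = [0.81, 0.79, 0.84, 0.80, 0.83, 0.78, 0.82]
lo, hi = bootstrap_ci(scores)
assert lo <= statistics.mean(scores) <= hi
```

Note the seeded RNG: an uncertainty estimate that changes on every run would itself fail Section 3.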
Section 5: Ethics & Governance
Instructions: These gates must close before promotion — every item traces to an EU AI Act or NIST AI RMF requirement in src/fcc/data/compliance/.
- Bias assessment completed (link to Model Card §5)
- LLM usage declared (models, prompts, components)
- License compliance verified for all dependencies + data
- IP / patent implications reviewed
- Sensitive-data handling compliant with policy
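A starting point for the license-compliance item is to enumerate the declared license of every installed dependency via the stdlib; this is only a first pass (metadata can be missing or inaccurate), and the function name is illustrative:

```python
from importlib.metadata import distributions

def dependency_licenses() -> dict:
    """Map each installed distribution to its declared License metadata field."""
    return {
        dist.metadata["Name"]: dist.metadata.get("License", "UNKNOWN")
        for dist in distributions()
    }

licenses = dependency_licenses()
assert isinstance(licenses, dict)
```

The resulting mapping can be diffed against an approved-license allowlist; any "UNKNOWN" entry needs manual review before the item is ticked.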
Section 6: Readiness Assessment
Instructions: Compute the total fraction of ticked items and select one of the three readiness labels. Any Not ready result must list blocking items explicitly.
- Ready for FCC Critique (>90 % checked, all required items green)
- Ready with caveats (>75 % checked, exceptions documented)
- Not ready (<75 % checked, blockers listed below)
Blocking items: [FILL — or "None"]
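The three thresholds above reduce to a small, mechanical computation. A sketch (it covers only the percentage gate; the "all required items green" condition for the top label must still be checked separately):

```python
def readiness_label(checked: int, total: int) -> str:
    """Map the fraction of ticked checklist items to a readiness label."""
    frac = checked / total
    if frac > 0.90:
        return "Ready for FCC Critique"
    if frac > 0.75:
        return "Ready with caveats"
    return "Not ready"

assert readiness_label(19, 20) == "Ready for FCC Critique"
assert readiness_label(16, 20) == "Ready with caveats"
assert readiness_label(10, 20) == "Not ready"
```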
Adoption Checklist
- All required sections completed
- Artifact peer-reviewed by at least one R.I.S.C.E.A.R. peer
- Stored in the project's designated docs location
- Linked from README or equivalent index
- Versioned + date-stamped, with related-template links resolvable
References
- PHOENIX v4.0.0 — docs/resources/templates/open-science/reproducibility-checklist.md
- AAAI-26 Reproducibility Checklist (https://aaai.org/conference/aaai/aaai-26/reproducibility-checklist/)
- NeurIPS Paper Checklist — 16 categories (https://neurips.cc/public/guides/PaperChecklist)
- Semmelrock, H. et al. (2025) — Reproducibility in ML, AI Magazine
- Pineau, J. et al. (2021) — Improving Reproducibility in ML Research, JMLR 22
FCC Integration
This template is referenced from the Forensic Auditor persona
(src/fcc/data/personas/forensic_auditor.yaml) as part of the
Critique-phase evidence set. The checklist is the single most frequent
audit surface — Forensic Auditors treat any Critique submission without
a completed Reproducibility Checklist as a P1 finding. See also the
gates in src/fcc/data/governance/open_science_gates.yaml and
src/fcc/data/governance/quality_gates.yaml.