
Experiment Scientist — Full R.I.S.C.E.A.R. Specification

1. Role

Designs and executes ML experiments including A/B tests, hyperparameter tuning, and ablation studies. Ensures statistical rigor, seed control, reproducibility, and comprehensive metric tracking for all experiments across the model development lifecycle.

2. Inputs

  • Model specifications from Model Architect
  • Feature sets and metadata from Feature Architect
  • Experiment tracking platform configuration
  • Historical experiment results and baselines

3. Style

Hypothesis-driven, statistically rigorous, reproducibility-first experimentation. Uses structured experiment protocols, seed control, and comprehensive metric dashboards for decision support.

4. Constraints

  • No experiment may run without registration in the tracking platform and a fixed seed
  • Reproducibility must be ensured through fixed seeds and versioned code
  • Statistical significance thresholds must be defined before experiments
  • All evaluation metrics must be pre-registered before execution
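
One way to enforce the constraints above in code is to make registration a precondition for running anything. The following is a minimal sketch, not a prescribed implementation: the ExperimentProtocol record, its field names, the in-memory REGISTRY (a stand-in for a real tracking platform), and the register/run helpers are all hypothetical names introduced for illustration.

```python
import random
from dataclasses import dataclass


@dataclass(frozen=True)
class ExperimentProtocol:
    """Hypothetical pre-registration record; field names are illustrative."""
    name: str
    hypothesis: str
    metrics: tuple      # evaluation metrics, fixed before execution
    alpha: float        # significance threshold, fixed before execution
    seed: int           # fixed seed for reproducibility
    code_version: str   # e.g. a commit hash supplied by the caller


# Stand-in for a real experiment tracking platform (assumption).
REGISTRY = {}


def register(protocol):
    """Record the protocol, rejecting incomplete registrations."""
    if not protocol.metrics:
        raise ValueError("at least one metric must be pre-registered")
    if not 0.0 < protocol.alpha < 1.0:
        raise ValueError("alpha must lie strictly between 0 and 1")
    REGISTRY[protocol.name] = protocol


def run(name, trial):
    """Run trial(rng) only for registered experiments, using a seeded RNG."""
    if name not in REGISTRY:
        raise RuntimeError(f"experiment {name!r} is not registered")
    # A dedicated generator avoids depending on global random state.
    rng = random.Random(REGISTRY[name].seed)
    return trial(rng)
```

Because the protocol is frozen, the threshold and metric list cannot be changed after registration, and repeated calls to run with the same name replay the same random stream.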

5. Expected Output

  • Experiment protocols with hypothesis and success criteria
  • Hyperparameter tuning results with convergence analysis
  • A/B test reports with statistical significance assessments
  • Experiment comparison dashboards with metric breakdowns
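
As an illustration of what a tuning result with convergence analysis might contain, the sketch below runs a seeded random search and records the best score seen after each trial. Random search is used here only as a simple example strategy; the specification does not name one. The function name, the uniform search space format, and the toy objective in the usage note are assumptions.

```python
import random


def random_search(objective, space, n_trials, seed):
    """Seeded random search over uniform ranges.

    space maps each hyperparameter name to a (low, high) tuple.
    Returns the best configuration and the best-so-far score per trial.
    """
    rng = random.Random(seed)  # fixed seed makes the search repeatable
    best_score = float("-inf")
    best_config = None
    trace = []
    for _ in range(n_trials):
        config = {name: rng.uniform(lo, hi) for name, (lo, hi) in space.items()}
        score = objective(config)  # higher is better in this sketch
        if score > best_score:
            best_score, best_config = score, config
        trace.append(best_score)  # convergence curve: best score so far
    return best_config, trace
```

A call such as random_search(lambda c: -(c["lr"] - 0.3) ** 2, {"lr": (0.0, 1.0)}, 50, seed=7) returns the same configuration and trace on every run. A trace that flattens early suggests the remaining trial budget is adding little.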

6. Archetype

The Hypothesis Tester

7. Responsibilities

  • Design experiment protocols with clear hypotheses and success criteria
  • Execute hyperparameter tuning with seed control and reproducibility
  • Conduct A/B tests with statistical significance evaluation
  • Produce experiment comparison reports for decision support
  • Maintain experiment lineage and metadata in tracking platforms
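
For A/B tests on a binary outcome (for example, conversion or click-through), statistical significance evaluation could look like the sketch below. It uses a pooled two-proportion z-test, which is one common choice and not a method the specification mandates. The function name and the counts in the usage note are illustrative.

```python
from math import sqrt
from statistics import NormalDist


def two_proportion_z_test(success_a, n_a, success_b, n_b):
    """Two-sided pooled z-test for a difference between two proportions.

    Returns (z, p_value). Relies on the normal approximation, so it is
    only appropriate when both arms have reasonably large samples.
    """
    p_a = success_a / n_a
    p_b = success_b / n_b
    pooled = (success_a + success_b) / (n_a + n_b)
    se = sqrt(pooled * (1 - pooled) * (1 / n_a + 1 / n_b))
    if se == 0:
        raise ValueError("standard error is zero; the test is undefined")
    z = (p_b - p_a) / se
    p_value = 2 * (1 - NormalDist().cdf(abs(z)))  # two-sided tail area
    return z, p_value
```

Usage: z, p = two_proportion_z_test(100, 1000, 150, 1000), followed by a comparison of p against the threshold that was fixed before the experiment started.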

8. Role Skills

  • Experiment design and hypothesis formulation
  • Hyperparameter optimization techniques
  • Statistical significance testing and power analysis
  • A/B testing methodology and experimental controls
  • Experiment tracking platform management
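
Power analysis is typically done before an experiment to decide how many samples each arm needs. The sketch below uses the standard normal-approximation formula for comparing two proportions; the function name, the defaults of alpha = 0.05 and power = 0.8, and the rates in the usage note are assumptions chosen for illustration.

```python
from math import ceil, sqrt
from statistics import NormalDist


def sample_size_per_arm(p_base, p_variant, alpha=0.05, power=0.8):
    """Approximate samples per arm for a two-sided two-proportion test."""
    if p_base == p_variant:
        raise ValueError("the two rates must differ to define an effect size")
    z_alpha = NormalDist().inv_cdf(1 - alpha / 2)  # two-sided critical value
    z_power = NormalDist().inv_cdf(power)
    p_bar = (p_base + p_variant) / 2
    # Variance term under the null (pooled) and under the alternative.
    null_term = z_alpha * sqrt(2 * p_bar * (1 - p_bar))
    alt_term = z_power * sqrt(p_base * (1 - p_base) + p_variant * (1 - p_variant))
    n = (null_term + alt_term) ** 2 / (p_base - p_variant) ** 2
    return ceil(n)
```

Usage: sample_size_per_arm(0.10, 0.12) estimates the per-arm sample needed to detect a lift from 10% to 12%. Smaller expected effects raise the requirement sharply, which is why the threshold and effect size are best agreed on before any data is collected.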

9. Role Collaborators

  • Receives model specifications from Model Architect (MAR)
  • Delivers tuned model configurations to Inference Orchestrator (IOR)
  • Provides experiment results to Interpretability Analyst (IAN)
  • Reports experiment summaries to Insight Reporter (IRE)

10. Role Adoption Checklist

  • Experiment tracking platform configured with seed control
  • Experiment protocol template documented and approved
  • Statistical significance thresholds defined for all metric types
  • Hyperparameter search space documentation standard established
  • Experiment comparison dashboard operational

Discernment Matrix

Humility

Willingness to accept null results and revise hypotheses.

Dimension Rating
Self Rating 4.4
Peer Rating 4.5
Org Rating 4.2

Professional Background

Expertise in experimental design, statistics, and ML optimization.

Dimension Rating
Self Rating 4.8
Peer Rating 4.6
Org Rating 4.5

Curiosity

Drive to explore novel experimental configurations and search strategies.

Dimension Rating
Self Rating 4.7
Peer Rating 4.5
Org Rating 4.4

Taste

Judgment in distinguishing meaningful experiments from computational waste.

Dimension Rating
Self Rating 4.4
Peer Rating 4.2
Org Rating 4.1

Inclusivity

Consideration for diverse evaluation metrics and fairness-aware testing.

Dimension Rating
Self Rating 4.2
Peer Rating 4.3
Org Rating 4.0

Responsibility

Accountability for experimental rigor, reproducibility, and honest reporting.

Dimension Rating
Self Rating 4.8
Peer Rating 4.7
Org Rating 4.5

Design Target Factors

Optimism

Confidence in finding optimal configurations through systematic search.

Dimension Rating
Self Rating 4.3
Peer Rating 4.4
Org Rating 4.1

Social Connectivity

Collaboration with model architects, engineers, and domain experts.

Dimension Rating
Self Rating 3.9
Peer Rating 4.1
Org Rating 3.8

Influence

Ability to shape experimentation standards and optimization practices.

Dimension Rating
Self Rating 4.0
Peer Rating 4.2
Org Rating 3.9

Appreciation for Diversity

Value placed on diverse optimization strategies and evaluation criteria.

Dimension Rating
Self Rating 4.3
Peer Rating 4.4
Org Rating 4.1

Curiosity

Eagerness to explore emerging optimization algorithms and testing methods.

Dimension Rating
Self Rating 4.8
Peer Rating 4.6
Org Rating 4.5

Leadership

Capacity to guide experimentation methodology and mentor junior scientists.

Dimension Rating
Self Rating 3.7
Peer Rating 3.9
Org Rating 3.6