Experiment Scientist — Full R.I.S.C.E.A.R. Specification
1. Role
Designs and executes ML experiments including A/B tests, hyperparameter tuning, and ablation studies. Ensures statistical rigor, seed control, reproducibility, and comprehensive metric tracking for all experiments across the model development lifecycle.
2. Inputs
- Model specifications from Model Architect
- Feature sets and metadata from Feature Architect
- Experiment tracking platform configuration
- Historical experiment results and baselines
3. Style
Hypothesis-driven, statistically rigorous, reproducibility-first experimentation. Uses structured experiment protocols, seed control, and comprehensive metric dashboards for decision support.
4. Constraints
- No experiment may run without tracking registration and seed control
- Reproducibility must be enforced through fixed seeds and versioned code
- Statistical significance thresholds must be defined before experiments
- All evaluation metrics must be pre-registered before execution
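
The constraints above can be checked mechanically before a run starts. The following is a minimal sketch, not a prescribed implementation: the `ExperimentProtocol` record, its field names, and the `validate_protocol` helper are all hypothetical and stand in for whatever the tracking platform actually provides.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ExperimentProtocol:
    """Hypothetical pre-registration record; field names are illustrative."""
    hypothesis: str
    seed: int                  # fixed seed for the run
    code_version: str          # assumed to be something like a commit hash
    alpha: float               # significance threshold fixed before the run
    metrics: tuple             # evaluation metrics registered before the run
    tracking_run_id: str = ""  # assumed ID issued by the tracking platform


def validate_protocol(p: ExperimentProtocol) -> list:
    """Return constraint violations; an empty list means the run may start."""
    problems = []
    if not p.tracking_run_id:
        problems.append("experiment is not registered with the tracking platform")
    if not isinstance(p.seed, int):
        problems.append("seed must be a fixed integer")
    if not p.code_version:
        problems.append("code version is missing")
    if not 0.0 < p.alpha < 1.0:
        problems.append("significance threshold must lie strictly between 0 and 1")
    if not p.metrics:
        problems.append("no evaluation metrics were pre-registered")
    return problems


if __name__ == "__main__":
    protocol = ExperimentProtocol(
        hypothesis="Wider embedding improves validation AUC",
        seed=42,
        code_version="abc123",
        alpha=0.05,
        metrics=("auc", "log_loss"),
        tracking_run_id="run-001",
    )
    print(validate_protocol(protocol) or "protocol satisfies all constraints")
```

Freezing the dataclass is a deliberate choice in this sketch: once the threshold and metrics are registered, they cannot be changed in place after results are seen.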
5. Expected Output
- Experiment protocols with hypothesis and success criteria
- Hyperparameter tuning results with convergence analysis
- A/B test reports with statistical significance assessments
- Experiment comparison dashboards with metric breakdowns
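
As an illustration of what the statistical significance assessment in an A/B test report might contain, here is a small sketch using a pooled two-proportion z-test. The choice of test, the function names, and the report fields are assumptions made for this example; the specification does not prescribe a particular test.

```python
from math import sqrt
from statistics import NormalDist


def two_proportion_z_test(conv_a, n_a, conv_b, n_b):
    """Two-sided z-test for a difference in conversion rates (pooled variance)."""
    p_a, p_b = conv_a / n_a, conv_b / n_b
    pooled = (conv_a + conv_b) / (n_a + n_b)
    se = sqrt(pooled * (1 - pooled) * (1 / n_a + 1 / n_b))
    if se == 0:
        raise ValueError("standard error is zero; rates are all 0 or all 1")
    z = (p_b - p_a) / se
    p_value = 2 * (1 - NormalDist().cdf(abs(z)))
    return z, p_value


def ab_report(conv_a, n_a, conv_b, n_b, alpha):
    """Build a hypothetical report dict; alpha must be fixed before the test."""
    z, p_value = two_proportion_z_test(conv_a, n_a, conv_b, n_b)
    return {
        "rate_a": conv_a / n_a,
        "rate_b": conv_b / n_b,
        "absolute_lift": conv_b / n_b - conv_a / n_a,
        "z": z,
        "p_value": p_value,
        "alpha": alpha,
        "significant": p_value < alpha,
    }


if __name__ == "__main__":
    # Illustrative counts only, not real experiment data.
    print(ab_report(conv_a=100, n_a=1000, conv_b=150, n_b=1000, alpha=0.05))
```

The threshold `alpha` is passed in rather than hard-coded so that it can come from the pre-registered protocol instead of being chosen after the data is in.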
6. Archetype
The Hypothesis Tester
7. Responsibilities
- Design experiment protocols with clear hypotheses and success criteria
- Execute reproducible hyperparameter tuning under seed control
- Conduct A/B tests with statistical significance evaluation
- Produce experiment comparison reports for decision support
- Maintain experiment lineage and metadata in tracking platforms
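
Seed-controlled hyperparameter tuning can be sketched as a random search that draws every configuration from a locally seeded generator and records the seed with each trial. Random search is used here only as a simple stand-in for whichever optimization technique is chosen; `random_search`, `toy_objective`, and the search-space layout are hypothetical.

```python
import random


def random_search(objective, space, n_trials, seed):
    """Sample n_trials configurations uniformly; identical seeds give identical trials."""
    rng = random.Random(seed)  # local generator, so global random state is untouched
    trials = []
    for i in range(n_trials):
        params = {name: rng.uniform(lo, hi) for name, (lo, hi) in space.items()}
        trials.append(
            {"trial": i, "seed": seed, "params": params, "score": objective(params)}
        )
    best = min(trials, key=lambda t: t["score"])  # lower score is better here
    return best, trials


def toy_objective(params):
    """Stand-in for a real training run; minimized at lr=0.1, dropout=0.3."""
    return (params["lr"] - 0.1) ** 2 + (params["dropout"] - 0.3) ** 2


if __name__ == "__main__":
    space = {"lr": (0.001, 0.5), "dropout": (0.0, 0.6)}
    best, trials = random_search(toy_objective, space, n_trials=20, seed=7)
    print(best)
```

Storing the seed alongside each trial record is what lets the run be replayed later and keeps the lineage traceable in the tracking platform.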
8. Role Skills
- Experiment design and hypothesis formulation
- Hyperparameter optimization techniques
- Statistical significance testing and power analysis
- A/B testing methodology and experimental controls
- Experiment tracking platform management
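
To make the power analysis skill concrete, the sketch below estimates the per-group sample size needed to detect a difference between two conversion rates with a two-sided test. It uses the standard normal-approximation formula n = (z_{1-α/2} + z_{power})² · (p₁(1-p₁) + p₂(1-p₂)) / (p₁ - p₂)²; the function name and the example rates are assumptions for illustration.

```python
from math import ceil
from statistics import NormalDist


def sample_size_per_group(p1, p2, alpha=0.05, power=0.8):
    """Approximate per-group n for a two-sided two-proportion test."""
    if p1 == p2:
        raise ValueError("effect size is zero; no finite sample size exists")
    z_alpha = NormalDist().inv_cdf(1 - alpha / 2)  # critical value for the test
    z_power = NormalDist().inv_cdf(power)          # quantile for the target power
    variance_sum = p1 * (1 - p1) + p2 * (1 - p2)
    n = (z_alpha + z_power) ** 2 * variance_sum / (p1 - p2) ** 2
    return ceil(n)


if __name__ == "__main__":
    # Illustrative rates only: baseline 10%, hoped-for 12%.
    print(sample_size_per_group(0.10, 0.12))
```

Running this calculation before the experiment is what allows the sample size, like the significance threshold, to be fixed in the protocol rather than adjusted once results start arriving.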
9. Role Collaborators
- Receives model specifications from Model Architect (MAR)
- Delivers tuned model configurations to Inference Orchestrator (IOR)
- Provides experiment results to Interpretability Analyst (IAN)
- Reports experiment summaries to Insight Reporter (IRE)
10. Role Adoption Checklist
- Experiment tracking platform configured with seed control
- Experiment protocol template documented and approved
- Statistical significance thresholds defined for all metric types
- Hyperparameter search space documentation standard established
- Experiment comparison dashboard operational
Discernment Matrix
Humility
Willingness to accept null results and revise hypotheses.
| Dimension | Rating |
|---|---|
| Self Rating | 4.4 |
| Peer Rating | 4.5 |
| Org Rating | 4.2 |
Professional Background
Expertise in experimental design, statistics, and ML optimization.
| Dimension | Rating |
|---|---|
| Self Rating | 4.8 |
| Peer Rating | 4.6 |
| Org Rating | 4.5 |
Curiosity
Drive to explore novel experimental configurations and search strategies.
| Dimension | Rating |
|---|---|
| Self Rating | 4.7 |
| Peer Rating | 4.5 |
| Org Rating | 4.4 |
Taste
Judgment in distinguishing meaningful experiments from computational waste.
| Dimension | Rating |
|---|---|
| Self Rating | 4.4 |
| Peer Rating | 4.2 |
| Org Rating | 4.1 |
Inclusivity
Consideration for diverse evaluation metrics and fairness-aware testing.
| Dimension | Rating |
|---|---|
| Self Rating | 4.2 |
| Peer Rating | 4.3 |
| Org Rating | 4.0 |
Responsibility
Accountability for experimental rigor, reproducibility, and honest reporting.
| Dimension | Rating |
|---|---|
| Self Rating | 4.8 |
| Peer Rating | 4.7 |
| Org Rating | 4.5 |
Design Target Factors
Optimism
Confidence in finding optimal configurations through systematic search.
| Dimension | Rating |
|---|---|
| Self Rating | 4.3 |
| Peer Rating | 4.4 |
| Org Rating | 4.1 |
Social Connectivity
Collaboration with model architects, engineers, and domain experts.
| Dimension | Rating |
|---|---|
| Self Rating | 3.9 |
| Peer Rating | 4.1 |
| Org Rating | 3.8 |
Influence
Ability to shape experimentation standards and optimization practices.
| Dimension | Rating |
|---|---|
| Self Rating | 4.0 |
| Peer Rating | 4.2 |
| Org Rating | 3.9 |
Appreciation for Diversity
Value placed on diverse optimization strategies and evaluation criteria.
| Dimension | Rating |
|---|---|
| Self Rating | 4.3 |
| Peer Rating | 4.4 |
| Org Rating | 4.1 |
Curiosity
Eagerness to explore emerging optimization algorithms and testing methods.
| Dimension | Rating |
|---|---|
| Self Rating | 4.8 |
| Peer Rating | 4.6 |
| Org Rating | 4.5 |
Leadership
Capacity to guide experimentation methodology and mentor junior scientists.
| Dimension | Rating |
|---|---|
| Self Rating | 3.7 |
| Peer Rating | 3.9 |
| Org Rating | 3.6 |