Q-Learning Specialist: Scaffold Workflow
Description: Generate a new artifact from scratch.
When to Use
Use the scaffold workflow when you need to generate a new artifact from scratch rather than modify an existing one.
Input Requirements
- Environment specifications with state space, action space, and transition dynamics
- Reward function requirements and business objective mappings
- Safety constraints and operational boundary definitions
- Convergence criteria and computational training budgets
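As a rough illustration, the four inputs above could be bundled into one validated object before any training starts. The sketch below assumes a small, discrete (tabular) environment. The class name `ScaffoldInputs`, its field names, and the specific checks are hypothetical and not part of the workflow definition.

```python
from dataclasses import dataclass, field
from typing import Callable, Dict, FrozenSet, List, Tuple

# (probability, next_state) pairs for one (state, action) key
Transition = List[Tuple[float, int]]


@dataclass(frozen=True)
class ScaffoldInputs:
    """Hypothetical container for the scaffold inputs (tabular case only)."""

    n_states: int                                   # size of the state space
    n_actions: int                                  # size of the action space
    transitions: Dict[Tuple[int, int], Transition]  # transition dynamics
    reward_fn: Callable[[int, int, int], float]     # r(state, action, next_state)
    reward_objective: str                           # business objective the reward maps to
    forbidden: FrozenSet[Tuple[int, int]] = field(default_factory=frozenset)  # unsafe (state, action) pairs
    q_tolerance: float = 1e-4                       # convergence criterion
    max_episodes: int = 1_000                       # training budget

    def problems(self) -> List[str]:
        """Return a list of human-readable issues; an empty list means the inputs look usable."""
        found: List[str] = []
        for (s, a), outcomes in self.transitions.items():
            if not (0 <= s < self.n_states and 0 <= a < self.n_actions):
                found.append(f"transition key {(s, a)} is outside the declared spaces")
            if abs(sum(p for p, _ in outcomes) - 1.0) > 1e-9:
                found.append(f"probabilities for {(s, a)} do not sum to 1")
            if any(not 0 <= nxt < self.n_states for _, nxt in outcomes):
                found.append(f"transition from {(s, a)} leads outside the state space")
        for s, a in self.forbidden:
            if not (0 <= s < self.n_states and 0 <= a < self.n_actions):
                found.append(f"forbidden pair {(s, a)} is outside the declared spaces")
        if not self.reward_objective.strip():
            found.append("reward function has no documented business objective")
        if self.max_episodes <= 0 or self.q_tolerance <= 0:
            found.append("training budget and tolerance must be positive")
        return found
```

Calling `problems()` before the Initialize step gives an early, cheap rejection of incomplete inputs. Continuous or very large state spaces would need a different representation than the dictionaries assumed here.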
Process
- Initialize — Set up the scaffold context for Q-Learning Specialist
- Execute — Perform the scaffold operation following Q-Learning Specialist's style
- Validate — Check output against quality gates
- Handoff — Deliver results to downstream personas
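The Execute step is not spelled out above. One plausible reading, for a small discrete environment, is standard tabular Q-learning with epsilon-greedy exploration. The sketch below is a minimal version under that assumption. The function `train_q_table`, the toy `chain_step` environment, and every hyperparameter value are illustrative choices, not values prescribed by the workflow.

```python
import random


def train_q_table(n_states, n_actions, step, start_state, episodes=500,
                  alpha=0.5, gamma=0.9, epsilon=0.3, max_steps=100, seed=0):
    """Tabular Q-learning; `step(state, action)` returns (next_state, reward, done)."""
    rng = random.Random(seed)
    # Initialize: start from an all-zero Q-table.
    q = [[0.0] * n_actions for _ in range(n_states)]
    for _ in range(episodes):
        state = start_state
        for _ in range(max_steps):
            # Epsilon-greedy exploration, breaking ties between equal Q-values at random.
            if rng.random() < epsilon:
                action = rng.randrange(n_actions)
            else:
                best = max(q[state])
                action = rng.choice([a for a in range(n_actions) if q[state][a] == best])
            next_state, reward, done = step(state, action)
            # Q-learning target: bootstrap from the best next action unless the episode ended.
            target = reward if done else reward + gamma * max(q[next_state])
            q[state][action] += alpha * (target - q[state][action])
            state = next_state
            if done:
                break
    return q


def chain_step(state, action):
    """Hypothetical 5-state chain: action 1 moves right, action 0 moves left; state 4 is terminal."""
    next_state = min(state + 1, 4) if action == 1 else max(state - 1, 0)
    done = next_state == 4
    return next_state, (1.0 if done else 0.0), done


q_table = train_q_table(n_states=5, n_actions=2, step=chain_step, start_state=0)
# Greedy action for each non-terminal state.
policy = [max(range(2), key=lambda a: q_table[s][a]) for s in range(4)]
```

In this toy setting the learned greedy policy moves right in every non-terminal state. The returned Q-table is what the Validate step would then inspect before Handoff.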
Output
- Trained RL agents with policy weights and configuration documentation
- Reward function specifications with business objective alignment mapping
- Convergence analysis reports with training stability metrics
- Safety evaluation reports documenting constraint satisfaction
Quality Gates
- Safety constraints must be enforced throughout agent training and evaluation
- Reward functions must be documented with alignment to business objectives
- Convergence must be verified before deploying learned policies
- Exploration strategies must be justified with theoretical or empirical rationale
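Two of these gates, convergence and constraint satisfaction, lend themselves to an automated check. The sketch below assumes a tabular agent and treats safety constraints as a set of forbidden (state, action) pairs. The function names, the report keys, and the default tolerance are hypothetical. It inspects only the final greedy policy, so it does not cover enforcement during training, and the documentation and exploration-rationale gates still need human review.

```python
from typing import Dict, List, Set, Tuple


def greedy_policy(q: List[List[float]]) -> List[int]:
    """Pick the highest-valued action in each state (first action wins ties)."""
    return [max(range(len(row)), key=row.__getitem__) for row in q]


def max_q_change(q_prev: List[List[float]], q_curr: List[List[float]]) -> float:
    """Largest absolute change in any Q-value between two snapshots."""
    return max(
        abs(new - old)
        for row_prev, row_curr in zip(q_prev, q_curr)
        for old, new in zip(row_prev, row_curr)
    )


def check_gates(q_prev: List[List[float]], q_curr: List[List[float]],
                forbidden: Set[Tuple[int, int]], tolerance: float = 1e-4) -> Dict[str, object]:
    """Summarize the convergence and safety gates for two consecutive Q-table snapshots."""
    policy = greedy_policy(q_curr)
    # Safety gate: the greedy policy must never select a forbidden (state, action) pair.
    violations = [(s, a) for s, a in enumerate(policy) if (s, a) in forbidden]
    # Convergence gate: Q-values must have stopped moving by more than the tolerance.
    delta = max_q_change(q_prev, q_curr)
    converged = delta < tolerance
    return {
        "converged": converged,
        "max_q_change": delta,
        "safety_violations": violations,
        "passed": converged and not violations,
    }
```

A report like this could feed the convergence analysis and safety evaluation outputs. A small change between two snapshots is only a heuristic signal of convergence, so a stricter check may be warranted before deployment.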