Edge Inference Engineer — Scaffold Workflow

Description: Generate a new artifact from scratch

When to Use

Use the scaffold workflow when you need to generate a new artifact from scratch rather than modify an existing one.

Input Requirements

  • Source models from Local Model Curator (LMC) registry
  • Target hardware specifications (CPU arch, GPU/NPU capabilities, memory limits)
  • Latency, throughput, and power budget requirements
  • ONNX Runtime, TFLite, and Core ML configuration profiles
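The inputs above can be captured in a single structured spec. This is a minimal sketch; the field names, defaults, and the `lmc://` reference format are illustrative assumptions, not a defined schema.

```python
from dataclasses import dataclass, field

@dataclass
class TargetSpec:
    # Hypothetical input spec for the scaffold workflow; field names
    # and defaults are assumptions for illustration.
    model_id: str                       # reference into the LMC registry
    cpu_arch: str                       # e.g. "armv8.2-a"
    accelerators: list = field(default_factory=list)  # GPU/NPU capabilities
    memory_limit_mb: int = 512          # device memory budget
    latency_budget_ms: float = 50.0     # per-inference latency budget
    throughput_rps: float = 20.0        # sustained requests per second
    power_budget_w: float = 5.0         # power envelope
    runtime_profile: str = "onnxruntime"  # or "tflite", "coreml"

spec = TargetSpec(model_id="lmc://mobilenet-v3-small",
                  cpu_arch="armv8.2-a",
                  accelerators=["mali-g78"])
```

Keeping every budget in one object makes it easy to pass the same constraints to both the optimization step and the quality-gate checks.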

Process

  1. Initialize — Set up the scaffold context for Edge Inference Engineer
  2. Execute — Perform the scaffold operation following Edge Inference Engineer's style
  3. Validate — Check output against quality gates
  4. Handoff — Deliver results to downstream personas
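The four steps above can be sketched as a small driver. The function and argument names are placeholders, assuming each persona supplies its own `optimize`, `validate`, and `handoff` implementations.

```python
def run_scaffold(spec, optimize, validate, handoff):
    """Minimal sketch of the Initialize/Execute/Validate/Handoff flow.

    The three callables are placeholders for persona-specific logic;
    this is an illustrative structure, not the tool's actual API.
    """
    context = {"spec": spec, "log": []}      # 1. Initialize the scaffold context
    artifact = optimize(context)             # 2. Execute the scaffold operation
    report = validate(artifact, context)     # 3. Validate against quality gates
    if not report["passed"]:
        raise RuntimeError(f"quality gates failed: {report['failures']}")
    return handoff(artifact, report)         # 4. Handoff to downstream personas
```

Failing fast on gate violations keeps unvalidated artifacts from reaching downstream personas.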

Output

  • Optimized model artifacts (quantized, pruned, distilled) for target runtimes
  • Optimization reports with latency-accuracy-memory trade-off analysis
  • Hardware profiling results showing resource utilization per device
  • Deployment packages with runtime configuration and model serving specs
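A trade-off report from the list above might look like the following. The keys and the numbers are purely illustrative, not real benchmark results.

```python
# Hypothetical shape of one optimization-report entry; all values are
# made-up examples to show the before/after trade-off structure.
report = {
    "model": "mobilenet-v3-small",
    "technique": "int8 dynamic quantization",
    "before": {"latency_ms": 92.4, "top1_acc": 0.752, "size_mb": 9.8},
    "after":  {"latency_ms": 41.7, "top1_acc": 0.744, "size_mb": 2.6},
}
# Derive the deltas so the latency-accuracy-memory trade-off is explicit.
report["deltas"] = {
    k: round(report["after"][k] - report["before"][k], 3)
    for k in report["before"]
}
```

Recording before/after pairs per technique also satisfies the documentation gate below: every optimization decision arrives with its benchmarks attached.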

Quality Gates

  • Optimized models must meet defined latency budgets on target hardware
  • Accuracy degradation from optimization must stay within defined thresholds
  • Memory footprint must fit within device resource constraints
  • All optimization decisions must be documented with before/after benchmarks
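The first three gates can be checked mechanically. This is a sketch; the metric keys and threshold names are assumptions, and real thresholds would come from the target-hardware spec.

```python
def check_gates(metrics, budgets):
    """Evaluate the latency, accuracy, and memory quality gates.

    `metrics` and `budgets` are hypothetical dicts; key names are
    illustrative assumptions, not a defined interface.
    """
    failures = []
    if metrics["latency_ms"] > budgets["latency_ms"]:
        failures.append("latency budget exceeded on target hardware")
    if metrics["baseline_acc"] - metrics["optimized_acc"] > budgets["max_acc_drop"]:
        failures.append("accuracy degradation over threshold")
    if metrics["memory_mb"] > budgets["memory_mb"]:
        failures.append("memory footprint over device limit")
    return {"passed": not failures, "failures": failures}
```

Returning the full failure list, rather than a bare boolean, gives the optimization report something concrete to document for each rejected candidate.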