Edge Inference Engineer — Compare Workflow¶
Description: Evaluate multiple approaches or versions
When to Use¶
Use the compare workflow when you need to evaluate multiple candidates against the same hardware and budget targets: different optimization approaches (e.g. quantization vs. pruning vs. distillation), different runtime backends, or different versions of the same model.
Input Requirements¶
- Source models from Local Model Curator (LMC) registry
- Target hardware specifications (CPU arch, GPU/NPU capabilities, memory limits)
- Latency, throughput, and power budget requirements
- ONNX Runtime, TFLite, and Core ML configuration profiles
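The inputs above can be gathered into a single spec object before a compare run starts. A minimal sketch, assuming an illustrative `TargetSpec` name and field set (this is not a mandated schema):

```python
from dataclasses import dataclass

@dataclass(frozen=True)
class TargetSpec:
    """Target hardware and budget inputs for a compare run (illustrative schema)."""
    cpu_arch: str            # e.g. "armv8.2-a"
    accelerators: tuple      # e.g. ("gpu",) or ("npu",)
    memory_limit_mb: int     # device memory ceiling for the model
    latency_budget_ms: float # latency budget per inference
    power_budget_w: float    # sustained power budget
    runtime_profiles: tuple  # e.g. ("onnxruntime", "tflite", "coreml")

# Example: a phone-class NPU target (all values are illustrative)
spec = TargetSpec(
    cpu_arch="armv8.2-a",
    accelerators=("npu",),
    memory_limit_mb=512,
    latency_budget_ms=30.0,
    power_budget_w=2.5,
    runtime_profiles=("tflite",),
)
```

Freezing the dataclass keeps the spec immutable for the duration of the run, so every candidate is measured against identical targets.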
Process¶
- Initialize — Load the candidate models from the LMC registry together with the target hardware and budget specifications
- Execute — Run every candidate through the same optimization and benchmarking pipeline on the target hardware
- Validate — Check each candidate's latency, accuracy, and memory results against the quality gates below
- Handoff — Deliver the selected artifacts and the comparison report to downstream personas
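The execute and validate steps reduce to benchmarking each candidate under identical conditions and ranking the results. A sketch of that loop, assuming candidates are exposed as zero-arg inference callables (a real run would wrap e.g. an ONNX Runtime or TFLite invocation; the workloads below are stand-ins):

```python
import time
import statistics

def benchmark(run_fn, warmup=3, iters=20):
    """Return the median latency in ms of a zero-arg inference callable."""
    for _ in range(warmup):   # warm caches / JIT before measuring
        run_fn()
    samples = []
    for _ in range(iters):
        t0 = time.perf_counter()
        run_fn()
        samples.append((time.perf_counter() - t0) * 1000.0)
    return statistics.median(samples)

def compare(candidates):
    """candidates: dict name -> callable. Returns (name, latency_ms) sorted fastest-first."""
    results = {name: benchmark(fn) for name, fn in candidates.items()}
    return sorted(results.items(), key=lambda kv: kv[1])

# Stand-in workloads with deliberately different costs
ranked = compare({
    "fp32": lambda: sum(i * i for i in range(50_000)),
    "int8": lambda: sum(i * i for i in range(10_000)),
})
```

Using the median rather than the mean keeps one-off scheduler hiccups from skewing the ranking.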
Output¶
- Optimized model artifacts (quantized, pruned, distilled) for target runtimes
- Optimization reports with latency-accuracy-memory trade-off analysis
- Hardware profiling results showing resource utilization per device
- Deployment packages with runtime configuration and model serving specs
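One entry in the trade-off report pairs before/after metrics for a candidate. A minimal sketch; the field names (`latency_ms`, `accuracy`, `memory_mb`) and the example values are illustrative, not a fixed report schema:

```python
def tradeoff_entry(name, before, after):
    """Summarize the latency-accuracy-memory deltas between a baseline
    and an optimized variant. before/after: dicts with keys
    'latency_ms', 'accuracy', 'memory_mb'."""
    return {
        "candidate": name,
        "latency_ms": {
            "before": before["latency_ms"],
            "after": after["latency_ms"],
            "speedup": before["latency_ms"] / after["latency_ms"],
        },
        "accuracy_drop": before["accuracy"] - after["accuracy"],
        "memory_mb": {"before": before["memory_mb"], "after": after["memory_mb"]},
    }

# Illustrative numbers for a hypothetical int8-quantized candidate
entry = tradeoff_entry(
    "mobilenet-int8",
    before={"latency_ms": 48.0, "accuracy": 0.912, "memory_mb": 96.0},
    after={"latency_ms": 16.0, "accuracy": 0.903, "memory_mb": 26.0},
)
```

Keeping both raw values and derived deltas in each entry means the report can be re-filtered later against different budgets without re-running benchmarks.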
Quality Gates¶
- Optimized models must meet defined latency budgets on target hardware
- Accuracy degradation from optimization must stay within defined thresholds
- Memory footprint must fit within device resource constraints
- All optimization decisions must be documented with before/after benchmarks
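The three numeric gates above can be checked mechanically per candidate (the documentation gate is procedural and is not modeled here). A sketch, assuming illustrative metric and budget key names rather than a mandated interface:

```python
def check_gates(metrics, budgets):
    """Return (gate_name, passed) pairs for one optimized candidate.
    metrics: measured 'latency_ms', 'accuracy_drop', 'memory_mb' on target hardware.
    budgets: 'latency_budget_ms', 'max_accuracy_drop', 'memory_limit_mb'."""
    return [
        ("latency",  metrics["latency_ms"]    <= budgets["latency_budget_ms"]),
        ("accuracy", metrics["accuracy_drop"] <= budgets["max_accuracy_drop"]),
        ("memory",   metrics["memory_mb"]     <= budgets["memory_limit_mb"]),
    ]

# Illustrative values: candidate fits all three budgets
results = check_gates(
    metrics={"latency_ms": 16.0, "accuracy_drop": 0.009, "memory_mb": 26.0},
    budgets={"latency_budget_ms": 30.0, "max_accuracy_drop": 0.01,
             "memory_limit_mb": 512},
)
all_passed = all(ok for _, ok in results)
```

Returning per-gate results rather than a single boolean makes the report explicit about which budget a failing candidate missed.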