FCC Total Cost of Ownership (TCO)¶
A worked cost model for enterprise FCC deployments. Use it to budget, compare against commercial alternatives, and build an ROI case for your CFO. For security posture see the companion security review.
Cost components¶
FCC's TCO breaks into six components:
| Component | Typical share | Who owns it |
|---|---|---|
| Licensing | 0% | MIT license - no cost |
| Compute (hosting) | 10-40% | Infra / platform team |
| AI provider tokens | 40-80% | Platform + FinOps |
| Observability & storage | 5-15% | SRE |
| Human support & admin | 5-20% | Engineering management |
| Training & onboarding | 2-10% | Learning & Dev |
Exact mix depends on whether you use hosted providers (high token spend) or local models via Ollama/vLLM (high compute, near-zero tokens).
Scaling factors¶
Four knobs drive TCO:
- Number of personas in active use
  - Registry load is in-memory and negligible up to ~500 personas.
- Simulations per day
  - Linear driver of token cost.
- Concurrent users
  - Drives the sizing of the backend container / K8s replicas.
- Average tokens per call
  - Workflow size (5-node vs 55-node) multiplies tokens accordingly.
The formula:
daily_token_cost = simulations_per_day
* avg_steps_per_sim
* avg_tokens_per_step
* provider_usd_per_1k_tokens
/ 1000
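The same formula as a runnable sketch (the function and variable names are illustrative, not part of FCC's API; `provider_usd_per_1k_tokens` is a blended input+output rate):

```python
def daily_token_cost(simulations_per_day: int, avg_steps_per_sim: int,
                     avg_tokens_per_step: int,
                     provider_usd_per_1k_tokens: float) -> float:
    """Estimate daily LLM spend in USD from the scaling knobs above."""
    total_tokens = simulations_per_day * avg_steps_per_sim * avg_tokens_per_step
    return total_tokens * provider_usd_per_1k_tokens / 1000

# 200 simulations of 15 steps at 2,500 tokens/step, $0.027/1k blended rate
print(round(daily_token_cost(200, 15, 2500, 0.027), 2))  # -> 202.5
```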
Provider price sheet (as of v1.3.3)¶
| Provider | Input $/1M | Output $/1M | Notes |
|---|---|---|---|
| Anthropic Claude Opus 4.6 | $15 | $75 | Hosted, high quality |
| OpenAI GPT-4 class | $10 | $30 | Hosted |
| Azure OpenAI | $10 | $30 | Hosted, private networking extras |
| Ollama (local) | $0 | $0 | Infra-only cost |
| vLLM (local) | $0 | $0 | Infra-only cost (GPU) |
| LiteLLM (router) | Pass-through | Pass-through | Adds ops surface |
| Mock | $0 | $0 | Dev/test/CI only |
Always benchmark your workload; these are list prices.
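The scenario tables below price input and output tokens separately rather than using one blended rate. A small helper for that per-step figure (a sketch for illustration, not an FCC API):

```python
def usd_per_step(tokens_in: int, tokens_out: int,
                 in_usd_per_1m: float, out_usd_per_1m: float) -> float:
    """Cost of one workflow step, pricing input and output tokens separately."""
    return tokens_in / 1e6 * in_usd_per_1m + tokens_out / 1e6 * out_usd_per_1m

# 2,000 tokens in / 500 out on a $15-in / $75-out model (Claude Opus row above)
print(round(usd_per_step(2000, 500, 15, 75), 4))  # -> 0.0675
```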
Worked scenario A: Small team (10 researchers)¶
Profile
- 10 users, 20 simulations/day each
- Average 15 steps/simulation, 2,000 tokens in + 500 tokens out per step
- Provider mix: 80% mock (CI + iteration), 20% Anthropic Claude Opus 4.6
Monthly cost
| Line | Calculation | $/month |
|---|---|---|
| Compute | 1x backend container on existing dev server | $0 |
| Tokens | 10 * 20 * 15 * 0.2 * 22 workdays * ((2000/1M) * $15 + (500/1M) * $75) | ~$891 |
| Storage | Event logs 5 GB/mo | ~$1 |
| Observability | Console/JSON exporter | $0 |
| Total | | ~$892 |
ROI: at a conservative $20/hour loaded labor rate, FCC pays back if it saves ~45 hours/month across the team (roughly 4.5 hours per user).
Worked scenario B: Mid-size org (100 users)¶
Profile
- 100 users, 50 simulations/day each
- 20 steps/sim, 3,000 in + 800 out per step
- Provider mix: 50% Anthropic, 30% OpenAI, 20% mock
- Deployment: Kubernetes, 3 replicas backend + 2 replicas streamlit
Monthly cost
| Line | Calculation | $/month |
|---|---|---|
| Compute (K8s, AWS) | 5 pods @ 1 vCPU / 2 GB + LB | ~$380 |
| Token spend (Anthropic, 50%) | 100 * 50 * 20 * 22 * 0.5 * ((3000/1M) * $15 + (800/1M) * $75) | ~$5,280 |
| Token spend (OpenAI, 30%) | 100 * 50 * 20 * 22 * 0.3 * ((3000/1M) * $10 + (800/1M) * $30) | ~$1,075 |
| Storage (S3, 200 GB) | | ~$6 |
| Observability (managed OTel) | | ~$120 |
| Total | | ~$6,860 |
Cost optimization suggestions:
- Move 50% of exploration-phase traffic to Ollama (70B model on a shared GPU node) - projected savings ~$2,600/mo.
- Enable scenario caching - deduplicates identical runs; typical hit rate 20-30%.
Worked scenario C: Enterprise (1,000+ users)¶
Profile
- 1,500 users, 30 simulations/day each
- 25 steps/sim, 4,000 in + 1,200 out per step
- Provider mix: 20% Claude Opus, 10% OpenAI, 70% local vLLM (8x H100)
- Deployment: Kubernetes HA, 10 backend replicas, vLLM cluster, multi-region
Monthly cost
| Line | Calculation | $/month |
|---|---|---|
| Compute - FCC pods | 20 pods x 2 vCPU / 4 GB | ~$1,900 |
| Compute - vLLM (8x H100, AWS) | 730 hrs x $32/hr | ~$23,400 |
| Token spend (Claude, 20%) | 1500 * 30 * 25 * 22 * 0.2 * ((4000/1M) * $15 + (1200/1M) * $75) | ~$37,100 |
| Token spend (OpenAI, 10%) | 1500 * 30 * 25 * 22 * 0.1 * ((4000/1M) * $10 + (1200/1M) * $30) | ~$7,920 |
| Storage (S3 + long-term) | 10 TB | ~$230 |
| Observability (managed) | | ~$1,200 |
| Support staff (2 FTE platform) | Amortized loaded cost | ~$35,000 |
| Total | | ~$106,750 |
Annualized: ~$1.28M for 1,500 daily-active users = ~$71/user/month.
Cost optimization suggestions:
- Shift to on-prem vLLM: swaps opex for capex; typical 12-18 month payback.
- Aggressive scenario caching + embedding cache: 10-20% token reduction.
- Tier workloads: use cheaper models for FIND phase, premium for CRITIQUE.
Cost optimization strategies¶
| Strategy | Typical savings | Effort |
|---|---|---|
| Mock mode in CI | 100% of CI token cost | Low (default) |
| Ollama/vLLM for dev + exploration | 60-90% tokens | Medium |
| Scenario deduplication cache | 15-30% | Medium |
| Embedding + retrieval cache | 10-20% | Medium |
| Workflow pruning (smaller graph when tolerable) | 20-50% | Low |
| Temperature=0 + seed-based reuse | 5-15% | Low |
| LiteLLM routing to cheapest-capable model | 10-30% | Medium |
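The deduplication-cache row can be sketched as a content-addressed memo over the scenario definition. The key derivation and in-process store below are a generic pattern, not FCC's actual cache interface:

```python
import hashlib
import json
from typing import Callable

_cache: dict[str, str] = {}

def scenario_key(config: dict) -> str:
    """Stable hash of a scenario config; identical runs map to one key."""
    canonical = json.dumps(config, sort_keys=True)
    return hashlib.sha256(canonical.encode()).hexdigest()

def run_cached(config: dict, run_fn: Callable[[dict], str]) -> str:
    """Return the cached result for an identical config, else run and store."""
    key = scenario_key(config)
    if key not in _cache:
        _cache[key] = run_fn(config)  # only cache misses spend tokens
    return _cache[key]
```

Because the key is computed over a canonically serialized config, two runs that differ only in key ordering share one entry; at the 15-30% hit rates cited above, that share of token spend disappears.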
Comparison vs commercial agent frameworks¶
| Framework | Annual license | TCO for scenario B | Notes |
|---|---|---|---|
| FCC (open source MIT) | $0 | ~$82k/yr | Self-hosted, full control |
| Commercial Framework X | $80-200k | ~$160-300k/yr | Platform fee + per-seat |
| Commercial Framework Y | $120-400k | ~$220-500k/yr | Usage-based on top of platform fee |
FCC's "cost" is the engineering hours to run it. For organizations already running modern Python stacks on Kubernetes, that marginal cost is small.
ROI template¶
Estimate ROI over one year:
value_delivered = hours_saved_per_user_month
* 12
* avg_loaded_hourly_rate
* num_users
* adoption_rate
total_cost = annual_fcc_tco
roi = (value_delivered - total_cost) / total_cost
Example (Scenario B, 100 users, 3 hrs saved/user/mo, $80/hr loaded, 70% adoption):
value_delivered = 3 * 12 * 80 * 100 * 0.7 = $201,600
total_cost = 6,860 * 12 = $82,320
roi = (201600 - 82320) / 82320 = 145%
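The template as executable Python, reproducing the Scenario B example (function and parameter names are illustrative):

```python
def annual_roi(hours_saved_per_user_month: float, loaded_hourly_rate: float,
               num_users: int, adoption_rate: float, monthly_tco: float) -> float:
    """One-year ROI as a fraction (1.45 == 145%)."""
    value_delivered = (hours_saved_per_user_month * 12
                       * loaded_hourly_rate * num_users * adoption_rate)
    total_cost = monthly_tco * 12
    return (value_delivered - total_cost) / total_cost

# Scenario B: 3 hrs saved/user/mo, $80/hr loaded, 100 users, 70% adoption, $6,860/mo TCO
print(f"{annual_roi(3, 80, 100, 0.7, 6860):.0%}")  # -> 145%
```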
A sub-1-year payback is realistic when FCC replaces ad-hoc prompt engineering with structured persona teams.
FinOps checklist¶
- Per-team tags on Kubernetes namespaces for chargeback
- Scenario-level token accounting via the event bus (`LLM_CALL` events)
- Monthly cost report pipeline (events -> data warehouse -> dashboard)
- Budget alerts per provider
- Model-tier policy (which personas may use premium models)
- Scheduled caching audit (hit rate, stale entries)
- Annual TCO review + provider mix re-evaluation
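The scenario-level token accounting item amounts to a fold over event-bus records. The event fields and price map below are assumptions for illustration, not FCC's actual `LLM_CALL` schema:

```python
from collections import defaultdict

# provider -> (input $/1M tokens, output $/1M tokens), per the price sheet above
PRICES: dict[str, tuple[float, float]] = {
    "anthropic": (15.0, 75.0),
    "openai": (10.0, 30.0),
    "ollama": (0.0, 0.0),
}

def cost_by_team(events: list[dict]) -> dict[str, float]:
    """Aggregate USD spend per team from LLM_CALL-style event records."""
    totals: dict[str, float] = defaultdict(float)
    for e in events:
        if e.get("type") != "LLM_CALL":
            continue  # ignore non-billing events on the bus
        in_price, out_price = PRICES[e["provider"]]
        totals[e["team"]] += (e["tokens_in"] / 1e6 * in_price
                              + e["tokens_out"] / 1e6 * out_price)
    return dict(totals)
```

The resulting per-team totals feed the chargeback tags and budget alerts listed above.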
Related resources¶
- Security review -- risk posture that complements this cost model
- Enterprise deployment -- deployment topologies referenced above
- Quickstart: enterprise -- zero-to-running checklist
- AI compliance -- regulatory costs (EU AI Act, NIST RMF)
- `charts/fcc/` -- Helm chart used in scenarios B and C
- `docker/Dockerfile.backend` -- base image used for pod sizing above