LiteLLM (universal router)

LiteLLM is a single Python SDK that proxies to 100+ AI backends via the OpenAI Chat Completions API shape. The FCC LiteLLM plugin registers it as a first-class provider so you can switch backends by changing one environment variable.
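
Because every backend is exposed through the same Chat Completions shape, a raw LiteLLM call looks identical no matter what it targets. A minimal sketch using the litellm SDK directly (the model string is just an example; the FCC plugin issues the same kind of call for you):

import litellm

# One request shape for every backend; only the model string changes.
response = litellm.completion(
    model="ollama/llama3.2",  # or "anthropic/claude-3-5-sonnet-20241022", etc.
    messages=[{"role": "user", "content": "Hello"}],
)
print(response.choices[0].message.content)  # OpenAI-style response object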

Why use LiteLLM

Backend               Available via LiteLLM model string
Anthropic Claude      anthropic/claude-3-5-sonnet-20241022
OpenAI GPT            openai/gpt-4o
Azure OpenAI          azure/your-deployment-name
AWS Bedrock           bedrock/anthropic.claude-3-sonnet-20240229-v1:0
Google Vertex AI      vertex_ai/gemini-1.5-pro
Google Gemini API     gemini/gemini-1.5-pro
Cohere                cohere/command-r-plus
Mistral API           mistral/mistral-large-latest
Hugging Face          huggingface/meta-llama/Meta-Llama-3-8B-Instruct
Ollama (local)        ollama/llama3.2
vLLM (self-hosted)    litellm_proxy/your-model
Together AI           together_ai/Qwen/Qwen2.5-72B-Instruct-Turbo
Groq                  groq/llama-3.1-70b-versatile
... 90 more           see LiteLLM provider list

If you want one consistent FCC config that can target any of these without code changes, LiteLLM is the right abstraction.

Install

pip install -e ./plugins/fcc-litellm-plugin

(or run make install-dev, which installs both the plugin and litellm itself)

5-minute walkthrough

1. Pick a backend

For local development with Ollama already installed:

export LITELLM_DEFAULT_MODEL=ollama/llama3.2
export FCC_DEFAULT_PROVIDER=litellm

For production with hosted Anthropic:

export ANTHROPIC_API_KEY=sk-ant-...
export LITELLM_DEFAULT_MODEL=anthropic/claude-3-5-sonnet-20241022
export FCC_DEFAULT_PROVIDER=litellm

For AWS Bedrock:

export AWS_REGION=us-east-1
export AWS_ACCESS_KEY_ID=...
export AWS_SECRET_ACCESS_KEY=...
export LITELLM_DEFAULT_MODEL=bedrock/anthropic.claude-3-sonnet-20240229-v1:0
export FCC_DEFAULT_PROVIDER=litellm

LiteLLM picks up backend credentials from each backend's conventional env vars. See the LiteLLM provider docs for the full list.

2. Run a scenario

fcc scenarios run --scenario basic_routing

The simulation will route every AI call through LiteLLM, which forwards it to whichever backend the model string names.
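
Conceptually, the routing decision lives entirely in that model string. The following sketch is an illustration of the flow, not the plugin's actual source: it reads LITELLM_DEFAULT_MODEL and hands the request to litellm, which dispatches to the named backend.

import os
import litellm

# Illustration only: whichever backend LITELLM_DEFAULT_MODEL names receives the call.
model = os.environ["LITELLM_DEFAULT_MODEL"]   # e.g. "ollama/llama3.2"
response = litellm.completion(
    model=model,
    messages=[{"role": "user", "content": "Hello"}],
)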

3. Switch backends without code changes

# Local for development
export LITELLM_DEFAULT_MODEL=ollama/llama3.2

# Cloud for production
export LITELLM_DEFAULT_MODEL=anthropic/claude-3-5-sonnet-20241022

The same fcc scenarios run command works against both.

Programmatic use

from fcc.simulation.ai_client import AIClient
from fcc.plugins.registry import PluginRegistry

registry = PluginRegistry()
registry.discover()  # Finds installed FCC plugins, including fcc-litellm-plugin

client = AIClient(provider="litellm", plugin_registry=registry)
response = client.complete(
    messages=[{"role": "user", "content": "Hello"}],
    model="anthropic/claude-3-5-sonnet-20241022",   # Switch backends per call
    temperature=0.5,
    max_tokens=200,
)
print(response.content)
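
Because the backend is selected per call via the model argument, the same client can target several backends in one run. For example, continuing from the snippet above (model strings taken from the table earlier):

# Same client, different backend for each call
local = client.complete(
    messages=[{"role": "user", "content": "Hello"}],
    model="ollama/llama3.2",
    temperature=0.5,
    max_tokens=200,
)
hosted = client.complete(
    messages=[{"role": "user", "content": "Hello"}],
    model="anthropic/claude-3-5-sonnet-20241022",
    temperature=0.5,
    max_tokens=200,
)
print(local.content)
print(hosted.content)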

Per-scenario backend pinning

Scenarios can override the default backend without touching env vars:

{
  "id": "BENCH-001",
  "name": "Cross-backend benchmark",
  "type": "ai",
  "description": "Compare Claude vs Llama on the same workflow",
  "objectives": ["Measure quality difference"],
  "setup": {
    "initial_input": "Design a REST API for...",
    "start_node": "RC",
    "personas_involved": ["RC", "BC", "DE"],
    "ai_config": {
      "provider": "litellm",
      "model": "bedrock/anthropic.claude-3-sonnet-20240229-v1:0",
      "temperature": 0.3,
      "max_tokens": 2048
    }
  },
  "validation_rules": []
}

Auto-detection rule

The LiteLLM plugin opts in when LITELLM_DEFAULT_MODEL is explicitly set. Just having litellm installed in your virtualenv is not enough — without the env var, FCC stays on its current default provider.
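
Expressed as code, the rule is roughly the following (an illustrative restatement, not the plugin's actual implementation):

import os

def resolve_provider(current_default: str) -> str:
    # LiteLLM opts in only when LITELLM_DEFAULT_MODEL is explicitly set;
    # merely having litellm importable is not enough.
    if os.environ.get("LITELLM_DEFAULT_MODEL"):
        return "litellm"
    return current_default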

Cost and rate-limit handling

LiteLLM has built-in cost tracking, fallbacks, and rate-limit handling that FCC does not currently surface in its result model. To use those features, configure them via LiteLLM's own config file (~/.litellm/config.yaml) — FCC's plugin will pick them up automatically because it calls litellm.completion(...) directly.

See LiteLLM cost tracking and LiteLLM fallbacks.
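
If you want per-call cost numbers in the meantime, one option is to compute them yourself from a raw LiteLLM response. A sketch using litellm's completion_cost helper (this sits outside FCC's result model and assumes you are calling litellm directly):

import litellm

response = litellm.completion(
    model="anthropic/claude-3-5-sonnet-20241022",
    messages=[{"role": "user", "content": "Hello"}],
)

# completion_cost() estimates the USD cost from the model and the token usage
# recorded on the response.
cost_usd = litellm.completion_cost(completion_response=response)
print(f"Estimated cost: ${cost_usd:.6f}")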

See also