LiteLLM (universal router)

LiteLLM is a single Python SDK that proxies to 100+ AI backends via the OpenAI Chat Completions API shape. The FCC LiteLLM plugin registers it as a first-class provider so you can switch backends by changing one environment variable.
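
Because every backend is exposed through the same Chat Completions shape, a raw LiteLLM call looks identical no matter what it targets. A minimal sketch using the litellm SDK directly (the model string is just an example; the FCC plugin issues the same kind of call for you):

import litellm

# One request shape for every backend; only the model string changes.
response = litellm.completion(
    model="ollama/llama3.2",  # or "anthropic/claude-3-5-sonnet-20241022", etc.
    messages=[{"role": "user", "content": "Hello"}],
)
print(response.choices[0].message.content)  # OpenAI-style response object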

Why use LiteLLM

Backend               Available via LiteLLM model string
Anthropic Claude      anthropic/claude-3-5-sonnet-20241022
OpenAI GPT            openai/gpt-4o
Azure OpenAI          azure/your-deployment-name
AWS Bedrock           bedrock/anthropic.claude-3-sonnet-20240229-v1:0
Google Vertex AI      vertex_ai/gemini-1.5-pro
Google Gemini API     gemini/gemini-1.5-pro
Cohere                cohere/command-r-plus
Mistral API           mistral/mistral-large-latest
Hugging Face          huggingface/meta-llama/Meta-Llama-3-8B-Instruct
Ollama (local)        ollama/llama3.2
vLLM (self-hosted)    litellm_proxy/your-model
Together AI           together_ai/Qwen/Qwen2.5-72B-Instruct-Turbo
Groq                  groq/llama-3.1-70b-versatile
... 90 more           see LiteLLM provider list

If you want one consistent FCC config that can target any of these without code changes, LiteLLM is the right abstraction.

Install

pip install -e ./plugins/fcc-litellm-plugin

(or run make install-dev, which installs both the plugin and litellm itself)

5-minute walkthrough

1. Pick a backend

For local development with Ollama already installed:

export LITELLM_DEFAULT_MODEL=ollama/llama3.2
export FCC_DEFAULT_PROVIDER=litellm

For production with hosted Anthropic:

export ANTHROPIC_API_KEY=sk-ant-...
export LITELLM_DEFAULT_MODEL=anthropic/claude-3-5-sonnet-20241022
export FCC_DEFAULT_PROVIDER=litellm

For AWS Bedrock:

export AWS_REGION=us-east-1
export AWS_ACCESS_KEY_ID=...
export AWS_SECRET_ACCESS_KEY=...
export LITELLM_DEFAULT_MODEL=bedrock/anthropic.claude-3-sonnet-20240229-v1:0
export FCC_DEFAULT_PROVIDER=litellm

LiteLLM picks up backend credentials from each backend's conventional env vars. See the LiteLLM provider docs for the full list.

2. Run a scenario

fcc scenarios run --scenario basic_routing

The simulation will route every AI call through LiteLLM, which forwards it to whichever backend the model string names.
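
Conceptually, the routing decision lives entirely in that model string. The following sketch is an illustration of the flow, not the plugin's actual source: it reads LITELLM_DEFAULT_MODEL and hands the request to litellm, which dispatches to the named backend.

import os
import litellm

# Illustration only: whichever backend LITELLM_DEFAULT_MODEL names receives the call.
model = os.environ["LITELLM_DEFAULT_MODEL"]   # e.g. "ollama/llama3.2"
response = litellm.completion(
    model=model,
    messages=[{"role": "user", "content": "Hello"}],
)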

3. Switch backends without code changes

# Local for development
export LITELLM_DEFAULT_MODEL=ollama/llama3.2

# Cloud for production
export LITELLM_DEFAULT_MODEL=anthropic/claude-3-5-sonnet-20241022

The same fcc scenarios run command works against both.

Programmatic use

from fcc.simulation.ai_client import AIClient
from fcc.plugins.registry import PluginRegistry

registry = PluginRegistry()
registry.discover()  # Finds installed FCC plugins, including fcc-litellm-plugin

client = AIClient(provider="litellm", plugin_registry=registry)
response = client.complete(
    messages=[{"role": "user", "content": "Hello"}],
    model="anthropic/claude-3-5-sonnet-20241022",   # Switch backends per call
    temperature=0.5,
    max_tokens=200,
)
print(response.content)
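
Because the backend is selected per call via the model argument, the same client can target several backends in one run. For example, continuing from the snippet above (model strings taken from the table earlier):

# Same client, different backend for each call
local = client.complete(
    messages=[{"role": "user", "content": "Hello"}],
    model="ollama/llama3.2",
    temperature=0.5,
    max_tokens=200,
)
hosted = client.complete(
    messages=[{"role": "user", "content": "Hello"}],
    model="anthropic/claude-3-5-sonnet-20241022",
    temperature=0.5,
    max_tokens=200,
)
print(local.content)
print(hosted.content)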

Per-scenario backend pinning

Scenarios can override the default backend without touching env vars:

{
  "id": "BENCH-001",
  "name": "Cross-backend benchmark",
  "type": "ai",
  "description": "Compare Claude vs Llama on the same workflow",
  "objectives": ["Measure quality difference"],
  "setup": {
    "initial_input": "Design a REST API for...",
    "start_node": "RC",
    "personas_involved": ["RC", "BC", "DE"],
    "ai_config": {
      "provider": "litellm",
      "model": "bedrock/anthropic.claude-3-sonnet-20240229-v1:0",
      "temperature": 0.3,
      "max_tokens": 2048
    }
  },
  "validation_rules": []
}

Auto-detection rule

The LiteLLM plugin opts in when LITELLM_DEFAULT_MODEL is explicitly set. Just having litellm installed in your virtualenv is not enough — without the env var, FCC stays on its current default provider.
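
Expressed as code, the rule is roughly the following (an illustrative restatement, not the plugin's actual implementation):

import os

def resolve_provider(current_default: str) -> str:
    # LiteLLM opts in only when LITELLM_DEFAULT_MODEL is explicitly set;
    # merely having litellm importable is not enough.
    if os.environ.get("LITELLM_DEFAULT_MODEL"):
        return "litellm"
    return current_default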

Cost and rate-limit handling

LiteLLM has built-in cost tracking, fallbacks, and rate-limit handling that FCC does not currently surface in its result model. To use those features, configure them via LiteLLM's own config file (~/.litellm/config.yaml) — FCC's plugin will pick them up automatically because it calls litellm.completion(...) directly.

See LiteLLM cost tracking and LiteLLM fallbacks.
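
If you want per-call cost numbers in the meantime, one option is to compute them yourself from a raw LiteLLM response. A sketch using litellm's completion_cost helper (this sits outside FCC's result model and assumes you are calling litellm directly):

import litellm

response = litellm.completion(
    model="anthropic/claude-3-5-sonnet-20241022",
    messages=[{"role": "user", "content": "Hello"}],
)

# completion_cost() estimates the USD cost from the model and the token usage
# recorded on the response.
cost_usd = litellm.completion_cost(completion_response=response)
print(f"Estimated cost: ${cost_usd:.6f}")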

See also