# LiteLLM (universal router)
LiteLLM is a single Python SDK that proxies to 100+ AI backends via the OpenAI Chat Completions API shape. The FCC LiteLLM plugin registers it as a first-class provider so you can switch backends by changing one environment variable.
## Why use LiteLLM
| Backend | Available via LiteLLM model string |
|---|---|
| Anthropic Claude | anthropic/claude-3-5-sonnet-20241022 |
| OpenAI GPT | openai/gpt-4o |
| Azure OpenAI | azure/your-deployment-name |
| AWS Bedrock | bedrock/anthropic.claude-3-sonnet-20240229-v1:0 |
| Google Vertex AI | vertex_ai/gemini-1.5-pro |
| Google Gemini API | gemini/gemini-1.5-pro |
| Cohere | cohere/command-r-plus |
| Mistral API | mistral/mistral-large-latest |
| Hugging Face | huggingface/meta-llama/Meta-Llama-3-8B-Instruct |
| Ollama (local) | ollama/llama3.2 |
| vLLM (self-hosted) | litellm_proxy/your-model |
| Together AI | together_ai/Qwen/Qwen2.5-72B-Instruct-Turbo |
| Groq | groq/llama-3.1-70b-versatile |
| ... 90 more | see LiteLLM provider list |
If you want one consistent FCC config that can target any of these without code changes, LiteLLM is the right abstraction.
## Install
Install the plugin package into the same environment as FCC (or run `make install-dev` from the repo, which installs both the plugin and `litellm` itself).
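A pip install would look something like this; the distribution name `fcc-plugin-litellm` is a placeholder, not necessarily the published name:

```bash
# Placeholder package name -- substitute the FCC LiteLLM plugin's
# actual distribution name.
pip install fcc-plugin-litellm
```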
## 5-minute walkthrough
### 1. Pick a backend
For local development with Ollama already installed:
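```bash
export LITELLM_DEFAULT_MODEL=ollama/llama3.2
export FCC_DEFAULT_PROVIDER=litellm
```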
For production with hosted Anthropic:
```bash
export ANTHROPIC_API_KEY=sk-ant-...
export LITELLM_DEFAULT_MODEL=anthropic/claude-3-5-sonnet-20241022
export FCC_DEFAULT_PROVIDER=litellm
```
For AWS Bedrock:
```bash
export AWS_REGION=us-east-1
export AWS_ACCESS_KEY_ID=...
export AWS_SECRET_ACCESS_KEY=...
export LITELLM_DEFAULT_MODEL=bedrock/anthropic.claude-3-sonnet-20240229-v1:0
export FCC_DEFAULT_PROVIDER=litellm
```
LiteLLM picks up backend credentials from each backend's conventional env vars. See the LiteLLM provider docs for the full list.
### 2. Run a scenario
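Point `fcc scenarios run` at a scenario file; the path below is illustrative:

```bash
# Scenario path is illustrative -- substitute one of your own files.
fcc scenarios run scenarios/BENCH-001.json
```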
The simulation will route every AI call through LiteLLM, which forwards it to whichever backend the model string names.
### 3. Switch backends without code changes
```bash
# Local for development
export LITELLM_DEFAULT_MODEL=ollama/llama3.2

# Cloud for production
export LITELLM_DEFAULT_MODEL=anthropic/claude-3-5-sonnet-20241022
```
The same `fcc scenarios run` command works against both.
## Programmatic use
```python
from fcc.simulation.ai_client import AIClient
from fcc.plugins.registry import PluginRegistry

registry = PluginRegistry()
registry.discover()

client = AIClient(provider="litellm", plugin_registry=registry)

response = client.complete(
    messages=[{"role": "user", "content": "Hello"}],
    model="anthropic/claude-3-5-sonnet-20241022",  # Switch backends per call
    temperature=0.5,
    max_tokens=200,
)
print(response.content)
```
## Per-scenario backend pinning
Scenarios can override the default backend without touching env vars:
```json
{
  "id": "BENCH-001",
  "name": "Cross-backend benchmark",
  "type": "ai",
  "description": "Compare Claude vs Llama on the same workflow",
  "objectives": ["Measure quality difference"],
  "setup": {
    "initial_input": "Design a REST API for...",
    "start_node": "RC",
    "personas_involved": ["RC", "BC", "DE"],
    "ai_config": {
      "provider": "litellm",
      "model": "bedrock/anthropic.claude-3-sonnet-20240229-v1:0",
      "temperature": 0.3,
      "max_tokens": 2048
    }
  },
  "validation_rules": []
}
```
## Auto-detection rule

The LiteLLM plugin opts in only when `LITELLM_DEFAULT_MODEL` is explicitly set. Merely having `litellm` installed in your virtualenv is not enough: without the env var, FCC stays on its current default provider.
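In code terms, the rule amounts to something like this sketch (illustrative, not the plugin's actual source):

```python
import os

def litellm_opts_in() -> bool:
    # Importability of the litellm package is deliberately not the test;
    # the user must name a default model explicitly.
    return bool(os.environ.get("LITELLM_DEFAULT_MODEL"))
```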
## Cost and rate-limit handling

LiteLLM has built-in cost tracking, fallbacks, and rate-limit handling that FCC does not currently surface in its result model. To use those features, configure them via LiteLLM's own config file (`~/.litellm/config.yaml`); FCC's plugin picks them up automatically because it calls `litellm.completion(...)` directly. See LiteLLM cost tracking and LiteLLM fallbacks.
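As a rough illustration, a fallback configuration in that file might look like the following; the schema is an assumption based on LiteLLM's proxy config docs, so verify it against the version you run:

```yaml
# ~/.litellm/config.yaml -- schema is an assumption from LiteLLM's
# proxy config docs; check the upstream reference for your version.
model_list:
  - model_name: default
    litellm_params:
      model: anthropic/claude-3-5-sonnet-20241022
litellm_settings:
  # Try a local model if the primary backend fails.
  fallbacks:
    - default: ["ollama/llama3.2"]
```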
## See also
- Ollama — for the simpler "just run a local model" path
- Provider matrix — auto-detection rules and built-in providers
- LiteLLM upstream docs — full reference