Reproducible research environment with Docker, Jupyter, and local Ollama¶
Audience: Researchers and academics who need reproducible AI experiment environments without depending on hosted API availability or incurring per-token costs.
Why this matters¶
Hosted LLM APIs (OpenAI, Anthropic, etc.) are not reproducible by construction:
- Model versions change underneath you (gpt-4o in March is not gpt-4o in October)
- API endpoints have rate limits and outages
- Per-token costs make large-N experiments expensive
- IRB / ethics review may forbid sending research data to third parties
A local Ollama + Docker stack solves all four problems: pin a specific model digest and run everything offline.
What this tutorial gives you¶
By the end you will have:
- A reproducible Docker stack with FCC + JupyterLab + a pinned local LLM
- A notebook that runs an FCC simulation against the local model
- A scenario JSON file with ai_config pinning the exact model tag
- A pattern for citing the model version in papers
Prerequisites¶
- Docker 24+
- ~10 GB disk for the Docker images and the LLM model weights
- (Optional) NVIDIA GPU with CUDA — Ollama uses it automatically
Step 1: Pin a model¶
Pull the model once, then record its digest: the digest pins the exact model weights, so the same notebook produces identical outputs months later. (You can find each model's digest on ollama.com, or read it locally as sketched below.)
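A minimal sketch, not part of FCC, for reading the digest locally after an `ollama pull llama3.1:8b`; it assumes Ollama is serving on its default port, 11434:

# A sketch: list installed models with their digests via Ollama's REST API.
# Assumes Ollama is running locally on its default port (11434).
import requests

tags = requests.get("http://localhost:11434/api/tags", timeout=10).json()
for model in tags["models"]:
    print(model["name"], model["digest"])

Record the digest next to the tag (llama3.1:8b@sha256:...) in your notes; the tag alone can silently move to new weights when the registry updates.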
Step 2: Configure FCC to use the pinned model¶
Create .env at the repo root:
FCC_DEFAULT_PROVIDER=ollama
OLLAMA_BASE_URL=http://host.docker.internal:11434/v1
OLLAMA_DEFAULT_MODEL=llama3.1:8b@sha256:xxxxxxxxxxxxxxxx
FCC_LOG_LEVEL=INFO
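Before running experiments, it is worth a quick smoke test that the notebook environment can actually reach Ollama at this base URL. A sketch, assuming your compose setup forwards .env into the container (host.docker.internal resolves out of the box on Docker Desktop; on Linux you may need to map it explicitly, e.g. with extra_hosts):

# A sketch: verify the OpenAI-compatible endpoint from .env is reachable.
import os
import requests

base_url = os.environ.get("OLLAMA_BASE_URL", "http://host.docker.internal:11434/v1")
resp = requests.get(f"{base_url}/models", timeout=10)
resp.raise_for_status()
print([m["id"] for m in resp.json()["data"]])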
Step 3: Build and start the stack¶
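The exact compose file and service names depend on your checkout; assuming a docker-compose.yml at the repo root, building and starting from the host is equivalent to the sketch below (or just run the same command in a terminal):

# A sketch: build and start the stack from the host.
# Same effect as running `docker compose up -d --build` in a terminal.
import subprocess

subprocess.run(["docker", "compose", "up", "-d", "--build"], check=True)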
Open JupyterLab at http://localhost:8888.
Step 4: Write a reproducible notebook¶
# Cell 1: capture the environment
import json
import subprocess
import sys

import fcc

env = {
    # FCC and Python versions, for the paper's methods section
    "fcc_version": fcc.__version__,
    # The Modelfile header includes the FROM line naming the pinned weights
    "ollama_model": subprocess.run(
        ["ollama", "show", "llama3.1:8b", "--modelfile"],
        capture_output=True, text=True,
    ).stdout[:500],
    "python_version": sys.version,
}
print(json.dumps(env, indent=2))
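The checklist at the end also asks for the Docker image SHA, which Cell 1 does not capture. A sketch that does; the image name fcc-jupyter is a placeholder for whatever your compose file calls the Jupyter image, and the docker CLI must be available where the cell runs (run it on the host if the CLI is not installed inside the container):

# Cell 1b (sketch): record the Docker image SHA for the environment dump.
# "fcc-jupyter" is a placeholder image name; substitute your own.
import subprocess

image_sha = subprocess.run(
    ["docker", "image", "inspect", "fcc-jupyter", "--format", "{{.Id}}"],
    capture_output=True, text=True,
).stdout.strip()
print(image_sha)

Merge the value into the env dict from Cell 1 before dumping it, so everything lands in one record.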
# Cell 2: run an FCC simulation against the pinned model
from fcc._resources import get_workflows_dir
from fcc.simulation.ai_client import AIClient
from fcc.simulation.ai_engine import AISimulationEngine
from fcc.workflow.graph import WorkflowGraph

graph = WorkflowGraph.from_json(str(get_workflows_dir() / "base_sequence.json"))

engine = AISimulationEngine(
    graph=graph,
    ai_client=AIClient(provider="ollama"),
    max_steps=20,
    ai_config={  # Per-experiment override (overrides .env defaults)
        "provider": "ollama",
        "model": "llama3.1:8b",
        "temperature": 0.0,  # Deterministic for reproducibility
        "max_tokens": 1024,
    },
)

result = engine.run(
    start_node="RC",
    initial_payload="Design a longitudinal study of...",
    scenario_id="STUDY-001",
)
# Cell 3: persist the trace for later reanalysis
import json
from pathlib import Path

Path("traces").mkdir(exist_ok=True)
with open("traces/study-001.json", "w") as f:
    json.dump(result.to_dict(), f, indent=2)
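Temperature 0.0 makes decoding greedy, but floating-point nondeterminism on GPUs can still, rarely, perturb outputs, so it is worth checksumming the archived trace; comparing checksums across reruns is also a cheap determinism test. A small sketch:

# Cell 4 (sketch): checksum the trace so reanalysis can verify its bytes.
import hashlib
from pathlib import Path

digest = hashlib.sha256(Path("traces/study-001.json").read_bytes()).hexdigest()
print(f"trace sha256: {digest}")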
Step 5: Pin the scenario in JSON¶
For multi-experiment runs, encode the AI config in the scenario file so it travels with your data:
{
  "id": "STUDY-001",
  "name": "Longitudinal study workflow",
  "type": "ai",
  "description": "...",
  "objectives": ["..."],
  "setup": {
    "initial_input": "...",
    "start_node": "RC",
    "personas_involved": ["RC", "BC", "DE"],
    "ai_config": {
      "provider": "ollama",
      "model": "llama3.1:8b",
      "temperature": 0.0,
      "max_tokens": 1024
    }
  },
  "validation_rules": []
}
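A cheap guard against config drift is to assert, before each run, that the scenario's ai_config agrees with the model pinned in .env. A sketch (the scenario path is hypothetical; point it at wherever you keep the file):

# A sketch: fail loudly if the scenario's model disagrees with the .env pin.
import json
import os
from pathlib import Path

scenario = json.loads(Path("scenarios/study-001.json").read_text())  # hypothetical path
scenario_model = scenario["setup"]["ai_config"]["model"]
pinned = os.environ.get("OLLAMA_DEFAULT_MODEL", "")

# "llama3.1:8b@sha256:..." starts with "llama3.1:8b", so prefix-match the tag
assert pinned.startswith(scenario_model), (
    f"scenario wants {scenario_model!r} but .env pins {pinned!r}"
)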
Step 6: Cite in your paper¶
Recommended citation form:
AI inference performed with FCC v1.1.0 REF1 using locally served Llama 3.1 8B REF2 (digest sha256:xxx...) via the Ollama OpenAI-compatible API. Temperature was fixed at 0.0 for deterministic output. All scenario configurations, including the exact model digest, are included in the supplementary materials as JSON files following the FCC scenario schema REF3.
Reproducibility checklist¶
- Pinned model digest (not just tag)
- Temperature = 0 in ai_config
- Scenario JSON committed alongside data
- Docker image SHA recorded in environment dump
- Trace JSON files saved for re-analysis
- FCC version and Python version captured
- (Optional) Random seed pinned for any sampling done in your analysis code (see the sketch below)
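For that last item, a minimal sketch covering Python's stdlib and NumPy; extend it to whatever libraries your analysis uses:

# A sketch: pin seeds for analysis-side sampling.
import random

import numpy as np

SEED = 0
random.seed(SEED)
rng = np.random.default_rng(SEED)  # pass rng explicitly to sampling code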