# Chapter 8: The Collaboration Engine — Human-in-the-Loop Design
The FCC framework is not a fully autonomous pipeline. It is designed around a core assumption: human judgment remains essential at critical decision points. The Collaboration Engine codifies this assumption into a structured system of sessions, turns, approval gates, scoring, and progress tracking.
This chapter explains the Human-in-the-Loop (HITL) architecture, the frozen dataclass model that underpins it, and how to run a collaboration session from creation through completion.
The state diagram below shows the five states a collaboration session can occupy and the transitions — including approval-gate pauses and aborts — that move it between them.
```mermaid
stateDiagram-v2
    [*] --> CREATED: create_session()
    CREATED --> ACTIVE: start_session()
    ACTIVE --> ACTIVE: add_turn()
    ACTIVE --> PAUSED: gate requires review
    PAUSED --> ACTIVE: human approves
    PAUSED --> ACTIVE: human provides feedback
    ACTIVE --> COMPLETED: complete_session()
    ACTIVE --> ABORTED: abort_session()
    PAUSED --> ABORTED: abort_session()
    COMPLETED --> [*]
    ABORTED --> [*]
```
This tightly bounded state space is what makes session recordings safe to replay and audit.
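The bounded state space can be captured as a simple transition table. The sketch below is illustrative only (the state names mirror the diagram, but this is not the actual FCC implementation):

```python
# Hypothetical sketch: the session state space as a transition table.
# State names follow the diagram above; terminal states allow no moves.
VALID_TRANSITIONS = {
    "CREATED": {"ACTIVE"},
    "ACTIVE": {"ACTIVE", "PAUSED", "COMPLETED", "ABORTED"},
    "PAUSED": {"ACTIVE", "ABORTED"},
    "COMPLETED": set(),  # terminal
    "ABORTED": set(),    # terminal
}

def can_transition(current: str, target: str) -> bool:
    """Return True if the diagram permits moving from current to target."""
    return target in VALID_TRANSITIONS.get(current, set())
```

An engine built on a table like this can reject any transition not drawn in the diagram, which is what makes replays deterministic.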
## What Is Human-in-the-Loop?
Human-in-the-Loop is a design pattern where automated agent outputs pass through human review checkpoints before advancing to downstream stages. In FCC, the HITL model serves three purposes:
- Quality assurance — humans evaluate deliverables against rubrics before they propagate.
- Ethical oversight — hard-stop constitution rules can require human sign-off on sensitive outputs.
- Knowledge transfer — the back-and-forth between agent and human creates an auditable record of reasoning and decisions.
The collaboration subsystem lives in `src/fcc/collaboration/` and comprises six modules: models, engine, scoring, progress, recording, and context.
## The Session Model: 11 Frozen Dataclasses
All collaboration state is represented by frozen (immutable) dataclasses. Immutability makes sessions safe to serialize, replay, and share across threads without defensive copying. The 11 dataclasses are:
| Dataclass | Purpose |
|---|---|
| `SessionStatus` | Enum: `CREATED`, `ACTIVE`, `PAUSED`, `COMPLETED`, `ABORTED` |
| `TurnType` | Enum: `HUMAN`, `AGENT`, `SYSTEM` |
| `ApprovalDecision` | Enum: `APPROVED`, `REJECTED`, `NEEDS_REVISION`, `DEFERRED` |
| `ApprovalGate` | Checkpoint linking a workflow node to a required score |
| `QualityScore` | Score (1-5) with rubric breakdown and justification |
| `AgentCapabilityRating` | Derived rating from approval/quality history |
| `SessionTurn` | One turn: who spoke, what they said, optional score |
| `HandoffProtocol` | Rules for turn-taking (max consecutive, auto-approve threshold) |
| `CollaborationSession` | The complete session snapshot |
| `ProgressState` | Completion tracking for an entity |

*(Enums counted above.)*
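The frozen-dataclass pattern behind this table can be sketched as follows. The field names here are illustrative assumptions, not the real FCC definitions:

```python
from dataclasses import dataclass, asdict

@dataclass(frozen=True)
class QualityScoreSketch:
    """Illustrative stand-in for a frozen FCC dataclass (fields assumed)."""
    deliverable_id: str
    score: float                # 1-5 scale
    justification: str = ""

    def to_dict(self) -> dict:
        # asdict() recursively converts the dataclass to plain dicts.
        return asdict(self)

    @classmethod
    def from_dict(cls, data: dict) -> "QualityScoreSketch":
        return cls(**data)
```

Because the class is frozen, any attempt to assign to a field after construction raises an error, which is what makes instances safe to share across threads.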
Each dataclass exposes a `to_dict()` method and a `from_dict()` class method, making serialization straightforward:
```python
from fcc.collaboration.models import CollaborationSession

session = CollaborationSession(session_id="s-001", workflow_id="base_5")
data = session.to_dict()

restored = CollaborationSession.from_dict(data)
assert restored.session_id == "s-001"
```
## CollaborationEngine Lifecycle
The `CollaborationEngine` manages the mutable state behind the scenes while exposing only frozen `CollaborationSession` snapshots to callers. The lifecycle follows five stages:
- Create — `engine.create_session(workflow_id, participants, gates, protocol)` initializes a `_MutableSession` and returns a frozen snapshot.
- Start — `engine.start_session(session_id)` transitions the session from `CREATED` to `ACTIVE`.
- Turn — `engine.add_turn(session_id, turn_type, actor, content)` appends a `SessionTurn`. Turns alternate between human and agent, governed by the `HandoffProtocol`.
- Gate — `engine.evaluate_gate(session_id, gate_id, score, scorer)` runs the scoring engine at a checkpoint. If the score meets the threshold, the gate passes; otherwise the session may pause for revision.
- Complete — `engine.complete_session(session_id)` finalizes the session.
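The snapshot-out, mutable-state-in pattern the lifecycle relies on can be illustrated with a minimal standalone stand-in (this is a sketch of the shape, not the real engine, and the method signatures are simplified assumptions):

```python
from dataclasses import dataclass, replace

@dataclass(frozen=True)
class SessionSnapshot:
    """Frozen snapshot handed to callers; fields are illustrative."""
    session_id: str
    status: str = "CREATED"
    turns: tuple = ()

class MiniEngine:
    """Toy engine: mutates internal state, exposes only frozen snapshots."""

    def __init__(self):
        self._sessions = {}

    def create_session(self, session_id: str) -> SessionSnapshot:
        snap = SessionSnapshot(session_id)
        self._sessions[session_id] = snap
        return snap

    def start_session(self, session_id: str) -> SessionSnapshot:
        # replace() builds a new frozen snapshot with the updated field.
        snap = replace(self._sessions[session_id], status="ACTIVE")
        self._sessions[session_id] = snap
        return snap

    def add_turn(self, session_id: str, actor: str, content: str) -> SessionSnapshot:
        current = self._sessions[session_id]
        snap = replace(current, turns=current.turns + ((actor, content),))
        self._sessions[session_id] = snap
        return snap

    def complete_session(self, session_id: str) -> SessionSnapshot:
        snap = replace(self._sessions[session_id], status="COMPLETED")
        self._sessions[session_id] = snap
        return snap
```

Each call replaces the stored snapshot wholesale, so a caller holding an old snapshot can never observe a half-applied update.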
The engine optionally accepts an `EventBus` instance, emitting events such as `SESSION_CREATED`, `TURN_ADDED`, `GATE_EVALUATED`, and `SESSION_COMPLETED` for downstream subscribers.
## ScoringEngine and Quality Evaluation
The `ScoringEngine` is the quantitative backbone of collaboration. It provides two methods:

- `score_deliverable()` — records a `QualityScore` (1-5) with optional per-rubric breakdown and justification text.
- `evaluate_at_gate()` — combines scoring with an `ApprovalGate` to produce an `ApprovalDecision`. If the score exceeds `gate.required_score`, the decision is `APPROVED`; otherwise `NEEDS_REVISION`.
The engine also maintains a running `AgentCapabilityRating` per persona, aggregating approval rates and average quality scores into an `effective_rating()` that blends both dimensions.
```python
from fcc.collaboration.scoring import ScoringEngine
from fcc.collaboration.models import ApprovalGate

engine = ScoringEngine()
gate = ApprovalGate(gate_id="g-1", workflow_node_id="n-1", required_score=3.5)

decision, qs = engine.evaluate_at_gate(gate, "doc-001", "reviewer", 4.0)
# decision == ApprovalDecision.APPROVED
```
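The capability-rating blend could look something like the sketch below. Both the 50/50 weighting and the mapping of the approval rate onto the 1-5 scale are assumptions for illustration; the actual formula is internal to FCC:

```python
def effective_rating(approval_rate: float, avg_quality: float) -> float:
    """Blend an approval rate (0-1) and an average quality score (1-5)
    into a single 1-5 rating. Weighting here is an assumed 50/50 split."""
    # Map the 0-1 approval rate onto the 1-5 quality scale first,
    # so both dimensions are averaged in the same units.
    approval_as_score = 1.0 + approval_rate * 4.0
    return (approval_as_score + avg_quality) / 2.0
```

Blending the two dimensions guards against a persona that gets approved often but only barely clears the bar, or one that scores well on rare occasions but is usually rejected.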
## Approval Gates
An `ApprovalGate` binds a workflow node to a quality threshold. Gates are defined at session creation time and referenced by ID during evaluation:

- `gate_id` — unique identifier (e.g., `"gate-find-review"`).
- `workflow_node_id` — the workflow graph node this gate is attached to.
- `required_score` — minimum score (default 3.0) to pass.
- `requires_human` — if `True`, auto-approval via `HandoffProtocol` is disabled for this gate.
- `rubric` — tuple of evaluation criteria strings.
Gates enforce that no deliverable advances without meeting its quality bar. When a gate is attached to a constitution hard-stop node, `requires_human` should always be `True`.
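The gate semantics above can be sketched in standalone form. Note this is not the FCC source: the field defaults follow the bullet list, and routing a `requires_human` gate to `DEFERRED` (held for human sign-off) is an assumed interpretation of how disabling auto-approval plays out:

```python
from dataclasses import dataclass

@dataclass(frozen=True)
class GateSketch:
    """Illustrative gate; fields follow the list above, defaults assumed."""
    gate_id: str
    workflow_node_id: str
    required_score: float = 3.0
    requires_human: bool = False
    rubric: tuple = ()

def evaluate(gate: GateSketch, score: float) -> str:
    """Map a score to a decision string from the ApprovalDecision enum."""
    if score < gate.required_score:
        return "NEEDS_REVISION"
    if gate.requires_human:
        # Auto-approval is disabled: hold for human sign-off (assumed).
        return "DEFERRED"
    return "APPROVED"
```

A hard-stop gate thus never resolves to `APPROVED` on score alone; a human decision is required to finish the evaluation.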
## ProgressTracker
The `ProgressTracker` maintains `ProgressState` objects for sessions, workflows, and other entities. Each state records:

- `total_steps` and `completed_steps`
- A computed `percentage` property
- A `status` field (`pending`, `in_progress`, `completed`)
The tracker is useful for CLI dashboards (see Chapter 7) and for monitoring long-running collaboration sessions.
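A state of this shape is simple enough to sketch directly; the field names follow the list above, and the zero-step guard is an assumption:

```python
from dataclasses import dataclass

@dataclass(frozen=True)
class ProgressStateSketch:
    """Illustrative ProgressState with a computed percentage property."""
    entity_id: str
    total_steps: int
    completed_steps: int = 0
    status: str = "pending"

    @property
    def percentage(self) -> float:
        # Guard against division by zero for entities with no steps yet.
        if self.total_steps == 0:
            return 0.0
        return 100.0 * self.completed_steps / self.total_steps
```

Because the percentage is derived rather than stored, it can never drift out of sync with the step counts.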
## SessionRecorder: Save, Load, and Replay
The `SessionRecorder` persists sessions to JSON and replays them:

- `save(session, path)` — writes the frozen session snapshot to a JSON file.
- `load(path)` — restores a `CollaborationSession` from JSON.
- `replay(session, event_bus)` — re-emits the session's turns as events on an `EventBus`, enabling post-hoc analysis and integration testing.
This makes collaboration sessions fully reproducible. A recorded session from a design sprint can be loaded months later, replayed through subscribers, and analyzed without re-running the workflow.
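The save/load round trip reduces to plain JSON once `to_dict()` has produced a snapshot. A minimal standalone sketch (the dict below stands in for a real snapshot):

```python
import json
from pathlib import Path
from tempfile import TemporaryDirectory

def save(snapshot: dict, path: Path) -> None:
    """Write a session snapshot (as produced by to_dict()) to JSON."""
    path.write_text(json.dumps(snapshot, indent=2))

def load(path: Path) -> dict:
    """Read a snapshot back; pass the result to from_dict() to rebuild."""
    return json.loads(path.read_text())

# Round trip through a temporary directory:
with TemporaryDirectory() as tmp:
    p = Path(tmp) / "session.json"
    save({"session_id": "s-001", "status": "COMPLETED"}, p)
    restored = load(p)
```

Because the snapshot is frozen, saving it mid-session cannot race with later turns; the file captures exactly the state at save time.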
## SharedContext: Auditable Key-Value Workspace
The `SharedContext` is a key-value store shared across all turns in a session. Every `set()` and `delete()` operation records:
- The old and new values
- The actor who made the change
- A UTC timestamp
This audit trail ensures transparency. If a human overrides an agent's context value, the change is recorded and can be reviewed later.
```python
from fcc.collaboration.context import SharedContext

ctx = SharedContext()
ctx.set("objective", "Evaluate data pipeline design", actor="human-alice")
ctx.set("objective", "Evaluate and optimize data pipeline", actor="agent-BC")
# ctx._history has 2 entries, showing the progression
```
The `SharedContext` is embedded in the `CollaborationSession` snapshot via `to_dict()`, so it persists across save/load cycles.
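The audit mechanism behind `SharedContext` can be illustrated with a standalone sketch. This is not the real class from `fcc.collaboration.context`; the `ContextChange` record and its field names are assumptions based on the audit fields listed above:

```python
from dataclasses import dataclass
from datetime import datetime, timezone

@dataclass(frozen=True)
class ContextChange:
    """One audit entry: old/new value, actor, and a UTC timestamp."""
    key: str
    old_value: object
    new_value: object
    actor: str
    timestamp: datetime

class AuditedContext:
    """Toy key-value store that records every mutation."""

    def __init__(self):
        self._data = {}
        self._history = []

    def set(self, key: str, value, actor: str) -> None:
        change = ContextChange(key, self._data.get(key), value, actor,
                               datetime.now(timezone.utc))
        self._history.append(change)
        self._data[key] = value

    def get(self, key):
        return self._data.get(key)

    @property
    def history(self) -> tuple:
        # Expose the trail read-only, as a tuple.
        return tuple(self._history)
```

Recording the old value alongside the new one is what lets a reviewer reconstruct who changed what, and from what, without diffing snapshots.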
## Putting It All Together
A typical HITL workflow looks like this:
- Create a session with gates at the Find-to-Create and Create-to-Critique transitions.
- Start the session and add agent turns as the simulation runs.
- At each gate, evaluate the deliverable and record the decision.
- If a gate returns `NEEDS_REVISION`, add human feedback as a turn, then resume agent processing.
- Complete the session and save the recording.
The collaboration engine integrates with the plugin architecture (see Chapter 6), the event bus (see Chapter 7), and the governance layer (see Chapter 9).
## Key Takeaways
- The collaboration engine uses 11 frozen dataclasses for safe, serializable state.
- Sessions follow a create-start-turn-gate-complete lifecycle.
- The `ScoringEngine` quantifies deliverable quality and drives approval decisions.
- `SharedContext` provides an auditable key-value workspace for cross-turn data sharing.
- `SessionRecorder` enables full save/load/replay of collaboration sessions.