Chapter 8: The Collaboration Engine — Human-in-the-Loop Design

The FCC framework is not a fully autonomous pipeline. It is designed around a core assumption: human judgment remains essential at critical decision points. The Collaboration Engine codifies this assumption into a structured system of sessions, turns, approval gates, scoring, and progress tracking.

This chapter explains the Human-in-the-Loop (HITL) architecture, the frozen dataclass model that underpins it, and how to run a collaboration session from creation through completion.

The state diagram below shows the five states a collaboration session can occupy and the transitions — including approval-gate pauses and aborts — that move it between them.

stateDiagram-v2
    [*] --> CREATED: create_session()
    CREATED --> ACTIVE: start_session()
    ACTIVE --> ACTIVE: add_turn()
    ACTIVE --> PAUSED: gate requires review
    PAUSED --> ACTIVE: human approves
    PAUSED --> ACTIVE: human provides feedback
    ACTIVE --> COMPLETED: complete_session()
    ACTIVE --> ABORTED: abort_session()
    PAUSED --> ABORTED: abort_session()
    COMPLETED --> [*]
    ABORTED --> [*]

This tightly bounded state space is what makes session recordings safe to replay and audit.
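The bounded transition set can be expressed as a simple lookup table. The following is a minimal sketch: the enum values mirror SessionStatus, but the transition table and helper function are illustrative, not the engine's actual code:

```python
from enum import Enum

class SessionStatus(Enum):
    CREATED = "created"
    ACTIVE = "active"
    PAUSED = "paused"
    COMPLETED = "completed"
    ABORTED = "aborted"

# Every legal transition from the state diagram above.
_TRANSITIONS = {
    SessionStatus.CREATED: {SessionStatus.ACTIVE},
    SessionStatus.ACTIVE: {SessionStatus.ACTIVE, SessionStatus.PAUSED,
                           SessionStatus.COMPLETED, SessionStatus.ABORTED},
    SessionStatus.PAUSED: {SessionStatus.ACTIVE, SessionStatus.ABORTED},
    SessionStatus.COMPLETED: set(),   # terminal
    SessionStatus.ABORTED: set(),     # terminal
}

def can_transition(src: SessionStatus, dst: SessionStatus) -> bool:
    """True if the state diagram permits src -> dst."""
    return dst in _TRANSITIONS[src]
```

Because the table is exhaustive, any replayed event that implies an illegal transition can be rejected immediately, which is what makes recordings safe to audit.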

What Is Human-in-the-Loop?

Human-in-the-Loop is a design pattern where automated agent outputs pass through human review checkpoints before advancing to downstream stages. In FCC, the HITL model serves three purposes:

  1. Quality assurance — humans evaluate deliverables against rubrics before they propagate.
  2. Ethical oversight — hard-stop constitution rules can require human sign-off on sensitive outputs.
  3. Knowledge transfer — the back-and-forth between agent and human creates an auditable record of reasoning and decisions.

The collaboration subsystem lives in src/fcc/collaboration/ and comprises six modules: models, engine, scoring, progress, recording, and context.

The Session Model: 11 Frozen Dataclasses

All collaboration state is represented by frozen (immutable) dataclasses. Immutability makes sessions safe to serialize, replay, and share across threads without defensive copying. The 11 dataclasses are:

  • SessionStatus: enum with values CREATED, ACTIVE, PAUSED, COMPLETED, ABORTED.
  • TurnType: enum with values HUMAN, AGENT, SYSTEM.
  • ApprovalDecision: enum with values APPROVED, REJECTED, NEEDS_REVISION, DEFERRED.
  • ApprovalGate: checkpoint linking a workflow node to a required score.
  • QualityScore: a 1-5 score with rubric breakdown and justification.
  • AgentCapabilityRating: rating derived from approval and quality history.
  • SessionTurn: one turn: who spoke, what they said, and an optional score.
  • HandoffProtocol: rules for turn-taking (max consecutive turns, auto-approve threshold).
  • CollaborationSession: the complete session snapshot.
  • ProgressState: completion tracking for an entity.

(The three enums are included in the count.)

Each dataclass exposes a to_dict() method and a from_dict() class method, making serialization straightforward:

from fcc.collaboration.models import CollaborationSession

session = CollaborationSession(session_id="s-001", workflow_id="base_5")
data = session.to_dict()
restored = CollaborationSession.from_dict(data)
assert restored.session_id == "s-001"
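A frozen dataclass with this round-trip interface can be quite small. The sketch below shows one way it might be built; the field set and method bodies are illustrative, not the real CollaborationSession:

```python
from dataclasses import dataclass, asdict, field

@dataclass(frozen=True)
class MiniSession:
    session_id: str
    workflow_id: str
    turns: tuple = field(default_factory=tuple)  # tuples keep the instance hashable

    def to_dict(self) -> dict:
        # asdict handles the flat fields; convert the tuple for JSON-friendliness.
        d = asdict(self)
        d["turns"] = list(self.turns)
        return d

    @classmethod
    def from_dict(cls, data: dict) -> "MiniSession":
        return cls(session_id=data["session_id"],
                   workflow_id=data["workflow_id"],
                   turns=tuple(data.get("turns", ())))
```

Because every field is immutable, two snapshots restored from the same dict compare equal, which is what makes replay and cross-thread sharing safe.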

CollaborationEngine Lifecycle

The CollaborationEngine manages the mutable state behind the scenes while exposing only frozen CollaborationSession snapshots to callers. The lifecycle follows five stages:

  1. Create: engine.create_session(workflow_id, participants, gates, protocol) initializes a _MutableSession and returns a frozen snapshot.
  2. Start: engine.start_session(session_id) transitions the session from CREATED to ACTIVE.
  3. Turn: engine.add_turn(session_id, turn_type, actor, content) appends a SessionTurn. Turns alternate between human and agent, governed by the HandoffProtocol.
  4. Gate: engine.evaluate_gate(session_id, gate_id, score, scorer) runs the scoring engine at a checkpoint. If the score meets the threshold, the gate passes; otherwise the session may pause for revision.
  5. Complete: engine.complete_session(session_id) finalizes the session.

The engine optionally accepts an EventBus instance, emitting events such as SESSION_CREATED, TURN_ADDED, GATE_EVALUATED, and SESSION_COMPLETED for downstream subscribers.
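The mutable-internals, frozen-snapshot pattern the engine uses can be sketched as follows; the class and method names are simplified stand-ins, not the real CollaborationEngine API:

```python
from dataclasses import dataclass

@dataclass(frozen=True)
class Snapshot:
    session_id: str
    status: str
    turn_count: int

class MiniEngine:
    """Holds mutable state internally; hands out only frozen snapshots."""

    def __init__(self):
        self._sessions = {}  # session_id -> mutable dict

    def create_session(self, session_id: str) -> Snapshot:
        self._sessions[session_id] = {"status": "CREATED", "turns": []}
        return self._snapshot(session_id)

    def start_session(self, session_id: str) -> Snapshot:
        self._sessions[session_id]["status"] = "ACTIVE"
        return self._snapshot(session_id)

    def add_turn(self, session_id: str, actor: str, content: str) -> Snapshot:
        self._sessions[session_id]["turns"].append((actor, content))
        return self._snapshot(session_id)

    def _snapshot(self, session_id: str) -> Snapshot:
        s = self._sessions[session_id]
        return Snapshot(session_id, s["status"], len(s["turns"]))
```

Callers can hold a snapshot indefinitely without risk of it changing underneath them; the next call simply returns a fresh one.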

ScoringEngine and Quality Evaluation

The ScoringEngine is the quantitative backbone of collaboration. It provides two methods:

  • score_deliverable() — records a QualityScore (1-5) with optional per-rubric breakdown and justification text.
  • evaluate_at_gate() — combines scoring with an ApprovalGate to produce an ApprovalDecision. If the score meets or exceeds gate.required_score, the decision is APPROVED; otherwise NEEDS_REVISION.

The engine also maintains a running AgentCapabilityRating per persona, aggregating approval rates and average quality scores into an effective_rating() that blends both dimensions.

from fcc.collaboration.scoring import ScoringEngine
from fcc.collaboration.models import ApprovalGate

engine = ScoringEngine()
gate = ApprovalGate(gate_id="g-1", workflow_node_id="n-1", required_score=3.5)
decision, qs = engine.evaluate_at_gate(gate, "doc-001", "reviewer", 4.0)
# decision == ApprovalDecision.APPROVED
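One way such a blended rating might be computed is shown below. The 50/50 weighting and the normalization of quality onto a 0-1 scale are assumptions for illustration, not the engine's actual formula:

```python
def effective_rating(approval_rate: float, avg_quality: float) -> float:
    """Blend approval rate (0-1) and average quality score (1-5) into a 0-1 rating.

    Assumed weighting: half from approvals, half from normalized quality.
    """
    quality_norm = (avg_quality - 1.0) / 4.0  # map the 1-5 scale onto 0-1
    return 0.5 * approval_rate + 0.5 * quality_norm
```

Whatever the exact weights, blending the two dimensions prevents a persona with a high approval rate but mediocre scores (or vice versa) from looking stronger than it is.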

Approval Gates

An ApprovalGate binds a workflow node to a quality threshold. Gates are defined at session creation time and referenced by ID during evaluation:

  • gate_id — unique identifier (e.g., "gate-find-review").
  • workflow_node_id — the workflow graph node this gate is attached to.
  • required_score — minimum score (default 3.0) to pass.
  • requires_human — if True, auto-approval via HandoffProtocol is disabled for this gate.
  • rubric — tuple of evaluation criteria strings.

Gates enforce that no deliverable advances without meeting its quality bar. When a gate is attached to a constitution hard-stop node, requires_human should always be True.
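A gate check combining the threshold with the requires_human flag might look like the sketch below. The decision strings stand in for the ApprovalDecision enum, and using DEFERRED to represent "waiting on human sign-off" is an assumption about how requires_human interacts with approval:

```python
from typing import Optional

def check_gate(score: float, required_score: float = 3.0,
               requires_human: bool = False,
               human_approved: Optional[bool] = None) -> str:
    """Illustrative gate logic: the score must meet the bar, and
    explicit human sign-off is mandatory when requires_human is set."""
    if score < required_score:
        return "NEEDS_REVISION"
    if requires_human and human_approved is not True:
        return "DEFERRED"  # score passed, but a human has not yet signed off
    return "APPROVED"
```

Note that no auto-approve path can reach APPROVED while requires_human is True, which is exactly the property a constitution hard-stop node needs.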

ProgressTracker

The ProgressTracker maintains ProgressState objects for sessions, workflows, and other entities. Each state records:

  • total_steps and completed_steps
  • A computed percentage property
  • A status field (pending, in_progress, completed)

The tracker is useful for CLI dashboards (see Chapter 7) and for monitoring long-running collaboration sessions.
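A minimal ProgressState with a computed percentage might look like this; the field names follow the bullets above, but the guard against zero total steps is an assumption:

```python
from dataclasses import dataclass

@dataclass(frozen=True)
class ProgressState:
    total_steps: int
    completed_steps: int
    status: str = "pending"

    @property
    def percentage(self) -> float:
        """Completion as a percentage; 0.0 when no steps are defined."""
        if self.total_steps == 0:
            return 0.0
        return 100.0 * self.completed_steps / self.total_steps
```

Keeping percentage derived rather than stored means it can never drift out of sync with the step counts.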

SessionRecorder: Save, Load, and Replay

The SessionRecorder persists sessions to JSON and replays them:

  • save(session, path) — writes the frozen session snapshot to a JSON file.
  • load(path) — restores a CollaborationSession from JSON.
  • replay(session, event_bus) — re-emits the session's turns as events on an EventBus, enabling post-hoc analysis and integration testing.

This makes collaboration sessions fully reproducible. A recorded session from a design sprint can be loaded months later, replayed through subscribers, and analyzed without re-running the workflow.
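The save/load half of this interface reduces to a JSON round trip over the snapshot dict. The sketch below assumes the session has already been converted with to_dict(); function names match the recorder's but the bodies are illustrative:

```python
import json
from pathlib import Path

def save(session_dict: dict, path: str) -> None:
    """Persist a session snapshot (already converted via to_dict) as JSON."""
    Path(path).write_text(json.dumps(session_dict, indent=2))

def load(path: str) -> dict:
    """Restore the raw snapshot dict; from_dict would rebuild the dataclass."""
    return json.loads(Path(path).read_text())
```

Because the snapshot is frozen and fully serializable, the loaded dict is byte-for-byte sufficient to reconstruct the session for replay.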

SharedContext: Auditable Key-Value Workspace

The SharedContext is a key-value store shared across all turns in a session. Every set() and delete() operation records:

  • The old and new values
  • The actor who made the change
  • A UTC timestamp

This audit trail ensures transparency. If a human overrides an agent's context value, the change is recorded and can be reviewed later.

from fcc.collaboration.context import SharedContext

ctx = SharedContext()
ctx.set("objective", "Evaluate data pipeline design", actor="human-alice")
ctx.set("objective", "Evaluate and optimize data pipeline", actor="agent-BC")
# ctx._history has 2 entries, showing the progression

The SharedContext is embedded in the CollaborationSession snapshot via to_dict(), so it persists across save/load cycles.
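The audit mechanism itself can be sketched in a few lines. The class below is a hypothetical stand-in for SharedContext, showing only how each set() might append an old-value/new-value/actor/timestamp record:

```python
from datetime import datetime, timezone

class AuditedContext:
    """Minimal sketch of an audited key-value store: every change records
    the old value, the new value, the actor, and a UTC timestamp."""

    def __init__(self):
        self._data = {}
        self._history = []

    def set(self, key, value, actor):
        old = self._data.get(key)
        self._data[key] = value
        self._history.append({
            "key": key, "old": old, "new": value, "actor": actor,
            "at": datetime.now(timezone.utc).isoformat(),
        })

    def get(self, key, default=None):
        return self._data.get(key, default)
```

Recording the old value alongside the new one is what lets a reviewer reconstruct exactly what an agent overwrote, not just the final state.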

Putting It All Together

A typical HITL workflow looks like this:

  1. Create a session with gates at the Find-to-Create and Create-to-Critique transitions.
  2. Start the session and add agent turns as the simulation runs.
  3. At each gate, evaluate the deliverable and record the decision.
  4. If a gate returns NEEDS_REVISION, add human feedback as a turn, then resume agent processing.
  5. Complete the session and save the recording.

The collaboration engine integrates with the plugin architecture (see Chapter 6), the event bus (see Chapter 7), and the governance layer (see Chapter 9).

Key Takeaways

  • The collaboration engine uses 11 frozen dataclasses for safe, serializable state.
  • Sessions follow a create-start-turn-gate-complete lifecycle.
  • The ScoringEngine quantifies deliverable quality and drives approval decisions.
  • SharedContext provides an auditable key-value workspace for cross-turn data sharing.
  • SessionRecorder enables full save/load/replay of collaboration sessions.