# Research Methodology Guide
This guide describes how to use the FCC framework as a research instrument for studying multi-agent workflows, AI governance, team collaboration, and knowledge management. It covers experiment design, data collection, analysis, reproducibility, and ethical considerations.
## FCC as a Research Instrument
The FCC framework provides several properties that make it suitable for controlled experiments:
- Deterministic mock mode -- Simulations produce identical outputs for identical inputs, enabling reproducible experiments without API costs
- Configurable personas -- Researchers can systematically vary persona attributes (dimensions, constraints, archetypes) as independent variables
- Structured traces -- Every simulation produces a trace with timestamped steps, persona assignments, and phase labels
- Scoring engine -- Deliverable quality can be evaluated numerically (1-5 scale) with configurable criteria
- Event bus -- All system events are captured and replayable for post-hoc analysis
- Knowledge graphs -- Relationships between entities are explicit and queryable
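The first property is easy to verify directly. A minimal sketch, assuming the `event_bus` argument is optional and that `workflow_graph` and `persona_registry` are loaded as in the data-collection examples later in this guide:

```python
from fcc.simulation.engine import SimulationEngine

# Two mock-mode runs over identical inputs should yield identical traces.
# workflow_graph and persona_registry are assumed to be loaded already
# (see "Collecting Data" below).
engine = SimulationEngine(mock=True)
trace_a = engine.run(workflow_graph, persona_registry)
trace_b = engine.run(workflow_graph, persona_registry)
assert trace_a.to_dict() == trace_b.to_dict()
```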
## Experiment Design with Persona Configurations

### Independent Variables
FCC experiments typically manipulate one or more of the following:
| Variable | How to Manipulate | Example Hypothesis |
|---|---|---|
| Number of personas | Add/remove personas from the registry | More personas improve output diversity |
| R.I.S.C.E.A.R. constraints | Modify the constraints field | Stricter constraints reduce errors |
| Governance tier | Change from preferred to hard-stop | Hard-stop rules reduce quality variance |
| Workflow graph size | Use 5-node vs. 24-node graph | Longer workflows improve completeness |
| Persona dimensions | Vary dimension attribute values | Higher curiosity scores correlate with broader output |
| Cross-reference density | Add/remove cross-reference entries | Denser collaboration networks improve coherence |
### Dependent Variables
| Variable | How to Measure | FCC Source |
|---|---|---|
| Output quality | ScoringEngine evaluation (1-5 scale) | fcc.collaboration.scoring |
| Task completion rate | Percentage of workflow nodes completed | Simulation trace |
| Collaboration turn count | Number of turns in collaboration session | Session recording |
| Gate pass rate | Percentage of approval gates passed | Collaboration engine |
| Event diversity | Number of distinct event types emitted | Event bus log |
| Processing time | Duration from first to last trace step | Trace timestamps |
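Several of these measures fall directly out of a saved trace. A minimal sketch, assuming the trace JSON structure shown under Trace Structure below:

```python
import json
from datetime import datetime

# Load a saved simulation trace (see "Trace Structure" below for the schema)
with open("experiment_trace.json") as f:
    trace = json.load(f)

steps = trace["steps"]

# Processing time: duration from first to last trace step
t0 = datetime.fromisoformat(steps[0]["timestamp"].replace("Z", "+00:00"))
t1 = datetime.fromisoformat(steps[-1]["timestamp"].replace("Z", "+00:00"))

# Phase coverage: how many distinct workflow phases were visited
distinct_phases = len({s["phase"] for s in steps})

print(f"{len(steps)} steps, {(t1 - t0).total_seconds():.1f}s, {distinct_phases} phases")
```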
### Control Variables
To isolate the effect of your independent variable, hold constant:
- Python version and FCC version
- Workflow graph (unless graph size is the independent variable)
- Mock mode vs. AI mode
- Random seeds (mock mode is deterministic by default)
- Scoring criteria and thresholds
### Experimental Design Template

```text
Title: [Effect of X on Y in FCC-mediated workflows]
Hypothesis: [H1: ...]
Independent Variable: [e.g., number of governance hard-stop rules]
Levels: [e.g., 0, 2, 5, 10]
Dependent Variable: [e.g., quality gate pass rate]
Control Variables:
  - Workflow: extended_sequence (20 nodes)
  - Personas: core_personas.yaml (unmodified)
  - Mode: mock
  - Scoring threshold: 3.5
Procedure:
  1. For each level of the independent variable:
     a. Configure governance with N hard-stop rules
     b. Run 10 simulation iterations
     c. Record quality scores and gate outcomes
  2. Aggregate results and perform statistical analysis
Expected Results: [...]
```
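The procedure lends itself to a simple driver script. A skeleton sketch, where `build_governance` and `run_once` are hypothetical placeholders for your own setup code, not FCC APIs:

```python
LEVELS = [0, 2, 5, 10]   # number of hard-stop rules (independent variable)
RUNS_PER_LEVEL = 10

def build_governance(n_rules):
    ...  # hypothetical: construct a governance config with n_rules hard-stop rules

def run_once(governance):
    ...  # hypothetical: run one simulation, return quality score and gate outcome

results = []
for n_rules in LEVELS:
    governance = build_governance(n_rules)
    for run_id in range(RUNS_PER_LEVEL):
        results.append({
            "level": n_rules,
            "run_id": run_id,
            "outcome": run_once(governance),
        })
```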
## Data Collection via Simulation Traces

### Trace Structure
Each simulation run produces a trace containing:
```json
{
  "trace_id": "uuid-...",
  "workflow_graph": "extended_sequence",
  "steps": [
    {
      "step_index": 0,
      "node_id": "n1",
      "phase": "Find",
      "persona_id": "RC",
      "response": "...",
      "timestamp": "2026-03-30T10:00:00Z"
    },
    ...
  ]
}
```
### Collecting Data

```python
import json

from fcc.simulation.engine import SimulationEngine
from fcc.messaging.bus import EventBus

bus = EventBus()
collected_events = []

def collector(event):
    collected_events.append(event.to_dict())

bus.subscribe(collector)

# workflow_graph and persona_registry are assumed to be loaded beforehand
engine = SimulationEngine(mock=True, event_bus=bus)
trace = engine.run(workflow_graph, persona_registry)

# Save the trace for analysis
with open("experiment_trace.json", "w") as f:
    json.dump(trace.to_dict(), f, indent=2)

# Save the collected events
with open("experiment_events.json", "w") as f:
    json.dump(collected_events, f, indent=2)
```
### Multi-Run Data Collection

```python
results = []
for run_id in range(10):
    trace = engine.run(workflow_graph, persona_registry)
    results.append({
        "run_id": run_id,
        "steps": len(trace.steps),
        "phases": [s.phase for s in trace.steps],
    })
```
## Analysis via Scoring Engine

### Quality Scoring

```python
from fcc.collaboration.scoring import ScoringEngine

scorer = ScoringEngine()

# Score a deliverable against the configured criteria
score = scorer.evaluate(
    deliverable="Generated API documentation...",
    criteria=["completeness", "accuracy", "clarity"],
)
print(f"Quality score: {score.overall} / 5.0")
```
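The overall score can then be checked against the experiment's quality gate, e.g. `passed = score.overall >= 3.5` for the 3.5 threshold used in the design template above.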
### Statistical Analysis

With collected data, apply standard statistical methods. The snippet below assumes each run record includes a "quality_score" field (for example, added to the multi-run loop above by scoring each trace's deliverable with the ScoringEngine):

```python
import statistics

scores = [run["quality_score"] for run in results]
print(f"Mean: {statistics.mean(scores):.2f}")
print(f"Std Dev: {statistics.stdev(scores):.2f}")
print(f"Median: {statistics.median(scores):.2f}")
```
For comparing experimental conditions, use:

- t-test (2 conditions): Compare means between two persona configurations
- ANOVA (3+ conditions): Compare means across multiple governance levels
- Chi-squared (categorical): Compare gate pass/fail rates across conditions
- Correlation (continuous): Relate dimension scores to quality outcomes
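For instance, with SciPy (an assumption; any statistics package works) and hypothetical example data:

```python
from scipy import stats

# Hypothetical quality scores from two persona configurations
scores_a = [3.8, 4.1, 3.9, 4.0, 3.7, 4.2, 3.9, 4.0, 3.8, 4.1]
scores_b = [3.2, 3.5, 3.1, 3.4, 3.3, 3.6, 3.2, 3.5, 3.4, 3.3]

# Independent-samples t-test: do the two configurations differ in mean quality?
t_stat, p_value = stats.ttest_ind(scores_a, scores_b)
print(f"t = {t_stat:.2f}, p = {p_value:.4f}")

# Chi-squared on gate pass/fail counts across two conditions
# (rows: conditions; columns: [passed, failed])
chi2, p, dof, expected = stats.chi2_contingency([[18, 2], [12, 8]])
print(f"chi2 = {chi2:.2f}, p = {p:.4f}")
```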
## Publication Preparation Workflow

### Recommended Structure
- Introduction: Motivation for studying multi-agent workflows with FCC
- Related Work: Position relative to existing multi-agent frameworks (AutoGen, CrewAI, MetaGPT)
- Methodology: FCC configuration, experimental design, data collection procedure
- Results: Statistical analysis of simulation traces and quality scores
- Discussion: Interpretation, limitations, threats to validity
- Reproducibility Package: Link to FCC version, configuration files, and analysis scripts
### Reproducibility Package Contents

```text
reproducibility/
  README.md
  requirements.txt       # Pinned FCC version
  config/
    personas.yaml        # Exact persona configurations used
    workflow.json        # Workflow graph used
    governance.yaml      # Governance rules used
  scripts/
    run_experiment.py    # Data collection script
    analyze_results.py   # Statistical analysis
  data/
    raw_traces/          # Simulation traces (JSON)
    processed/           # Aggregated results (CSV)
```
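Once the package is complete, it can be bundled for upload with the standard library (paths are illustrative):

```python
import shutil

# Zip the reproducibility package for upload to Zenodo or similar
shutil.make_archive("reproducibility_v1", "zip", "reproducibility")
```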
### Citation

```bibtex
@software{fcc_framework,
  title   = {FCC Agent Team Framework},
  author  = {Information Collective, LLC},
  year    = {2026},
  url     = {https://github.com/rollingthunderfourtytwo-afk/l2_fcc_agent_team_ext},
  version = {0.8.0}
}
```
## IRB Considerations for AI-Mediated Research

### When IRB Review Is Needed

Whether review is required depends on the study design:

- Human participants interacting with FCC agents through the collaboration engine: IRB review is typically required
- Purely computational experiments (mock simulations with no human participants): IRB review is generally not required
- Surveys or interviews about FCC usage: standard human subjects protocols apply
### Key Considerations
| Concern | Mitigation |
|---|---|
| Informed consent | Participants must know they are interacting with AI agents |
| Data privacy | Session recordings may contain personally identifiable information; anonymize before publication |
| Deception | If participants are unaware that "agents" are AI-generated, this constitutes deception |
| Power dynamics | In classroom settings, student participation should be voluntary and not affect grades |
| Data retention | Define retention policies for collaboration session recordings |
### Recommended Protocol
- Obtain IRB approval before collecting data from human participants
- Provide informed consent forms that explain: the role of AI agents, data collection scope, how data will be stored and shared
- Allow participants to withdraw at any time and have their data deleted
- Anonymize session recordings before analysis: replace names with participant codes (a minimal sketch follows this list)
- Store data on encrypted, access-controlled systems
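A minimal sketch for the anonymization step (the names, codes, and file paths are hypothetical; a real pipeline should also catch emails, usernames, and other identifiers):

```python
# Hypothetical mapping from participant names to anonymous codes
CODES = {"Alice Example": "P01", "Bob Example": "P02"}

with open("session_recording.json") as f:
    text = f.read()

# Replace each known name with its participant code
for name, code in CODES.items():
    text = text.replace(name, code)

with open("session_recording_anon.json", "w") as f:
    f.write(text)
```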
## Reproducibility Guidelines

### Pinning Versions
Always record exact versions:
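A minimal sketch (assuming the installed fcc package exposes `__version__`, as the config example below also does):

```python
import platform

import fcc  # assumes the installed package exposes __version__

# Write the exact versions used for this experiment alongside the data
with open("experiment/VERSIONS.txt", "w") as f:
    f.write(f"fcc=={fcc.__version__}\n")
    f.write(f"python=={platform.python_version()}\n")
```

Running `pip freeze > requirements.txt` additionally captures the full dependency set for the reproducibility package.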
### Configuration Archival
Save the exact YAML/JSON configuration used for each experiment:
```python
import shutil

shutil.copy("src/fcc/data/personas/core_personas.yaml", "experiment/config/")
shutil.copy("src/fcc/data/workflows/extended_sequence.json", "experiment/config/")
```
### Mock Mode for Reproducibility

Mock mode produces identical results across runs, making it ideal for reproducible experiments. When using AI mode, record the model name, temperature, and any other sampling parameters:

```python
import sys

import fcc

config = {
    "mode": "mock",  # or "ai"
    "model": "claude-sonnet-4-20250514",
    "temperature": 0.7,
    "max_tokens": 4096,
    "fcc_version": fcc.__version__,
    "python_version": sys.version,
}
```
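Saving this record alongside the traces keeps run parameters and results together; a minimal sketch:

```python
import json

# Archive the run parameters next to the collected traces
with open("experiment/config_record.json", "w") as f:
    json.dump(config, f, indent=2)
```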
### Sharing Data
Share raw traces and analysis scripts alongside publications. Use stable identifiers (DOI, Zenodo) for long-term accessibility.
## See Also
- Educators Guide -- Course syllabus for teaching with FCC
- Student Workbook -- Weekly exercises
- Case Study Template -- Template and examples
- Citation -- How to cite FCC