Full-Stack Ecosystem -- v1.3.5.2 Addendum¶
This addendum extends the Full-Stack Ecosystem Demo with the v1.3.5.2 end-to-end scenario. It wires together the four v1.3.5.2 addendum scenarios -- Web Frontend stress test, Distiller Bridge vocabulary evolution, Open Science cross-project FAIR audit, and Sky-Parlour context-enricher integration -- under a single OpenTelemetry trace and a single event bus.
The goal is operational: validate that the four subsystems behave correctly when their failure and recovery cycles overlap, not just when exercised in isolation.
Scenario Overview¶
Topology¶
+--------------------+  vocabulary.mismatch
|  Distiller Bridge  |---------------------+
+--------------------+                     |
          |                                v
          | workflow.step           +--------------+
          v                         |   EventBus   |
+--------------------+  subscribe   |  (central)   |
|    Web Frontend    |<-------------|              |
|    stress panel    |              +--------------+
+--------------------+                ^        ^
          |                           |        |
          | trace (OTel)              |        |
          v                           |        |
+--------------------+     fair.*     |        |  workflow.step
|    Open Science    |----------------+        +----------------+
|     FAIR audit     |                                          |
+--------------------+                                          |
                                                                v
                                                      +--------------------+
                                                      |    Sky-Parlour     |
                                                      |  context enricher  |
                                                      +--------------------+
Participants¶
| Subsystem | Role | Primary Event Contribution | Addendum |
|---|---|---|---|
| Web Frontend | Stress publisher + observer | workflow.step at ~1000/sec | web-frontend-v1352-addendum.md |
| Distiller Bridge | Vocabulary change source | vocabulary.mismatch | distiller-bridge-v1352-addendum.md |
| Open Science | Subscriber + auditor | fair.*, cross_project.* | open-science-v1352-addendum.md |
| Sky-Parlour | Downstream visualizer | consumes everything above | skyparlour-v1352-addendum.md |
Reference Notebook¶
The scenario is captured as a runnable walkthrough in
notebooks/32_full_stack_ecosystem_demo.ipynb. The notebook imports
each subsystem's setup helper, registers subscribers in the correct
order, drives the scenario from the top of the topology down, and
records metrics and spans for post-run inspection. Use the notebook as
the authoritative starting point; the sections below describe the same
run at prose level.
Sequencing the End-to-End Run¶
The scenario runs in four overlapping phases. Each phase introduces events that the downstream phases consume.
Phase A -- Quiescent Baseline (t+0 to t+5 s)¶
The event bus is started, all four subsystems attach their subscribers, and the OTel tracer is initialized. No synthetic traffic yet. The baseline captures idle metrics so that later deltas are interpretable.
Phase B -- Vocabulary Evolution (t+5 s to t+10 s)¶
The Distiller Bridge receives a Fornax class map with two additions
and one removal (see the Distiller Bridge addendum
for the exact deltas). The bridge emits three vocabulary.mismatch
events. The Open Science subscriber schedules a cross-project
reconciliation check and begins collecting evidence.
Phase C -- Stress Burst (t+10 s to t+20 s)¶
The Web Frontend stress harness begins publishing workflow.step
events at 1000/sec for 10 seconds. Sky-Parlour's bridge forwards
the events to the Phase-17 enricher. Open Science remains subscribed;
its fair.check cadence is set low enough (every 30 seconds) that
the burst does not trigger a duplicate check.
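The timing argument above can be checked with a little arithmetic. The sketch below assumes that "duplicate check" means more than one periodic fair.check tick landing inside the burst window; the helper `ticks_in_window` and the constants are illustrative names, not part of the FCC API.

```python
import math


def ticks_in_window(period_s: float, start_s: float, end_s: float,
                    first_tick_s: float = 0.0) -> int:
    """Count ticks t = first_tick_s + k * period_s (k >= 0) with start_s <= t < end_s."""
    k_min = max(0, math.ceil((start_s - first_tick_s) / period_s))
    k_max = math.ceil((end_s - first_tick_s) / period_s) - 1
    return max(0, k_max - k_min + 1)


BURST_START_S, BURST_END_S = 10, 20   # Phase C window from the text
RATE_PER_S = 1000                     # stress harness publish rate
CHECK_PERIOD_S = 30                   # fair.check cadence

# Nominal number of workflow.step events published during the burst.
burst_events = RATE_PER_S * (BURST_END_S - BURST_START_S)

# Try every whole-second offset for the first tick and keep the worst case.
worst_case = max(
    ticks_in_window(CHECK_PERIOD_S, BURST_START_S, BURST_END_S, first_tick_s=offset)
    for offset in range(CHECK_PERIOD_S)
)

print(burst_events)  # 10000
print(worst_case)    # 1
```

Because the 10-second burst is shorter than the 30-second period, at most one tick can fall inside it regardless of where the schedule starts.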
Phase D -- Cross-Project FAIR Audit (t+20 s to t+35 s)¶
With the vocabulary change reconciled and the stress burst concluded,
Open Science triggers a full assess_cross_project run against FCC,
PAOM, and AOME. Sky-Parlour renders the resulting
fair.principle.evaluated events live. The final
cross_project.assessment.completed event carries the headline scores.
OpenTelemetry Trace Layout¶
A single root span -- ecosystem.v1352.run -- wraps the entire
scenario. Each phase and each subsystem adds nested spans. The
resulting trace is self-contained: operators can filter by span name
to zoom into any subsystem.
Top-Level Span Hierarchy¶
| Level | Span Name | Parent | Typical Duration |
|---|---|---|---|
| 0 | ecosystem.v1352.run | (root) | ~35 s |
| 1 | phase.baseline | root | 5 s |
| 1 | phase.vocab_evolution | root | 5 s |
| 1 | phase.stress_burst | root | 10 s |
| 1 | phase.fair_audit | root | 15 s |
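The nesting in the table can be illustrated without any tracing backend. The following is a minimal sketch using a hypothetical `SpanRecorder` that records only names, parents, and levels; it stands in for the real tracer and does not reflect the FccTracer API.

```python
from contextlib import contextmanager
from dataclasses import dataclass, field
from typing import Iterator, List, Optional


@dataclass
class Span:
    name: str
    parent: Optional[str]
    level: int


@dataclass
class SpanRecorder:
    """Hypothetical stand-in for a tracer: records span nesting only."""
    spans: List[Span] = field(default_factory=list)
    _stack: List[Span] = field(default_factory=list)

    @contextmanager
    def span(self, name: str) -> Iterator[Span]:
        # The innermost open span, if any, becomes the parent.
        parent = self._stack[-1] if self._stack else None
        current = Span(name, parent.name if parent else None, len(self._stack))
        self.spans.append(current)
        self._stack.append(current)
        try:
            yield current
        finally:
            self._stack.pop()


recorder = SpanRecorder()
with recorder.span("ecosystem.v1352.run"):
    for phase in ("phase.baseline", "phase.vocab_evolution",
                  "phase.stress_burst", "phase.fair_audit"):
        with recorder.span(phase):
            pass  # phase work would run here

for s in recorder.spans:
    print(s.level, s.name, s.parent or "(root)")
```

The printed rows match the Level, Span Name, and Parent columns above: one level-0 root and four level-1 phase spans.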
Subsystem-Level Span Names¶
| Subsystem | Span Name | Emitted Per |
|---|---|---|
| Event bus | event.publish | Event |
| Subscribers | subscriber.invoke | Delivery |
| Action engine | workflow.step | Step |
| Vocabulary loader | vocab.verify | Verification run |
| Compliance | compliance.reaudit | Affected requirement |
| FAIR audit | fair.principle.evaluate | Principle per project |
| Cross-project | federation.assess | Assessment run |
| Visualization | viz.bridge.forward | Forwarded payload |
Correlating Cross-Subsystem Spans¶
Every event carries a trace_id attribute generated at publish time
by the event bus. Subscribers attach their spans to that trace ID,
which means a single event produced in phase B can be followed
through its reception in phase C (Sky-Parlour) and its contribution
to phase D (Open Science). The trace viewer is the correct tool for
walking this path; logs alone are insufficient because the bus is
asynchronous.
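The correlation mechanism can be sketched in a few lines. This example assumes a hypothetical synchronous `TinyBus` and plain dictionaries as span records; the real bus is asynchronous and its event and span types are not shown in this document, so every class and payload here is illustrative.

```python
import uuid
from collections import defaultdict
from dataclasses import dataclass
from typing import Callable, Dict, List


@dataclass(frozen=True)
class Event:
    topic: str
    payload: dict
    trace_id: str


class TinyBus:
    """Hypothetical synchronous bus; the real bus is asynchronous."""

    def __init__(self) -> None:
        self._handlers: Dict[str, List[Callable[[Event], None]]] = defaultdict(list)

    def subscribe(self, topic: str, handler: Callable[[Event], None]) -> None:
        self._handlers[topic].append(handler)

    def publish(self, topic: str, payload: dict) -> Event:
        # The trace ID is minted once, at publish time.
        event = Event(topic, payload, trace_id=uuid.uuid4().hex)
        for handler in self._handlers[topic]:
            handler(event)
        return event


# Each subscriber records a span tagged with the event's trace ID.
spans: List[dict] = []


def make_subscriber(span_name: str) -> Callable[[Event], None]:
    def handle(event: Event) -> None:
        spans.append({"name": span_name, "trace_id": event.trace_id})
    return handle


bus = TinyBus()
bus.subscribe("vocabulary.mismatch", make_subscriber("compliance.reaudit"))
bus.subscribe("vocabulary.mismatch", make_subscriber("viz.bridge.forward"))

first = bus.publish("vocabulary.mismatch", {"change": "added"})
second = bus.publish("vocabulary.mismatch", {"change": "removed"})

# Filtering by one trace ID recovers every span that event caused.
path = [s["name"] for s in spans if s["trace_id"] == first.trace_id]
print(path)  # ['compliance.reaudit', 'viz.bridge.forward']
```

Filtering on a single trace ID is what the trace viewer does when an operator follows one event across subsystems.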
Metric Names and Healthy Ranges¶
The scenario emits the following metrics. All are standard
FccMetrics types.
| Metric | Type | Phase Where Relevant | Healthy Range |
|---|---|---|---|
| events.published | Counter | all | rising monotonically |
| events.delivered | Counter | all | within 1% of published |
| events.dropped | Counter | C | < 1% of published |
| events.dlq.size | Gauge | B, C | < 100 |
| vocab.verify.diff.count | Counter | B | matches injected delta (3) |
| compliance.reaudit.scheduled | Counter | B | matches affected requirements |
| fair.principle.score | Histogram | D | centred above 0.80 |
| federation.cross_project_score | Gauge | D | >= 0.85 for reference run |
| viz.bridge.delivered | Counter | C, D | tracks publishes to Sky-Parlour |
| viz.bridge.delivered.latency_ms | Histogram | C, D | p95 < 50 ms |
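The scalar rows of the table lend themselves to an automated post-run check. The sketch below assumes the metrics have been exported into a plain dictionary keyed by metric name, which is an assumption about the export format rather than the FccMetrics API; histogram-based rows are left out. The `reference` values are the headline numbers from the reference run, and `degraded` is an invented counter-example.

```python
from typing import Dict, List


def check_run(m: Dict[str, float]) -> List[str]:
    """Return the names of metrics that fall outside the healthy ranges."""
    published = m["events.published"]
    failures: List[str] = []
    if abs(published - m["events.delivered"]) > 0.01 * published:
        failures.append("events.delivered")      # not within 1% of published
    if m["events.dropped"] >= 0.01 * published:
        failures.append("events.dropped")        # not < 1% of published
    if m["events.dlq.size"] >= 100:
        failures.append("events.dlq.size")       # not < 100
    if m["vocab.verify.diff.count"] != 3:
        failures.append("vocab.verify.diff.count")  # injected delta is 3
    if m["federation.cross_project_score"] < 0.85:
        failures.append("federation.cross_project_score")
    return failures


reference = {
    "events.published": 10030,
    "events.delivered": 9995,
    "events.dropped": 35,
    "events.dlq.size": 12,
    "vocab.verify.diff.count": 3,
    "federation.cross_project_score": 0.89,
}
# Hypothetical unhealthy run: 200 events dropped instead of 35.
degraded = dict(reference, **{"events.delivered": 9830, "events.dropped": 200})

print(check_run(reference))  # []
print(check_run(degraded))   # ['events.delivered', 'events.dropped']
```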
Subscriber Ordering¶
The subscriber set is registered in a specific order to keep the scenario deterministic. Do not reorder these unless the scenario is being deliberately modified.
1. ComplianceSubscriber on vocabulary.mismatch.
2. FAIRAuditSubscriber on compliance.reaudit.scheduled.
3. VisualizationBridge subscribing to the bus (broad filter).
4. StressPanelSubscriber on events.*.
5. CrossProjectReporter on cross_project.assessment.completed.
Ordering matters because (1) must see the mismatch before (2) is asked to schedule, and (3) must be attached before (4) starts publishing so that the enricher receives the stress traffic.
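The second half of that rule is easy to demonstrate. The sketch below uses a hypothetical in-memory bus with no replay of past events, which is an assumption about bus behaviour made only for illustration: a subscriber attached after the burst starts never sees the traffic that preceded it.

```python
from typing import Callable, List


class TinyBus:
    """Hypothetical in-memory bus with no replay, used only to show ordering."""

    def __init__(self) -> None:
        self._handlers: List[Callable[[str], None]] = []

    def subscribe_all(self, handler: Callable[[str], None]) -> None:
        self._handlers.append(handler)

    def publish(self, topic: str) -> None:
        for handler in list(self._handlers):
            handler(topic)


def run(attach_bridge_first: bool, burst: int = 5) -> int:
    """Return how many burst events the bridge-like subscriber received."""
    bus = TinyBus()
    forwarded: List[str] = []
    if attach_bridge_first:
        bus.subscribe_all(forwarded.append)
    for _ in range(burst):
        bus.publish("workflow.step")
    if not attach_bridge_first:
        bus.subscribe_all(forwarded.append)  # attached too late
    return len(forwarded)


print(run(True), run(False))  # 5 0
```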
Full Run -- Minimal Python¶
The following is the condensed form of the notebook's setup cell. It is not a substitute for the notebook, which adds inspection tooling between phases.
from fcc.messaging.bus import EventBus
from fcc.observability.tracing import FccTracer
from fcc.observability.metrics import FccMetrics
from fcc.compliance.subscriber import ComplianceSubscriber
from fcc.visualization.bridge import VisualizationBridge
from fcc.objectmodel.federation import assess_cross_project
from fcc.objectmodel.examples import create_sample_model

# run_distiller_vocab_scenario and run_stress_harness are not imported in
# this condensed cell; they must already be defined in the session.

bus = EventBus.default()
tracer = FccTracer.default()
metrics = FccMetrics.default()

with tracer.start_as_root("ecosystem.v1352.run"):
    # Phase A -- baseline: attach subscribers before any traffic flows
    with tracer.start_as_current("phase.baseline"):
        bus.subscribe("vocabulary.mismatch", ComplianceSubscriber().handle)
        bridge = VisualizationBridge.default()
        bus.subscribe_all(lambda ev: bridge.forward(ev, target="skyparlour"))

    # Phase B -- vocabulary evolution
    with tracer.start_as_current("phase.vocab_evolution"):
        run_distiller_vocab_scenario()

    # Phase C -- stress burst
    with tracer.start_as_current("phase.stress_burst"):
        run_stress_harness(events_per_second=1000, duration_s=10)

    # Phase D -- FAIR audit across the three sample projects
    with tracer.start_as_current("phase.fair_audit"):
        fcc_f = create_sample_model("fcc")
        paom_f = create_sample_model("paom")
        aome_f = create_sample_model("aome")
        assessment = assess_cross_project([fcc_f, paom_f, aome_f])
        bus.publish_event("cross_project.assessment.completed",
                          payload=assessment.to_dict())
Reference Run Results¶
The reference run on the v1.3.5.2 developer machine produced the following headline numbers. Treat these as rough expectations rather than fixed targets -- they are hardware- and load-dependent.
| Metric | Value |
|---|---|
| Total events published | ~10,030 |
| Events delivered | ~9,995 |
| Events dropped | 35 (Sky-Parlour stress backpressure) |
| DLQ peak depth | 12 |
| Vocabulary mismatch events | 3 |
| Compliance requirements re-audited | 4 |
| FAIR principles evaluated | 30 (10 x 3 projects) |
| cross_project_score | 0.89 |
| Run wall-clock time | ~35 s |
| Root span count | 1 |
| Total span count | ~11,800 |
Failure Modes and Recovery¶
The scenario is designed to surface the common failure modes that occur in production when these subsystems are deployed together.
| Failure Mode | Where Introduced | Expected Recovery Path |
|---|---|---|
| Slow Sky-Parlour enricher | Phase C stress burst | events.dropped rises; scenario continues |
| Invalid vocabulary YAML | Phase B | Loader raises, scenario aborts; fix YAML and re-run |
| Compliance subscriber exception | Phase B handler | DLQ captures the event; scenario continues |
| Cross-project facade unreachable | Phase D | assess_cross_project returns partial result with recommendations |
| OTel exporter failure | any phase | Spans dropped silently; metrics still recorded |
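The "DLQ captures the event; scenario continues" recovery path can be sketched as follows. `DlqBus` is a hypothetical minimal bus written for this illustration; it is not the EventBus implementation, and the handler names and payloads are invented.

```python
from typing import Callable, Dict, List, Tuple


class DlqBus:
    """Hypothetical bus that isolates handler failures in a dead-letter queue."""

    def __init__(self) -> None:
        self._handlers: Dict[str, List[Callable[[dict], None]]] = {}
        self.dlq: List[Tuple[str, dict, str]] = []

    def subscribe(self, topic: str, handler: Callable[[dict], None]) -> None:
        self._handlers.setdefault(topic, []).append(handler)

    def publish(self, topic: str, payload: dict) -> None:
        for handler in self._handlers.get(topic, []):
            try:
                handler(payload)
            except Exception as exc:
                # Park the failed delivery and keep delivering to other handlers.
                self.dlq.append((topic, payload, repr(exc)))


received: List[dict] = []


def failing_compliance_handler(payload: dict) -> None:
    raise ValueError("simulated compliance subscriber failure")


bus = DlqBus()
bus.subscribe("vocabulary.mismatch", failing_compliance_handler)
bus.subscribe("vocabulary.mismatch", received.append)

for i in range(3):
    bus.publish("vocabulary.mismatch", {"delta": i})

print(len(bus.dlq), len(received))  # 3 3
```

The failing handler's deliveries land in the DLQ while the healthy subscriber still receives all three events, which is the behaviour the table describes for a compliance subscriber exception.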
Observability Checklist for Operators¶
When reviewing a completed run, an operator should look at the following indicators in order. Each has a single clear "healthy" or "unhealthy" answer.
- Was the root span emitted? (trace viewer)
- Did events.published and events.delivered stay within 1% of each other? (metrics)
- Was DLQ depth stable and bounded? (gauge)
- Did vocabulary mismatch trigger the expected re-audits? (counter match)
- Did federation.cross_project_score exceed the configured threshold? (gauge)
- Did Sky-Parlour receive payloads throughout the stress phase? (viz metric)
- Were all span durations within the expected range? (trace viewer)
If any answer is "no", the per-subsystem addendum contains the detailed diagnostic procedure for that subsystem.
Tips¶
- Capture a clean baseline run before introducing any workload. The delta between baseline and full run is far more useful than a full run in isolation.
- Use the trace ID from a single representative event to walk the full cross-subsystem path. This is the highest-leverage debugging technique in the ecosystem.
- The reference notebook commits its run artefacts under notebooks/_runs/; keep those committed so that regressions show up in code review rather than production.
- Align OTel clocks across subsystems; even a 100 ms skew makes trace layout confusing.
See also¶
- Full-Stack Ecosystem Demo -- Base demo and subsystem inventory
- Web Frontend v1.3.5.2 Addendum -- WebSocket stress test
- Distiller Bridge v1.3.5.2 Addendum -- Vocabulary evolution
- Open Science v1.3.5.2 Addendum -- Cross-project FAIR audit
- Sky-Parlour v1.3.5.2 Addendum -- Phase-17 context enricher
- Architecture Sitemap -- Index of diagrams referenced by this scenario
- Reference notebook: notebooks/32_full_stack_ecosystem_demo.ipynb