feat(supervisor): wide events + warm-start trace propagation #3669
nicktrn wants to merge 4 commits into
No actionable comments were generated in the recent review.
Walkthrough
This PR adds a supervisor-wide "wide events" observability system: new wideEvents/ modules provide types (State/Phase/Error), traceparent parsing, AsyncLocalStorage context, phase timing/recording, JSON serialization (emit) to stdout, lifecycle middleware (runWideEvent/emitOneShot) and helpers (setMeta/setExtra). Tests cover parsing, state creation, recording, emission, and middleware. Environment flags TRIGGER_WIDE_EVENTS_ENABLED and TRIGGER_WIDE_EVENTS_NOISY_ROUTES gate behavior. Supervisor and WorkloadServer are wired to create and emit wide events across the dequeue loop, HTTP routes, and socket lifecycle.
Estimated code review effort: 4 (Complex), ~75 minutes
Pre-merge checks: 3 passed, 2 failed (2 warnings)
Adds wide-event observability for the supervisor: one flat-keyed JSON line per dequeue iteration, workload-server route, and run socket lifecycle event. Events carry `trace_id` sourced from the inbound W3C traceparent plus `meta.run_id` and related identifiers, so they join across services by run.

The outbound warm-start POST also forwards the inbound traceparent so the upstream receiver continues the same trace instead of minting a new one.
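
To make the trace propagation concrete for reviewers, here is a minimal sketch of parsing a W3C `traceparent` value and forwarding it on an outbound request. The names `TraceContext`, `parseTraceparent`, and `warmStartHeaders`, and the `content-type` header, are illustrative assumptions and not the PR's actual code. The parser only handles the four-field version-00 layout.

```typescript
// Hypothetical shape; the PR's real types are not shown in this description.
interface TraceContext {
  traceId: string;
  parentId: string;
  flags: string;
}

// Four-field layout: 2 hex version, 32 hex trace-id, 16 hex parent-id, 2 hex flags.
const TRACEPARENT_RE = /^([0-9a-f]{2})-([0-9a-f]{32})-([0-9a-f]{16})-([0-9a-f]{2})$/;

function parseTraceparent(header: string | undefined): TraceContext | undefined {
  if (!header) return undefined;
  const match = TRACEPARENT_RE.exec(header.trim());
  if (!match) return undefined;
  const [, version, traceId, parentId, flags] = match;
  // The spec treats version "ff" and all-zero trace or parent ids as invalid.
  if (version === "ff") return undefined;
  if (/^0+$/.test(traceId) || /^0+$/.test(parentId)) return undefined;
  return { traceId, parentId, flags };
}

// Builds headers for the outbound warm-start POST (assumed helper).
// With the flag off, or with a missing or invalid inbound header,
// the returned headers carry no traceparent at all.
function warmStartHeaders(
  enabled: boolean,
  inbound: string | undefined
): Record<string, string> {
  const headers: Record<string, string> = { "content-type": "application/json" };
  if (enabled && inbound && parseTraceparent(inbound)) {
    // Forward the inbound value unchanged so the receiver continues the same trace.
    headers["traceparent"] = inbound.trim();
  }
  return headers;
}
```

Forwarding the validated header as-is, instead of rebuilding it, keeps the sketch close to the stated behavior: the receiver sees the same trace id the supervisor received.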

Off by default behind `TRIGGER_WIDE_EVENTS_ENABLED`. With the flag off, no events are emitted, no ALS state is allocated, and the outbound warm-start request is unchanged; every call site was audited to confirm the off path is byte-identical to current behavior.

Dequeue-path phase timings are recorded under `phase.<name>.duration_ms`: `restore`, `warm_start`, `workload_create`. A `path_taken` extra distinguishes `restore` / `warm_start` / `cold_create` / `skipped_no_image`.

Refs TRI-9480.
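
As a rough illustration of the phase timings and flat-keyed output described above, the sketch below times an async phase and flattens the result into one JSON line. `WideEventState`, `createState`, `timePhase`, and `toJsonLine` are hypothetical names, and the `extra.` key prefix is an assumption; only `trace_id`, `meta.run_id`, and `phase.<name>.duration_ms` come from the description. It returns the line instead of writing to stdout, and it does not show the AsyncLocalStorage context.

```typescript
type Scalar = string | number | boolean;

// Hypothetical state shape for one wide event.
interface WideEventState {
  traceId: string | undefined;
  meta: Record<string, string>;
  extra: Record<string, Scalar>;
  phases: Record<string, { durationMs: number }>;
}

// Returns undefined when the flag is off, so nothing is allocated on the off path.
function createState(enabled: boolean, traceId?: string): WideEventState | undefined {
  if (!enabled) return undefined;
  return { traceId, meta: {}, extra: {}, phases: {} };
}

// Times an async phase and records its duration even if the phase throws.
async function timePhase<T>(
  state: WideEventState | undefined,
  name: string,
  fn: () => Promise<T>
): Promise<T> {
  if (!state) return fn(); // flag off: run the phase untouched
  const start = Date.now();
  try {
    return await fn();
  } finally {
    state.phases[name] = { durationMs: Date.now() - start };
  }
}

// Flattens the state into a single JSON line with dotted keys.
function toJsonLine(state: WideEventState): string {
  const flat: Record<string, Scalar> = {};
  if (state.traceId) flat["trace_id"] = state.traceId;
  for (const [key, value] of Object.entries(state.meta)) {
    flat[`meta.${key}`] = value;
  }
  for (const [key, value] of Object.entries(state.extra)) {
    flat[`extra.${key}`] = value; // prefix is an assumption
  }
  for (const [name, phase] of Object.entries(state.phases)) {
    flat[`phase.${name}.duration_ms`] = phase.durationMs;
  }
  return JSON.stringify(flat);
}
```

A dequeue iteration in this sketch would create the state once, wrap each of `restore`, `warm_start`, and `workload_create` in `timePhase`, set `path_taken` in `extra`, and emit the line at the end. Recording the duration in `finally` means a failed phase still shows up in the event.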