Path Z: the CI gate
The whatif CLI is a wedge. The destination is something larger:
An engineer sees a failing whatif check in a PR and thinks, “I’m not merging until this is green.”
That’s the moment a tool crosses from “useful” to “infrastructure.” Path Z is the trajectory toward it.
Today (the wedge)
whatif fork is interactive. You run it from your laptop when you want to test a fix. The output is a Markdown verdict report you read yourself.
v0.2 (CI-ready)
Same CLI, with affordances that make CI integration trivial:
whatif.config.yaml, so the command-line argument soup lives in version control.
Deterministic outputs (sorted JSON keys, stable hashes, seeded baseline sampling).
A 50-line GitHub Action wrapper that just invokes the CLI.
The CLI doesn’t change. Wiring it into PR checks becomes a 30-minute task.
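The three determinism techniques named above (sorted JSON keys, stable hashes, seeded baseline sampling) can be sketched as follows. This is a minimal illustration using only the Python standard library, not whatif's actual implementation; the function names `canonical_json`, `stable_hash`, and `sample_baseline` are hypothetical.

```python
import hashlib
import json
import random


def canonical_json(payload: dict) -> str:
    """Serialize with sorted keys and fixed separators so equal data yields equal text."""
    return json.dumps(payload, sort_keys=True, separators=(",", ":"), ensure_ascii=True)


def stable_hash(payload: dict) -> str:
    """Hash the canonical form, so the key order of the input dict does not matter."""
    return hashlib.sha256(canonical_json(payload).encode("utf-8")).hexdigest()


def sample_baseline(trace_ids: list[str], k: int, seed: int) -> list[str]:
    """Pick k baseline traces reproducibly.

    Sorting first removes any dependence on input order; a private
    random.Random(seed) avoids touching the global RNG state.
    """
    rng = random.Random(seed)
    return sorted(rng.sample(sorted(trace_ids), k))
```

With these in place, two CI runs over the same traces and the same seed should produce byte-identical JSON, which keeps report diffs and cached artifacts meaningful.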
v1.0 (the gate)
Path Z fully materialized:
whatif runs on every PR that touches a prompt, model, or tool definition.
It pulls real production failures from the last N days plus a representative baseline.
It replays the proposed change against both cohorts.
It blocks the merge if the verdict fails the configured policy.
The verdict report is posted as a PR comment, with evidence visible inline.
The reviewer’s experience:
✗ whatif / experiment-runner - Verdict: Don't Ship
Failures: 14/20 improved, but baseline regressed 6/20.
Median baseline Δ: -0.18 (CI: [-0.24, -0.12])
See verdict report → /artifacts/whatif-report.md
Top regression: trace t_492af ("agent now refuses requests it
previously handled correctly")
That output, on a PR, is the moment.
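To make "the verdict fails the configured policy" concrete, here is a hedged sketch of how the counts in the sample output could map to a verdict. The policy fields and their thresholds are assumptions made for illustration; the source does not define what a policy contains, and the `CohortResult`, `Policy`, and `verdict` names are hypothetical.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class CohortResult:
    improved: int
    regressed: int
    total: int


@dataclass(frozen=True)
class Policy:
    # Hypothetical thresholds, chosen only for this example.
    min_failure_improved_rate: float = 0.5
    max_baseline_regressed_rate: float = 0.1


def verdict(failures: CohortResult, baseline: CohortResult, policy: Policy) -> str:
    """Return "Ship" only if failures improve enough and the baseline holds."""
    if failures.total <= 0 or baseline.total <= 0:
        raise ValueError("both cohorts must contain at least one trace")
    # A baseline regression blocks the change even when many failures improve.
    if baseline.regressed / baseline.total > policy.max_baseline_regressed_rate:
        return "Don't Ship"
    if failures.improved / failures.total < policy.min_failure_improved_rate:
        return "Don't Ship"
    return "Ship"


# Counts taken from the sample check output: 14/20 failures improved,
# 6/20 baseline traces regressed.
sample = verdict(
    CohortResult(improved=14, regressed=0, total=20),
    CohortResult(improved=0, regressed=6, total=20),
    Policy(),
)
```

Under these assumed thresholds, the sample counts produce "Don't Ship": the failure cohort clears its bar, but a 30% baseline regression rate exceeds the 10% limit.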
Why this matters
Today’s PR review for an LLM project doesn’t include a check that says:
This prompt change won’t regress your last 50 production failures, or your last 50 baseline successes.
Tomorrow’s should. whatif is positioned to be that check.
What we don’t build in v0.1
We don’t build the CI integration. We build the CLI so that CI integration is a thin wrapper around something that already works:
Machine-readable output (JSON).
Exit codes that reflect the configured decision policy (0 / 1 / 2).
Configuration in a file, not arguments.
Deterministic-ish outputs.
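A sketch of how machine-readable output and policy-driven exit codes might fit together. The source lists the codes 0 / 1 / 2 without saying what each one means, so the mapping below (0 = ship, 1 = don't ship, 2 = inconclusive or error) is an assumption, as are the JSON field names and the `emit` helper.

```python
import json
import sys
from enum import IntEnum


class ExitCode(IntEnum):
    # Assumed meanings; the source only names the values 0 / 1 / 2.
    SHIP = 0
    DONT_SHIP = 1
    INCONCLUSIVE = 2


def exit_code_for(verdict: str) -> ExitCode:
    """Map a verdict string to a process exit code; unknown verdicts are inconclusive."""
    mapping = {"Ship": ExitCode.SHIP, "Don't Ship": ExitCode.DONT_SHIP}
    return mapping.get(verdict, ExitCode.INCONCLUSIVE)


def emit(verdict: str, report_path: str, stream=sys.stdout) -> int:
    """Write one JSON object describing the verdict and return the exit code."""
    payload = {
        "verdict": verdict,
        "report": report_path,
        "exit_code": int(exit_code_for(verdict)),
    }
    json.dump(payload, stream, sort_keys=True)
    stream.write("\n")
    return payload["exit_code"]


# Typical use at the end of a CLI entry point:
#   sys.exit(emit("Don't Ship", "/artifacts/whatif-report.md"))
```

Because a CI runner treats any non-zero exit as a failed step, a wrapper around a CLI shaped like this needs no verdict-parsing logic of its own.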
Then in v0.2, the GitHub Action wrapper proves the architecture supports it. v1.0 is the destination.
Why the trajectory matters now
Naming Path Z changes the architecture even though we don’t build it yet. Every v0.1 design decision is tested against:
Will this still work when the CLI is invoked by a GitHub Action on a PR?
That filter is what produced the JSON output, the exit-code policy, the deterministic-output discipline, and the configuration-in-a-file convention. None of them are needed for interactive use. All are needed for the gate.