Core Principle
Every flag with user impact should have two tests:
- Flag OFF: legacy behavior still works.
- Flag ON: new behavior works and old behavior is gone.
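The OFF/ON pairing can be sketched as a small table of cases, each stating what must be present and what must be absent. This is a minimal illustration, not a prescribed implementation: `checkoutEntryPoints`, the `newCheckout` flag name, and the step names are all hypothetical stand-ins for the real feature under test.

```typescript
type Flags = { newCheckout: boolean };

// Hypothetical function under test: returns the checkout steps a user sees.
function checkoutEntryPoints(flags: Flags): string[] {
  return flags.newCheckout
    ? ["one-page-checkout"]
    : ["cart-review", "shipping", "payment"];
}

interface FlagCase {
  name: string;
  flags: Flags;
  mustInclude: string[]; // behavior that has to work in this flag state
  mustExclude: string[]; // behavior that has to be gone in this flag state
}

// One case per flag state, mirroring the two bullets above.
const flagCases: FlagCase[] = [
  {
    name: "flag OFF keeps legacy checkout",
    flags: { newCheckout: false },
    mustInclude: ["cart-review", "shipping", "payment"],
    mustExclude: ["one-page-checkout"],
  },
  {
    name: "flag ON shows new checkout only",
    flags: { newCheckout: true },
    mustInclude: ["one-page-checkout"],
    mustExclude: ["cart-review", "shipping", "payment"],
  },
];

// Returns a list of human-readable failures; an empty list means the case passed.
function checkCase(c: FlagCase): string[] {
  const steps = checkoutEntryPoints(c.flags);
  const failures: string[] = [];
  c.mustInclude.forEach((s) => {
    if (steps.indexOf(s) === -1) failures.push(c.name + ": missing " + s);
  });
  c.mustExclude.forEach((s) => {
    if (steps.indexOf(s) !== -1) failures.push(c.name + ": unexpected " + s);
  });
  return failures;
}
```

The `mustExclude` list is what separates "new behavior works" from "old behavior is gone"; a test that only checks presence would pass even if both flows were rendered.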
Inject Flags Deterministically
Prefer server-side or API-level flag overrides in test setup.
tests/flags/helpers.ts
tests/flags/new-checkout.spec.ts
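As a rough idea of what a helper in tests/flags/helpers.ts might contain, the sketch below builds a deterministic override payload that test setup can send to the backend. Everything here is an assumption for illustration: the `x-flag-overrides` header name, the `name=on|off` encoding, and the function names are hypothetical, and the backend would need to honor such a header in test environments.

```typescript
type FlagOverrides = Record<string, boolean>;

// Hypothetical header name; use whatever your flag service actually accepts.
const OVERRIDE_HEADER = "x-flag-overrides";

// Serialize overrides in sorted key order so the same input always
// produces the same string, regardless of object insertion order.
function encodeOverrides(overrides: FlagOverrides): string {
  return Object.keys(overrides)
    .sort()
    .map((name) => name + "=" + (overrides[name] ? "on" : "off"))
    .join(",");
}

// Build the headers object that test setup attaches to every request.
function overrideHeaders(overrides: FlagOverrides): Record<string, string> {
  const headers: Record<string, string> = {};
  headers[OVERRIDE_HEADER] = encodeOverrides(overrides);
  return headers;
}

// Resolve the effective flag state: explicit overrides win over defaults.
function resolveFlags(defaults: FlagOverrides, overrides: FlagOverrides): FlagOverrides {
  return { ...defaults, ...overrides };
}
```

In a Playwright spec such as tests/flags/new-checkout.spec.ts, the resulting object could be passed through the `extraHTTPHeaders` option in `test.use(...)`, so the flag state is fixed before the first navigation rather than toggled through the UI.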
Rollout-Specific Cases
- User targeted by rule gets feature; non-targeted user does not.
- Percent rollout boundary (e.g. 10%) is deterministic for known users.
- Kill switch immediately reverts behavior.
- Cached flag values refresh after re-login.
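The targeting, percent-boundary, and kill-switch cases above are easiest to test when the rollout decision is a pure function of the flag name and user ID. The following is a sketch under that assumption; the `RolloutConfig` shape, the FNV-1a bucketing, and the evaluation order (kill switch, then targeting, then percentage) are illustrative choices, not the behavior of any particular flag provider.

```typescript
interface RolloutConfig {
  flag: string;
  percent: number;          // 0..100
  targetedUsers: string[];  // users who always get the feature
  killSwitch: boolean;      // when true, nobody gets the feature
}

// 32-bit FNV-1a over UTF-16 code units; stable across runs and machines.
function fnv1a(input: string): number {
  let hash = 0x811c9dc5;
  for (let i = 0; i < input.length; i++) {
    hash ^= input.charCodeAt(i);
    hash = Math.imul(hash, 0x01000193);
  }
  return hash >>> 0;
}

// Map a (flag, user) pair to a bucket in 0..99. Including the flag name
// keeps one user from landing in the same bucket for every flag.
function bucketFor(flag: string, userId: string): number {
  return fnv1a(flag + ":" + userId) % 100;
}

function isEnabled(config: RolloutConfig, userId: string): boolean {
  if (config.killSwitch) return false;                          // kill switch wins
  if (config.targetedUsers.indexOf(userId) !== -1) return true; // rule targeting
  return bucketFor(config.flag, userId) < config.percent;       // percent rollout
}
```

With a function like this, a test can compute a known user's bucket once and then assert on both sides of the boundary: enabled at `percent = bucket + 1`, disabled at `percent = bucket`.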
Cleanup and Isolation
- Reset flag overrides in afterEach.
- Avoid sharing a mutable rollout environment across parallel workers.
- Keep one project dedicated to flag tests when global toggles are involved.
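One way to make the reset and isolation rules above mechanical is to route every override through a scope object that remembers what it set and namespaces its keys per worker. This is a minimal in-memory sketch; the `FlagOverrideScope` class and the plain-object store are hypothetical, and a real setup would call the flag service's own override and reset endpoints instead.

```typescript
class FlagOverrideScope {
  private store: Record<string, boolean>;
  private namespace: string;
  private appliedKeys: string[] = [];

  constructor(store: Record<string, boolean>, namespace: string) {
    this.store = store;
    this.namespace = namespace;
  }

  // Prefix keys with the namespace so parallel workers never collide.
  private key(flag: string): string {
    return this.namespace + ":" + flag;
  }

  set(flag: string, value: boolean): void {
    const k = this.key(flag);
    if (this.appliedKeys.indexOf(k) === -1) this.appliedKeys.push(k);
    this.store[k] = value;
  }

  get(flag: string, fallback: boolean): boolean {
    const k = this.key(flag);
    return Object.prototype.hasOwnProperty.call(this.store, k) ? this.store[k] : fallback;
  }

  // Remove only the overrides this scope created, leaving other scopes intact.
  reset(): void {
    this.appliedKeys.forEach((k) => {
      delete this.store[k];
    });
    this.appliedKeys = [];
  }
}
```

In a Playwright suite, `reset()` would typically be called from `test.afterEach`, and the namespace could be derived from `testInfo.parallelIndex` so each worker gets its own slice of the override store.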
Stably Features to Use During Rollouts
- Use Environments to map rollout targets (e.g. Staging, Canary, Production) and run with --env.
- Run flag tests in Stably Cloud before and after each rollout milestone.
- Add Scheduled Test Runs around rollout windows for automated verification.
- Configure Alerts & Notifications for immediate rollback signals.
- For scheduled rollout suites, enable Autofix where appropriate to reduce maintenance on test-only selector drift.