Testing Feature Flags and Gradual Rollouts

Feature-flagged codepaths often ship without enough test coverage. The result is regressions when a flag flips globally. This guide shows a clean pattern to test both flag states.

Core Principle

Every flag with user impact should have two tests:

Flag OFF: legacy behavior still works.
Flag ON: new behavior works and old behavior is gone.

Inject Flags Deterministically

Prefer server-side or API-level flag overrides in test setup.

tests/flags/helpers.ts

import { APIRequestContext } from '@playwright/test';

export async function setFlag(request: APIRequestContext, key: string, enabled: boolean) {
  // Example test helper endpoint. Replace with your own flag service override API.
  await request.post('/test-support/flags', { data: { key, enabled } });
}

tests/flags/new-checkout.spec.ts

import { test, expect } from '@playwright/test';
import { setFlag } from './helpers';

test('legacy checkout when flag is OFF', async ({ page, request }) => {
  await setFlag(request, 'new_checkout', false);
  await page.goto('/checkout');
  await expect(page.getByText('Classic checkout')).toBeVisible();
});

test('new checkout when flag is ON', async ({ page, request }) => {
  await setFlag(request, 'new_checkout', true);
  await page.goto('/checkout');
  await expect(page.getByText('One-page checkout')).toBeVisible();
});

Rollout-Specific Cases

User targeted by rule gets feature; non-targeted user does not.
Percent rollout boundary (e.g. 10%) is deterministic for known users.
Kill switch immediately reverts behavior.
Cached flag values refresh after re-login.

Cleanup and Isolation

Reset flag overrides in afterEach.
Avoid sharing a mutable rollout environment across parallel tests.
Keep one project dedicated to flag tests when global toggles are involved.

Stably Features to Use During Rollouts

Use Environments to map rollout targets (e.g. Staging, Canary, Production) and run with --env.
Run flag tests in Stably Cloud before and after each rollout milestone.
Add Scheduled Test Runs around rollout windows for automated verification.
Configure Alerts & Notifications for immediate rollback signals.
For scheduled rollout suites, enable Autofix where appropriate to reduce maintenance on test-only selector drift.

​Core Principle

​Inject Flags Deterministically

​Rollout-Specific Cases

​Cleanup and Isolation

​Stably Features to Use During Rollouts

Core Principle

Inject Flags Deterministically

Rollout-Specific Cases

Cleanup and Isolation

Stably Features to Use During Rollouts