Write Stably SDK tests faster and more reliably by leveraging AI coding assistants like Cursor or Claude Code with Playwright MCP (Model Context Protocol). These tools can generate complete, production-ready test suites that take full advantage of Stably’s AI capabilities.

Why Use AI Assistants for Test Creation?

  • Faster Development: Generate complete test suites in minutes instead of hours
  • Best Practices Built-in: The AI automatically applies Stably SDK patterns like .describe() and AI assertions
  • Context-Aware: Playwright MCP gives the AI direct access to your browser state and page structure
  • Reduced Errors: AI-written tests leverage auto-heal from day one

Prerequisites

1. Install Stably SDK

Follow the SDK Setup Guide to install and configure the Stably SDK in your project.
2. Choose Your AI Assistant

Install one of these AI coding assistants:
  • Cursor — AI-first code editor with deep IDE integration
  • Claude Code — Standalone AI assistant with MCP support
3. Configure Playwright MCP

Playwright MCP allows the AI to interact with browsers directly, inspect page state, and generate accurate selectors.
For Cursor, add the following to ~/.cursor/mcp.json (or via Settings → MCP); Claude Code accepts the same mcpServers JSON in its MCP configuration:
```json
{
  "mcpServers": {
    "playwright": {
      "command": "npx",
      "args": ["-y", "@playwright/mcp@latest"]
    }
  }
}
```
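
You can confirm the MCP server package resolves on your machine by running it once with npx:

```bash
npx -y @playwright/mcp@latest --help
```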

Set Up AI Rules for Stably SDK

AI coding assistants work best when they understand the specific patterns and capabilities of your testing framework. Configure your assistant with Stably SDK rules:
1. Copy the Stably SDK AI Rules

# Stably SDK — AI Rules (Playwright‑compatible)

**Assumption:** Full Playwright parity unless noted below.

## Import

```ts
import { test, expect } from "@stablyai/playwright-test";
```

## Install & Setup

```bash
npm install @playwright/test @stablyai/playwright-test
export STABLY_API_KEY=YOUR_KEY
```

```ts
import { setApiKey } from "@stablyai/playwright-test";
setApiKey("YOUR_KEY");
```

## When to Use Stably SDK vs Playwright

**Prioritization:**
1. **Test accuracy and stability are the most important factors** - prioritize reliability over cost/speed.
2. **Otherwise, use Playwright whenever possible** since it's cheaper and faster.
3. **For interactions:** If the interaction is hard to express in Playwright, or would be too brittle that way (e.g., the scroll amount changes every time), use `agent.act()`. **Any canvas-related operations, or any drag/click operations that require coordinates, must use `agent.act()`** (more semantic meaning, less flakiness).
4. **For assertions:** Use Playwright if it fulfills the purpose. But if the assertion is very visual-heavy, use Stably's `toMatchScreenshotPrompt`.
5. **Use Stably SDK methods if it helps your tests pass** - when Playwright methods are insufficient or unreliable.

## AI Assertions (intent‑based visuals)

```ts
await expect(page).toMatchScreenshotPrompt(
  "Shows revenue trend chart and spotlight card",
  { timeout: 30_000 }
);
await expect(page.locator(".header"))
  .toMatchScreenshotPrompt("Nav with avatar and bell icon");
```

**Signature:** `expect(page|locator).toMatchScreenshotPrompt(prompt: string, options?: ScreenshotOptions)`

* Use for **dynamic** UIs; keep prompts specific; scope to specific elements with locators when possible.
* **Consider whether you need `fullPage: true`**: Ask yourself if the assertion requires content beyond the visible viewport (e.g., long scrollable lists, full page layout checks). If only viewport content matters, omit `fullPage: true` — it's faster and cheaper. Use it only when you genuinely need to capture content outside the browser window's visible area. 
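
For example (illustrative prompts):

```ts
// Viewport-only capture (default): faster and cheaper
await expect(page).toMatchScreenshotPrompt("Pricing table shows 3 tiers");

// Full-page capture: only when content below the fold matters
await expect(page).toMatchScreenshotPrompt(
  "Footer shows social links and newsletter signup",
  { fullPage: true }
);
```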

## AI Extraction (visual → data)

```ts
const txt = await page.extract("List revenue, active users, and churn rate");
```

Typed with Zod:

```ts
import { z } from "zod";
const Metrics = z.object({ revenue: z.string(), activeUsers: z.number(), churnRate: z.number() });
const m = await page.extract("Return revenue (currency), active users, churn %", { schema: Metrics });
```

**Signatures:**

* `page.extract(prompt: string): Promise<string>`
* `page.extract<T extends z.AnyZodObject>(prompt, { schema: T }): Promise<z.output<T>>`

## AI Agent (autonomous workflows)

Use the `agent` fixture to execute complex, human-like workflows:

```ts
test("complex workflow", async ({ agent, page }) => {
  await page.goto("/orders");
  await agent.act("Find the first pending order and mark it as shipped", { page });
});

// Or create manually
const agent = context.newAgent();
await agent.act("Your task here", { page, maxCycles: 10 }); // split into smaller steps if possible
```

**Signature:** `agent.act(prompt: string, options: { page: Page, maxCycles?: number, model?: string }): Promise<{ success: boolean }>`

* Default maxCycles: 30
* Supported models: `anthropic/claude-sonnet-4-5-20250929` (default), `google/gemini-2.5-computer-use-preview-10-2025`
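
For example, to pin a non-default model:

```ts
await agent.act("Drag the node onto the canvas", {
  page,
  model: "google/gemini-2.5-computer-use-preview-10-2025",
});
```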

### Passing Variables to Prompts

You can use template literals to pass variables into your prompts:

```ts
const duration = 24 * 7 * 60;
await agent.act(`Enter the duration of ${duration} seconds`, { page });

const username = "john.doe@example.com";
await agent.act(`Login with username ${username}`, { page });
```

### Self-Contained Prompts

All prompts to Stably SDK AI methods (agent.act, toMatchScreenshotPrompt, extract) must be self-contained with all necessary information:

1. **No implicit references to outside context** - prompts cannot reference previous actions or state that the AI method doesn't have access to:
   - ❌ Bad: `agent.act("Verify the field you just filled in the form is 4", { page })`
   - ✅ Good: `agent.act("Verify the 'timeout' field in the form has value 4", { page })`
   - ❌ Bad: `agent.act("Pick something that's not in the previous step", { page })`
   - ✅ Good: `const selectedItem = "Option A"; await agent.act(\`Pick an option other than ${selectedItem}\`, { page })`

2. **Pass information between AI methods using explicit variables:**
   ```ts
   // Extract data, then use it in next action
   const orderId = await page.extract("Get the order ID from the first row");
   await agent.act(`Cancel order with ID ${orderId}`, { page });
   ```

3. **Include detailed instructions and domain knowledge** to help the AI perform the task successfully:
   - ❌ Bad: `agent.act("Fill in the form", { page })`
   - ✅ Good: `agent.act("Fill in the form with test data. On page 4 you might run into a popup asking for premium features - just click 'Skip' or 'Cancel' to ignore it", { page })`

### Optimizing Agent Performance

**IMPORTANT:** The fewer actions/cycles agent.act() needs to do, the better it performs. Offload work to Playwright code when possible:

1. If your prompt has work that could be done by Playwright code, use Playwright for that work, and only use agent.act() for actions that are hard for Playwright (canvas operations, dynamic decision making, etc.)
2. If your prompt has repetition (e.g., do it 5 times), calculations (e.g., type 24*7*60 seconds), or other code-suitable tasks, use code for those, and only have agent.act() perform the agent-suitable part.
3. If your prompt has an if/else condition that can be expressed in code, use code for the condition, and only have agent.act() perform the agent-suitable part.

**Examples:**
- ❌ Bad: `"Click the button 5 times"` 
- ✅ Good: `"Click the button"` (and include this in a loop that runs 5 times)
- ❌ Bad: `"enter the duration of 24*7*60 seconds"` 
- ✅ Good: Calculate in code (`const sum = 24*7*60`), then use `\`enter the duration of ${sum} seconds\``
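
A minimal sketch of both patterns:

```ts
// Repetition lives in code; the agent performs one semantic action per call
for (let i = 0; i < 5; i++) {
  await agent.act("Click the 'Add item' button", { page });
}

// Calculations live in code; pass the result into the prompt
const duration = 24 * 7 * 60;
await agent.act(`Enter the duration of ${duration} seconds`, { page });
```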

## CI Reporter / Cloud

```bash
npm install @stablyai/playwright-test
```

```ts
// playwright.config.ts
import { defineConfig } from "@playwright/test";
import { stablyReporter } from "@stablyai/playwright-test";

export default defineConfig({
  reporter: [
    ["list"],
    stablyReporter({ apiKey: process.env.STABLY_API_KEY, projectId: "YOUR_PROJECT_ID" }),
  ],
});
```

## Commands

```bash
npx playwright test   # preferred to enable full auto‑heal path
# All Playwright CLI flags still work (headed, ui, project, file filters…)

# When running tests for debugging/getting stacktraces:
npx playwright test --reporter=list  # disable HTML reporter, shows terminal output directly
```

## Best Practices

* **CRITICAL: All locators must use the `.describe()` method** for readability in trace views and test reports. Example: `page.getByRole('button', { name: 'Submit' }).describe('Submit button')` or `page.locator('table tbody tr').first().describe('First table row')`
* Scope visual checks with locators; keep prompts specific with labels/units.
* Use `toHaveScreenshot` for stable pixel‑perfect UIs; `toMatchScreenshotPrompt` for dynamic UIs.
* **Be deliberate with `fullPage: true`**: Default to viewport-only screenshots. Only use `fullPage: true` when your assertion genuinely requires content beyond the visible viewport (e.g., verifying footer content on a long page, checking full scrollable lists). Viewport captures are faster and more cost-effective. 
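
For instance (selectors are illustrative):

```ts
// Stable, pixel-perfect UI: standard Playwright screenshot comparison
await expect(page.locator(".logo").describe("Site logo"))
  .toHaveScreenshot("logo.png");

// Dynamic UI: intent-based AI assertion, scoped to the element
await expect(page.locator(".activity-feed").describe("Activity feed"))
  .toMatchScreenshotPrompt("Feed lists recent events with timestamps");
```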

## Troubleshooting

* **Slow assertions** → scope visuals; reduce viewport.
* **Agent stops early** → increase `maxCycles` or break task into smaller steps.

## Minimal Template

```ts
import { test, expect } from "@stablyai/playwright-test";

test("AI‑enhanced dashboard", async ({ page, agent }) => {
  await page.goto("/dashboard");
  
  // Use agent for complex workflows
  await agent.act("Navigate to settings and enable notifications", { page });
  
  // Use AI assertions for dynamic content
  await expect(page).toMatchScreenshotPrompt(
    "Dashboard shows revenue chart (>= 6 months) and account spotlight card"
  );
});
```
2. Add Rules to Your AI Assistant

Add the AI rules through Cursor’s Rules feature (for Claude Code, place the same content in a CLAUDE.md file in your project root):
  1. Open Cursor Settings → Rules (or press Cmd/Ctrl + Shift + J)
  2. Create a new stably-sdk-rules.mdc rule file
  3. Paste the AI rules content
  4. Configure when to apply the rule:
    • Set appropriate file globs (e.g., **/*.spec.ts, **/tests/**/*.ts)
    • Choose whether to always apply or apply based on file patterns
Cursor rules support project-specific and global configurations. You can create multiple rule files and control their scope using globs and the alwaysApply setting.
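A minimal stably-sdk-rules.mdc header might look like this (the frontmatter fields follow Cursor's rule format; adjust the globs to your project layout):

```
---
description: Stably SDK patterns for Playwright tests
globs: **/*.spec.ts, **/tests/**/*.ts
alwaysApply: false
---
(paste the Stably SDK AI rules content here)
```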
After adding rules to your AI assistant, restart it or start a new conversation to ensure the rules are loaded.
3. Verify Configuration

Test that your AI assistant understands Stably SDK by asking:
“Generate a test that uses Stably SDK’s AI assertion to verify a dashboard page”
The AI should generate code using toMatchScreenshotPrompt() instead of basic Playwright assertions.
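If the rules are loaded, the generated code should resemble this sketch (the URL and prompt are placeholders):

```ts
import { test, expect } from "@stablyai/playwright-test";

test("dashboard renders key widgets", async ({ page }) => {
  await page.goto("/dashboard");

  // AI assertion instead of a pixel-perfect or text-only check
  await expect(page).toMatchScreenshotPrompt(
    "Dashboard shows a revenue chart and a recent-activity panel"
  );
});
```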

Creating Tests with AI

Basic Workflow

1. Navigate to the Page

Use Playwright MCP to open the page you want to test. Ask the AI:
“Open a browser and navigate to https://app.example.com/dashboard”
The AI will use Playwright MCP to launch a browser and navigate to the page.
2. Explore the Page

Ask the AI to inspect elements and understand the page structure:
“What elements are visible on this page? Show me the main navigation and action buttons”
The AI uses MCP to capture page snapshots and identify interactive elements.
3. Generate Test Code

Describe the user flow you want to test:
“Generate a Stably SDK test that:
  1. Logs into the app with test credentials
  2. Clicks the ‘Create Project’ button
  3. Fills in the project form
  4. Uses an AI assertion to verify the success message appears”
The AI generates production-ready code using Stably SDK patterns.
4. Refine and Iterate

Review the generated test and refine as needed:
“Add error handling for network timeouts and use .describe() on the submit button locator”
The AI updates the test with improvements.

Example: Generated Test

Here’s an example of a test generated by an AI assistant with Stably SDK rules:
```ts
import { test, expect } from "@stablyai/playwright-test";

test("create project flow with AI validation", async ({ page }) => {
  // Navigate to the app
  await page.goto("https://app.example.com");
  
  // Login with test credentials
  await page.getByLabel("Email").fill("test@example.com");
  await page.getByLabel("Password").fill("TestPassword123");
  await page.getByRole("button", { name: "Sign In" })
    .describe("Login submit button")
    .click();
  
  // Wait for dashboard to load
  await expect(page).toHaveURL(/.*dashboard/);
  
  // Click create project button
  await page.getByRole("button", { name: "Create Project" })
    .describe("Main CTA to create new project")
    .click();
  
  // Fill project form
  await page.getByLabel("Project Name").fill("E2E Test Project");
  await page.getByLabel("Description").fill("Automated test project");
  await page.getByRole("combobox", { name: "Project Type" })
    .describe("Project type dropdown")
    .selectOption("Web Application");
  
  // Submit form
  await page.getByRole("button", { name: "Create" })
    .describe("Project creation submit button")
    .click();
  
  // Use AI assertion to verify success
  await expect(page).toMatchScreenshotPrompt(
    "Success message showing 'Project created successfully' with green checkmark icon",
    { timeout: 30_000 }
  );
  
  // Verify project appears in list
  await page.getByRole("link", { name: "Projects" })
    .describe("Navigation link to projects list")
    .click();
  
  await expect(page.getByRole("heading", { name: "E2E Test Project" }))
    .toBeVisible();
});
```
The AI automatically:
  • Uses .describe() on critical locators for auto-heal
  • Applies toMatchScreenshotPrompt() for dynamic UI validation
  • Includes proper waits and navigation checks
  • Follows Playwright best practices

Advanced Use Cases

Multi-Step User Flows

Generate complex, multi-page flows by describing the complete journey:
Generate a Stably SDK test for the complete checkout flow:
1. Browse product catalog and add 3 items to cart
2. Proceed to checkout
3. Fill shipping information form
4. Select payment method
5. Use AI assertion to verify order summary shows correct total
6. Complete purchase
7. Use AI extraction to get the order number
8. Verify confirmation page with order number
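A condensed sketch of what the AI might generate for steps 5-8 (selectors, labels, and page layout are assumptions):

```ts
// Verify the order summary visually before completing the purchase
await expect(page.locator(".order-summary").describe("Checkout order summary"))
  .toMatchScreenshotPrompt("Order summary lists 3 items and a correct grand total");

await page.getByRole("button", { name: "Place Order" })
  .describe("Complete purchase button")
  .click();

// Extract the order number, then reuse it to verify the confirmation page
const orderNumber = await page.extract("Return the order number shown on the confirmation page");
await expect(page.getByText(orderNumber).describe("Order number on confirmation page"))
  .toBeVisible();
```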

Data-Driven Tests

Generate tests that use extracted data for validation:
Create a test that:
1. Navigates to the analytics dashboard
2. Uses AI extraction to get revenue, active users, and churn rate from the page
3. Validates that revenue is greater than $10,000
4. Validates that churn rate is below 5%
5. Screenshots the trends chart if validation passes
Example generated code:
```ts
import { test, expect } from "@stablyai/playwright-test";
import { z } from "zod";

const MetricsSchema = z.object({
  revenue: z.number(),
  activeUsers: z.number(),
  churnRate: z.number()
});

test("validate analytics dashboard metrics", async ({ page }) => {
  await page.goto("/analytics/dashboard");
  
  // Extract metrics using AI
  const metrics = await page.extract(
    "Return revenue (as number), active users (as number), and churn rate (as percentage number)",
    { schema: MetricsSchema }
  );
  
  // Validate metrics
  expect(metrics.revenue).toBeGreaterThan(10000);
  expect(metrics.churnRate).toBeLessThan(5);
  
  console.log(`Validation passed:`, {
    revenue: `$${metrics.revenue.toLocaleString()}`,
    activeUsers: metrics.activeUsers.toLocaleString(),
    churnRate: `${metrics.churnRate}%`
  });
  
  // Verify the trends chart renders correctly
  await expect(page.locator(".trends-chart"))
    .toMatchScreenshotPrompt("Revenue trend chart showing last 6 months of data");
});
```

Visual Regression Testing

Generate comprehensive visual tests with AI assertions:
Create a visual regression test suite for the marketing landing page that:
1. Checks hero section with CTA button and value proposition
2. Validates features section shows all 6 feature cards
3. Verifies pricing table with 3 tiers
4. Checks footer has social links and newsletter signup
Use AI assertions for all checks to handle dynamic content.
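
Scoped AI assertions for two of these sections might look like this (class names are assumptions):

```ts
await expect(page.locator(".hero").describe("Hero section"))
  .toMatchScreenshotPrompt("Hero with value proposition headline and a visible CTA button");

await expect(page.locator(".pricing").describe("Pricing table"))
  .toMatchScreenshotPrompt("Pricing table with 3 tiers, each showing a price and feature list");
```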

Best Practices

Write detailed prompts

The more detailed your description, the better the generated test. Include:
  • Exact labels and button text
  • Expected outcomes and error states
  • Data formats and validation rules
  • Whether to use AI assertions vs. standard assertions
Let MCP discover selectors

Instead of manually finding selectors, ask the AI:
  • “What’s the best selector for the submit button on this form?”
  • “Show me all clickable elements in the header”
  • “Find the selector for the error message container”
MCP allows the AI to inspect the live page and suggest robust selectors.
Choose the right assertion type

Guide the AI on when to use different assertion types:
  • Standard assertions (toBeVisible(), toHaveText()) for stable, predictable elements
  • AI assertions (toMatchScreenshotPrompt()) for dynamic content, personalized UIs, or complex layouts
  • AI extraction (page.extract()) when you need to validate computed values or extract data for later use
Iterate on generated tests

AI-generated tests are a starting point. Refine them by asking:
  • “Add error handling for network failures”
  • “Make the login reusable as a fixture” (see the sketch below)
  • “Add .describe() to locators that might break”
  • “Include comments explaining the test logic”
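The login fixture, for example, is standard Playwright test.extend() usage; a minimal sketch, assuming credentials come from environment variables:

```ts
import type { Page } from "@playwright/test";
import { test as base, expect } from "@stablyai/playwright-test";

// Hypothetical fixture: signs in before each test that requests `loggedInPage`
export const test = base.extend<{ loggedInPage: Page }>({
  loggedInPage: async ({ page }, use) => {
    await page.goto("/login");
    await page.getByLabel("Email").fill(process.env.TEST_EMAIL ?? "");
    await page.getByLabel("Password").fill(process.env.TEST_PASSWORD ?? "");
    await page.getByRole("button", { name: "Sign In" })
      .describe("Login submit button")
      .click();
    await expect(page).toHaveURL(/.*dashboard/);
    await use(page);
  },
});
```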
Review before committing

Always review generated tests:
  • Run the test to verify it works
  • Check that locators are resilient (using getByRole, getByLabel, etc.; see the example below)
  • Ensure proper wait conditions
  • Verify timeout values are reasonable
  • Check that API keys and secrets are not hardcoded
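When reviewing locators, prefer semantic queries over structural CSS:

```ts
// Brittle: depends on DOM structure and class names
await page.locator("div.form > button.btn-primary").click();

// Resilient: semantic role plus a .describe() label for auto-heal
await page.getByRole("button", { name: "Submit" })
  .describe("Form submit button")
  .click();
```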

Troubleshooting

AI doesn’t use Stably SDK patterns

Solution: Ensure AI rules are properly configured. Try:
  1. Verify the rules file exists (stably-sdk-rules.mdc under .cursor/rules/ for Cursor, or CLAUDE.md in the project root for Claude Code)
  2. Restart your AI assistant to reload configuration
  3. Explicitly mention in your prompt: “Use Stably SDK patterns with .describe() and toMatchScreenshotPrompt()”
Playwright MCP isn’t connecting

Solution:
  1. Verify MCP configuration is correct in settings JSON
  2. Restart your AI assistant
  3. Check that @playwright/mcp@latest can be installed:
    npx -y @playwright/mcp@latest --help
    
  4. Look for MCP connection errors in your assistant’s console/logs
Generated tests use brittle selectors

Solution: Ask the AI to use more semantic selectors:
  • “Add .describe() to all action locators”
  • “Use getByRole instead of CSS selectors”
  • “Prefer getByLabel for form inputs”
  • “Add test-id attributes to critical elements and use getByTestId”
Generated tests run slowly

Solution: Optimize generated tests:
  • “Reduce timeout values where possible”
  • “Remove unnecessary waits”
  • “Use viewport screenshots instead of fullPage: true”
  • “Combine multiple toMatchScreenshotPrompt() calls into a single assertion where logical”
AI assertions are overused

Solution: Guide the AI on assertion selection:
  • “Use toMatchScreenshotPrompt() only for dynamic content that can’t be validated with standard assertions”
  • “Prefer toHaveText() and toBeVisible() for stable, predictable elements”
  • “Reserve AI assertions for visually complex validations”

Example Prompts

For Complete Test Suites

“Generate a complete test suite for our e-commerce site covering: product search, add to cart, checkout, and order confirmation. Use Stably SDK with AI assertions for the product grid and checkout summary. Include data extraction for order total validation.”

For Form Testing

“Create a Stably SDK test for the user registration form. Validate all fields, test error messages, and use an AI assertion to verify the success modal appears after submission. Add .describe() to all form input locators.”

For Visual Testing

“Generate visual regression tests for our component library. Test button variants, card layouts, and navigation menus. Use toMatchScreenshotPrompt() for each component and scope with locators.”

For API + UI Testing

“Write a test that creates a project via API, then verifies it appears correctly in the UI. Extract the project ID from the API response and use it to navigate to the project detail page. Use AI assertion to verify the project details render correctly.”
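A sketch of this pattern using Playwright’s request fixture (the endpoint and response shape are assumptions):

```ts
import { test, expect } from "@stablyai/playwright-test";

test("project created via API appears in the UI", async ({ page, request }) => {
  // Hypothetical API endpoint; adjust to your backend
  const res = await request.post("/api/projects", {
    data: { name: "API-created project" },
  });
  const { id } = await res.json();

  // Verify the project renders correctly in the UI
  await page.goto(`/projects/${id}`);
  await expect(page).toMatchScreenshotPrompt(
    "Project detail page showing the project name and metadata"
  );
});
```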
