Write Stably SDK tests faster and more reliably by leveraging AI coding assistants like Cursor or Claude Code with Playwright MCP (Model Context Protocol). These tools can generate complete, production-ready test suites that take full advantage of Stably’s AI capabilities.

Why Use AI Assistants for Test Creation?

  • Faster Development: Generate complete test suites in minutes instead of hours
  • Best Practices Built-in: The AI automatically applies Stably SDK patterns like .describe() and AI assertions
  • Context-Aware: Playwright MCP gives the AI direct access to your browser state and page structure
  • Reduced Errors: AI-written tests leverage auto-heal from day one

Prerequisites

1. Install Stably SDK

Follow the SDK Setup Guide to install and configure the Stably SDK in your project.
2. Choose Your AI Assistant

Install one of these AI coding assistants:
  • Cursor — AI-first code editor with deep IDE integration
  • Claude Code — Standalone AI assistant with MCP support
3. Configure Playwright MCP

Playwright MCP allows the AI to interact with browsers directly, inspect page state, and generate accurate selectors.
For Cursor, add the following to ~/.cursor/mcp.json (or via Settings → MCP); Claude Code accepts the same mcpServers JSON in its MCP configuration:
```json
{
  "mcpServers": {
    "playwright": {
      "command": "npx",
      "args": ["-y", "@playwright/mcp@latest"]
    }
  }
}
```
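
You can confirm the MCP server package resolves on your machine by running it once with npx:

```bash
npx -y @playwright/mcp@latest --help
```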

Set Up AI Rules for Stably SDK

AI coding assistants work best when they understand the specific patterns and capabilities of your testing framework. Configure your assistant with Stably SDK rules:
1. Copy the Stably SDK AI Rules

# Stably SDK — AI Rules (Playwright‑compatible)

**Assumption:** Full Playwright parity unless noted below.

## Import

```ts
import { test, expect } from "@stablyai/playwright-test";
```

## Install & Setup

```bash
npm install @playwright/test @stablyai/playwright-test
export STABLY_API_KEY=YOUR_KEY
```

```ts
import { setApiKey } from "@stablyai/playwright-test";
setApiKey("YOUR_KEY");
```

## When to Use Stably SDK vs Playwright

**Prioritization:**
1. **Test accuracy and stability are the most important factors** - prioritize reliability over cost/speed.
2. **Otherwise, use Playwright whenever possible** since it's cheaper and faster.
3. **For interactions:** If the interaction is hard to express in Playwright, or would be too brittle that way (e.g., the scroll amount changes every time), use `agent.act()`. **Any canvas-related operations, or any drag/click operations that require coordinates, must use `agent.act()`** (more semantic meaning, less flakiness).
4. **For assertions:** Use Playwright if it fulfills the purpose. But if the assertion is very visual-heavy, use Stably's `toMatchScreenshotPrompt`.
5. **Use Stably SDK methods if it helps your tests pass** - when Playwright methods are insufficient or unreliable.

## AI Assertions (intent‑based visuals)

```ts
await expect(page).toMatchScreenshotPrompt(
  "Shows revenue trend chart and spotlight card",
  { timeout: 30_000 }
);
await expect(page.locator(".header"))
  .toMatchScreenshotPrompt("Nav with avatar and bell icon");
```

**Signature:** `expect(page|locator).toMatchScreenshotPrompt(prompt: string, options?: ScreenshotOptions)`

* Use for **dynamic** UIs; keep prompts specific; scope to specific elements with locators when possible.
* **Consider whether you need `fullPage: true`**: Ask yourself if the assertion requires content beyond the visible viewport (e.g., long scrollable lists, full page layout checks). If only viewport content matters, omit `fullPage: true` — it's faster and cheaper. Use it only when you genuinely need to capture content outside the browser window's visible area. 
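
For example (illustrative prompts):

```ts
// Viewport-only capture (default): faster and cheaper
await expect(page).toMatchScreenshotPrompt("Pricing table shows 3 tiers");

// Full-page capture: only when content below the fold matters
await expect(page).toMatchScreenshotPrompt(
  "Footer shows social links and newsletter signup",
  { fullPage: true }
);
```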

## AI Extraction (visual → data)

```ts
const txt = await page.extract("List revenue, active users, and churn rate");
```

Typed with Zod:

```ts
import { z } from "zod";
const Metrics = z.object({ revenue: z.string(), activeUsers: z.number(), churnRate: z.number() });
const m = await page.extract("Return revenue (currency), active users, churn %", { schema: Metrics });
```

**Signatures:**

* `page.extract(prompt: string): Promise<string>`
* `page.extract<T extends z.AnyZodObject>(prompt, { schema: T }): Promise<z.output<T>>`

## AI Agent (autonomous workflows)

Use the `agent` fixture to execute complex, human-like workflows:

```ts
test("complex workflow", async ({ agent, page }) => {
  await page.goto("/orders");
  await agent.act("Find the first pending order and mark it as shipped", { page });
});

// Or create manually
const agent = context.newAgent();
await agent.act("Your task here", { page, maxCycles: 10 }); // split into smaller steps if possible
```

**Signature:** `agent.act(prompt: string, options: { page: Page, maxCycles?: number, model?: string }): Promise<{ success: boolean }>`

* Default maxCycles: 30
* Supported models: `anthropic/claude-sonnet-4-5-20250929` (default), `google/gemini-2.5-computer-use-preview-10-2025`
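
For example, to pin a non-default model:

```ts
await agent.act("Drag the node onto the canvas", {
  page,
  model: "google/gemini-2.5-computer-use-preview-10-2025",
});
```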

### Passing Variables to Prompts

You can use template literals to pass variables into your prompts:

```ts
const duration = 24 * 7 * 60;
await agent.act(`Enter the duration of ${duration} seconds`, { page });

const username = "john.doe@example.com";
await agent.act(`Login with username ${username}`, { page });
```

### Self-Contained Prompts

All prompts to Stably SDK AI methods (agent.act, toMatchScreenshotPrompt, extract) must be self-contained with all necessary information:

1. **No implicit references to outside context** - prompts cannot reference previous actions or state that the AI method doesn't have access to:
   - ❌ Bad: `agent.act("Verify the field you just filled in the form is 4", { page })`
   - ✅ Good: `agent.act("Verify the 'timeout' field in the form has value 4", { page })`
   - ❌ Bad: `agent.act("Pick something that's not in the previous step", { page })`
   - ✅ Good: `const selectedItem = "Option A"; await agent.act(\`Pick an option other than ${selectedItem}\`, { page })`

2. **Pass information between AI methods using explicit variables:**
   ```ts
   // Extract data, then use it in next action
   const orderId = await page.extract("Get the order ID from the first row");
   await agent.act(`Cancel order with ID ${orderId}`, { page });
   ```

3. **Include detailed instructions and domain knowledge** to help the AI perform the task successfully:
   - ❌ Bad: `agent.act("Fill in the form", { page })`
   - ✅ Good: `agent.act("Fill in the form with test data. On page 4 you might run into a popup asking for premium features - just click 'Skip' or 'Cancel' to ignore it", { page })`

### Optimizing Agent Performance

**IMPORTANT:** The fewer actions/cycles agent.act() needs to do, the better it performs. Offload work to Playwright code when possible:

1. If your prompt has work that could be done by Playwright code, use Playwright for that work, and only use agent.act() for actions that are hard for Playwright (canvas operations, dynamic decision making, etc.)
2. If your prompt has repetition (e.g., do it 5 times), calculations (e.g., type 24*7*60 seconds), or other code-suitable tasks, use code for those, and only have agent.act() perform the agent-suitable part.
3. If your prompt has an if/else condition that can be expressed in code, use code for the condition, and only have agent.act() perform the agent-suitable part.

**Examples:**
- ❌ Bad: `"Click the button 5 times"` 
- ✅ Good: `"Click the button"` (and include this in a loop that runs 5 times)
- ❌ Bad: `"enter the duration of 24*7*60 seconds"` 
- ✅ Good: Calculate in code (`const sum = 24*7*60`), then use `\`enter the duration of ${sum} seconds\``
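
A minimal sketch of both patterns:

```ts
// Repetition lives in code; the agent performs one semantic action per call
for (let i = 0; i < 5; i++) {
  await agent.act("Click the 'Add item' button", { page });
}

// Calculations live in code; pass the result into the prompt
const duration = 24 * 7 * 60;
await agent.act(`Enter the duration of ${duration} seconds`, { page });
```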

## CI Reporter / Cloud

```bash
npm install @stablyai/playwright-test
```

```ts
// playwright.config.ts
import { defineConfig } from "@playwright/test";
import { stablyReporter } from "@stablyai/playwright-test";

export default defineConfig({
  reporter: [
    ["list"],
    stablyReporter({ apiKey: process.env.STABLY_API_KEY, projectId: "YOUR_PROJECT_ID" }),
  ],
});
```

## Commands

```bash
npx playwright test   # preferred to enable full auto‑heal path
# All Playwright CLI flags still work (headed, ui, project, file filters…)

# When running tests for debugging/getting stacktraces:
npx playwright test --reporter=list  # disable HTML reporter, shows terminal output directly
```

## Best Practices

* **CRITICAL: All locators must use the `.describe()` method** for readability in trace views and test reports. Example: `page.getByRole('button', { name: 'Submit' }).describe('Submit button')` or `page.locator('table tbody tr').first().describe('First table row')`
* Scope visual checks with locators; keep prompts specific with labels/units.
* Use `toHaveScreenshot` for stable pixel‑perfect UIs; `toMatchScreenshotPrompt` for dynamic UIs.
* **Be deliberate with `fullPage: true`**: Default to viewport-only screenshots. Only use `fullPage: true` when your assertion genuinely requires content beyond the visible viewport (e.g., verifying footer content on a long page, checking full scrollable lists). Viewport captures are faster and more cost-effective. 
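
For instance (selectors are illustrative):

```ts
// Stable, pixel-perfect UI: standard Playwright screenshot comparison
await expect(page.locator(".logo").describe("Site logo"))
  .toHaveScreenshot("logo.png");

// Dynamic UI: intent-based AI assertion, scoped to the element
await expect(page.locator(".activity-feed").describe("Activity feed"))
  .toMatchScreenshotPrompt("Feed lists recent events with timestamps");
```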

## Troubleshooting

* **Slow assertions** → scope visuals; reduce viewport.
* **Agent stops early** → increase `maxCycles` or break task into smaller steps.

## Minimal Template

```ts
import { test, expect } from "@stablyai/playwright-test";

test("AI‑enhanced dashboard", async ({ page, agent }) => {
  await page.goto("/dashboard");
  
  // Use agent for complex workflows
  await agent.act("Navigate to settings and enable notifications", { page });
  
  // Use AI assertions for dynamic content
  await expect(page).toMatchScreenshotPrompt(
    "Dashboard shows revenue chart (>= 6 months) and account spotlight card"
  );
});
```
2. Add Rules to Your AI Assistant

Add the AI rules through Cursor’s Rules feature (for Claude Code, place the same content in a CLAUDE.md file in your project root):
  1. Open Cursor Settings → Rules (or press Cmd/Ctrl + Shift + J)
  2. Create a new stably-sdk-rules.mdc rule file
  3. Paste the AI rules content
  4. Configure when to apply the rule:
    • Set appropriate file globs (e.g., **/*.spec.ts, **/tests/**/*.ts)
    • Choose whether to always apply or apply based on file patterns
Cursor rules support project-specific and global configurations. You can create multiple rule files and control their scope using globs and the alwaysApply setting.
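A minimal stably-sdk-rules.mdc header might look like this (the frontmatter fields follow Cursor's rule format; adjust the globs to your project layout):

```
---
description: Stably SDK patterns for Playwright tests
globs: **/*.spec.ts, **/tests/**/*.ts
alwaysApply: false
---
(paste the Stably SDK AI rules content here)
```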
After adding rules to your AI assistant, restart it or start a new conversation to ensure the rules are loaded.
3. Verify Configuration

Test that your AI assistant understands Stably SDK by asking:
“Generate a test that uses Stably SDK’s AI assertion to verify a dashboard page”
The AI should generate code using toMatchScreenshotPrompt() instead of basic Playwright assertions.
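If the rules are loaded, the generated code should resemble this sketch (the URL and prompt are placeholders):

```ts
import { test, expect } from "@stablyai/playwright-test";

test("dashboard renders key widgets", async ({ page }) => {
  await page.goto("/dashboard");

  // AI assertion instead of a pixel-perfect or text-only check
  await expect(page).toMatchScreenshotPrompt(
    "Dashboard shows a revenue chart and a recent-activity panel"
  );
});
```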

Creating Tests with AI

Basic Workflow

1. Navigate to the Page

Use Playwright MCP to open the page you want to test. Ask the AI:
“Open a browser and navigate to https://app.example.com/dashboard”
The AI will use Playwright MCP to launch a browser and navigate to the page.
2. Explore the Page

Ask the AI to inspect elements and understand the page structure:
“What elements are visible on this page? Show me the main navigation and action buttons”
The AI uses MCP to capture page snapshots and identify interactive elements.
3. Generate Test Code

Describe the user flow you want to test:
“Generate a Stably SDK test that:
  1. Logs into the app with test credentials
  2. Clicks the ‘Create Project’ button
  3. Fills in the project form
  4. Uses an AI assertion to verify the success message appears”
The AI generates production-ready code using Stably SDK patterns.
4. Refine and Iterate

Review the generated test and refine as needed:
“Add error handling for network timeouts and use .describe() on the submit button locator”
The AI updates the test with improvements.

Example: Generated Test

Here’s an example of a test generated by an AI assistant with Stably SDK rules:
```ts
import { test, expect } from "@stablyai/playwright-test";

test("create project flow with AI validation", async ({ page }) => {
  // Navigate to the app
  await page.goto("https://app.example.com");
  
  // Login with test credentials
  await page.getByLabel("Email").fill("test@example.com");
  await page.getByLabel("Password").fill("TestPassword123");
  await page.getByRole("button", { name: "Sign In" })
    .describe("Login submit button")
    .click();
  
  // Wait for dashboard to load
  await expect(page).toHaveURL(/.*dashboard/);
  
  // Click create project button
  await page.getByRole("button", { name: "Create Project" })
    .describe("Main CTA to create new project")
    .click();
  
  // Fill project form
  await page.getByLabel("Project Name").fill("E2E Test Project");
  await page.getByLabel("Description").fill("Automated test project");
  await page.getByRole("combobox", { name: "Project Type" })
    .describe("Project type dropdown")
    .selectOption("Web Application");
  
  // Submit form
  await page.getByRole("button", { name: "Create" })
    .describe("Project creation submit button")
    .click();
  
  // Use AI assertion to verify success
  await expect(page).toMatchScreenshotPrompt(
    "Success message showing 'Project created successfully' with green checkmark icon",
    { timeout: 30_000 }
  );
  
  // Verify project appears in list
  await page.getByRole("link", { name: "Projects" })
    .describe("Navigation link to projects list")
    .click();
  
  await expect(page.getByRole("heading", { name: "E2E Test Project" }))
    .toBeVisible();
});
```
The AI automatically:
  • Uses .describe() on critical locators for auto-heal
  • Applies toMatchScreenshotPrompt() for dynamic UI validation
  • Includes proper waits and navigation checks
  • Follows Playwright best practices

Advanced Use Cases

Multi-Step User Flows

Generate complex, multi-page flows by describing the complete journey:
Generate a Stably SDK test for the complete checkout flow:
1. Browse product catalog and add 3 items to cart
2. Proceed to checkout
3. Fill shipping information form
4. Select payment method
5. Use AI assertion to verify order summary shows correct total
6. Complete purchase
7. Use AI extraction to get the order number
8. Verify confirmation page with order number
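A condensed sketch of what the AI might generate for steps 5-8 (selectors, labels, and page layout are assumptions):

```ts
// Verify the order summary visually before completing the purchase
await expect(page.locator(".order-summary").describe("Checkout order summary"))
  .toMatchScreenshotPrompt("Order summary lists 3 items and a correct grand total");

await page.getByRole("button", { name: "Place Order" })
  .describe("Complete purchase button")
  .click();

// Extract the order number, then reuse it to verify the confirmation page
const orderNumber = await page.extract("Return the order number shown on the confirmation page");
await expect(page.getByText(orderNumber).describe("Order number on confirmation page"))
  .toBeVisible();
```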

Data-Driven Tests

Generate tests that use extracted data for validation:
Create a test that:
1. Navigates to the analytics dashboard
2. Uses AI extraction to get revenue, active users, and churn rate from the page
3. Validates that revenue is greater than $10,000
4. Validates that churn rate is below 5%
5. Screenshots the trends chart if validation passes
Example generated code:
```ts
import { test, expect } from "@stablyai/playwright-test";
import { z } from "zod";

const MetricsSchema = z.object({
  revenue: z.number(),
  activeUsers: z.number(),
  churnRate: z.number()
});

test("validate analytics dashboard metrics", async ({ page }) => {
  await page.goto("/analytics/dashboard");
  
  // Extract metrics using AI
  const metrics = await page.extract(
    "Return revenue (as number), active users (as number), and churn rate (as percentage number)",
    { schema: MetricsSchema }
  );
  
  // Validate metrics
  expect(metrics.revenue).toBeGreaterThan(10000);
  expect(metrics.churnRate).toBeLessThan(5);
  
  console.log(`Validation passed:`, {
    revenue: `$${metrics.revenue.toLocaleString()}`,
    activeUsers: metrics.activeUsers.toLocaleString(),
    churnRate: `${metrics.churnRate}%`
  });
  
  // Verify the trends chart renders correctly
  await expect(page.locator(".trends-chart"))
    .toMatchScreenshotPrompt("Revenue trend chart showing last 6 months of data");
});
```

Visual Regression Testing

Generate comprehensive visual tests with AI assertions:
Create a visual regression test suite for the marketing landing page that:
1. Checks hero section with CTA button and value proposition
2. Validates features section shows all 6 feature cards
3. Verifies pricing table with 3 tiers
4. Checks footer has social links and newsletter signup
Use AI assertions for all checks to handle dynamic content.
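
Scoped AI assertions for two of these sections might look like this (class names are assumptions):

```ts
await expect(page.locator(".hero").describe("Hero section"))
  .toMatchScreenshotPrompt("Hero with value proposition headline and a visible CTA button");

await expect(page.locator(".pricing").describe("Pricing table"))
  .toMatchScreenshotPrompt("Pricing table with 3 tiers, each showing a price and feature list");
```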

Best Practices

Write detailed prompts

The more detailed your description, the better the generated test. Include:
  • Exact labels and button text
  • Expected outcomes and error states
  • Data formats and validation rules
  • Whether to use AI assertions vs. standard assertions
Let MCP discover selectors

Instead of manually finding selectors, ask the AI:
  • “What’s the best selector for the submit button on this form?”
  • “Show me all clickable elements in the header”
  • “Find the selector for the error message container”
MCP allows the AI to inspect the live page and suggest robust selectors.
Choose the right assertion type

Guide the AI on when to use different assertion types:
  • Standard assertions (toBeVisible(), toHaveText()) for stable, predictable elements
  • AI assertions (toMatchScreenshotPrompt()) for dynamic content, personalized UIs, or complex layouts
  • AI extraction (page.extract()) when you need to validate computed values or extract data for later use
Iterate on generated tests

AI-generated tests are a starting point. Refine them by asking:
  • “Add error handling for network failures”
  • “Make the login reusable as a fixture” (see the sketch below)
  • “Add .describe() to locators that might break”
  • “Include comments explaining the test logic”
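The login fixture, for example, is standard Playwright test.extend() usage; a minimal sketch, assuming credentials come from environment variables:

```ts
import type { Page } from "@playwright/test";
import { test as base, expect } from "@stablyai/playwright-test";

// Hypothetical fixture: signs in before each test that requests `loggedInPage`
export const test = base.extend<{ loggedInPage: Page }>({
  loggedInPage: async ({ page }, use) => {
    await page.goto("/login");
    await page.getByLabel("Email").fill(process.env.TEST_EMAIL ?? "");
    await page.getByLabel("Password").fill(process.env.TEST_PASSWORD ?? "");
    await page.getByRole("button", { name: "Sign In" })
      .describe("Login submit button")
      .click();
    await expect(page).toHaveURL(/.*dashboard/);
    await use(page);
  },
});
```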
Review before committing

Always review generated tests:
  • Run the test to verify it works
  • Check that locators are resilient (using getByRole, getByLabel, etc.; see the example below)
  • Ensure proper wait conditions
  • Verify timeout values are reasonable
  • Check that API keys and secrets are not hardcoded
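When reviewing locators, prefer semantic queries over structural CSS:

```ts
// Brittle: depends on DOM structure and class names
await page.locator("div.form > button.btn-primary").click();

// Resilient: semantic role plus a .describe() label for auto-heal
await page.getByRole("button", { name: "Submit" })
  .describe("Form submit button")
  .click();
```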

Troubleshooting

AI doesn’t use Stably SDK patterns

Solution: Ensure AI rules are properly configured. Try:
  1. Verify the rules file exists (stably-sdk-rules.mdc under .cursor/rules/ for Cursor, or CLAUDE.md in the project root for Claude Code)
  2. Restart your AI assistant to reload configuration
  3. Explicitly mention in your prompt: “Use Stably SDK patterns with .describe() and toMatchScreenshotPrompt()”
Playwright MCP isn’t connecting

Solution:
  1. Verify MCP configuration is correct in settings JSON
  2. Restart your AI assistant
  3. Check that @playwright/mcp@latest can be installed:
    npx -y @playwright/mcp@latest --help
    
  4. Look for MCP connection errors in your assistant’s console/logs
Generated tests use brittle selectors

Solution: Ask the AI to use more semantic selectors:
  • “Add .describe() to all action locators”
  • “Use getByRole instead of CSS selectors”
  • “Prefer getByLabel for form inputs”
  • “Add test-id attributes to critical elements and use getByTestId”
Generated tests run slowly

Solution: Optimize generated tests:
  • “Reduce timeout values where possible”
  • “Remove unnecessary waits”
  • “Use viewport screenshots instead of fullPage: true”
  • “Combine multiple toMatchScreenshotPrompt() calls into a single assertion where logical”
AI assertions are overused

Solution: Guide the AI on assertion selection:
  • “Use toMatchScreenshotPrompt() only for dynamic content that can’t be validated with standard assertions”
  • “Prefer toHaveText() and toBeVisible() for stable, predictable elements”
  • “Reserve AI assertions for visually complex validations”

Example Prompts

For Complete Test Suites

“Generate a complete test suite for our e-commerce site covering: product search, add to cart, checkout, and order confirmation. Use Stably SDK with AI assertions for the product grid and checkout summary. Include data extraction for order total validation.”

For Form Testing

“Create a Stably SDK test for the user registration form. Validate all fields, test error messages, and use an AI assertion to verify the success modal appears after submission. Add .describe() to all form input locators.”

For Visual Testing

“Generate visual regression tests for our component library. Test button variants, card layouts, and navigation menus. Use toMatchScreenshotPrompt() for each component and scope with locators.”

For API + UI Testing

“Write a test that creates a project via API, then verifies it appears correctly in the UI. Extract the project ID from the API response and use it to navigate to the project detail page. Use AI assertion to verify the project details render correctly.”
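A sketch of this pattern using Playwright’s request fixture (the endpoint and response shape are assumptions):

```ts
import { test, expect } from "@stablyai/playwright-test";

test("project created via API appears in the UI", async ({ page, request }) => {
  // Hypothetical API endpoint; adjust to your backend
  const res = await request.post("/api/projects", {
    data: { name: "API-created project" },
  });
  const { id } = await res.json();

  // Verify the project renders correctly in the UI
  await page.goto(`/projects/${id}`);
  await expect(page).toMatchScreenshotPrompt(
    "Project detail page showing the project name and metadata"
  );
});
```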
