# AI Agent Execute

> Automate complex, multi-page browser workflows with autonomous AI agents in Playwright tests.

<Snippet file="ai-rules/install-sdk-rules.mdx" />

AI Agent Execute enables autonomous browser automation within your Playwright tests. Unlike atomic actions (`click`, `fill`, `select`), the agent handles complex, multi-step workflows across pages—allowing you to describe a high-level task and let the agent execute it end-to-end.

<Note>
  **Why use agents?** Agents excel at workflows where the exact path varies: searching, filtering, multi-page forms, checkout flows, and application processes. They complement Playwright's deterministic approach by providing flexibility when UI paths diverge or evolve.
</Note>

## Installation

The agent API is available through the Stably Playwright test integration:

```ts theme={null}
import { test, expect } from '@stablyai/playwright-test';
```

## Basic Usage

The agent is provided as a test fixture (or you can create one via `context.newAgent()` / `browser.newAgent()`). Use the `act()` method to execute high-level instructions:

```ts theme={null}
import { test, expect } from '@stablyai/playwright-test';

test('apply for a remote position', async ({ agent, page }) => {
  // Navigate to the starting page
  await page.goto('https://company.com/careers');

  // Execute a multi-step task that might span multiple tabs
  const { success } = await agent.act(
    'Find remote senior software engineer roles, filter by full-time, and submit an application',
    { page, maxCycles: 25 }
  );

  // act() resolves to { success: boolean }; assert on it rather than assuming the task succeeded
  expect(success).toBe(true);
});
```

## How It Works

`agent.act()` invokes an autonomous AI agent that runs a loop:

1. **Observes** the current page state using visual context
2. **Plans** the next action based on your instruction and page content
3. **Executes** browser actions (click, type, navigate, etc.)
4. **Repeats** until the task completes or the `maxCycles` limit is reached

Throughout the loop, the agent can navigate between different pages and domains to complete the task.

The agent uses computer-use-capable models to understand web interfaces visually and semantically, similar to how humans browse.
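The observe, plan, execute, repeat cycle can be pictured with a small simulation. This is a conceptual sketch only, not Stably's implementation: `runAgentLoop`, `Environment`, `Planner`, and `fakeEnvironment` are hypothetical names, the "page" is a simple counter, and returning `success: false` when the cycle limit is hit is an assumption about how the limit surfaces.

```ts
// Conceptual sketch of an observe -> plan -> execute loop (hypothetical, not Stably's code).
type Observation = { stepsRemaining: number };
type Action = { kind: 'step' } | { kind: 'done' };

interface Environment {
  observe(): Promise<Observation>;
  execute(action: Action): Promise<void>;
}

// Hypothetical planner: picks the next action from the prompt and the latest observation.
type Planner = (prompt: string, observation: Observation) => Promise<Action>;

async function runAgentLoop(
  prompt: string,
  env: Environment,
  plan: Planner,
  maxCycles = 30, // same default as documented for act()
): Promise<{ success: boolean; cycles: number }> {
  for (let cycle = 1; cycle <= maxCycles; cycle++) {
    const observation = await env.observe(); // 1. observe
    const action = await plan(prompt, observation); // 2. plan
    if (action.kind === 'done') {
      return { success: true, cycles: cycle };
    }
    await env.execute(action); // 3. execute, then 4. repeat
  }
  // Cycle budget exhausted before the planner reported completion.
  return { success: false, cycles: maxCycles };
}

// Simulated environment: a task that needs `steps` actions before it is finished.
function fakeEnvironment(steps: number): Environment {
  let remaining = steps;
  return {
    observe: async () => ({ stepsRemaining: remaining }),
    execute: async () => {
      remaining -= 1;
    },
  };
}

// Simulated planner: keep stepping until nothing remains.
const simplePlanner: Planner = async (_prompt, observation) =>
  observation.stepsRemaining > 0 ? { kind: 'step' } : { kind: 'done' };
```

In this model a task that needs three actions finishes on the fourth cycle, because the last cycle is spent observing that nothing is left to do. That is one reason to leave some headroom when choosing `maxCycles`.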

## Method Signature

```ts theme={null}
agent.act(
  prompt: string,
  options: {
    page: Page;
    maxCycles?: number;
    model?: string;
  }
): Promise<{ success: boolean }>
```

### Parameters

* `prompt` (string, required) - The high-level task to accomplish
* `options` (object, required) - Agent configuration

**Options:**

* `page` (Page, required) - The starting page for the agent. The agent may create additional pages as needed
* `maxCycles` (number, optional) - Maximum number of thinking cycles the agent can perform. Each cycle includes observing the page, planning, and executing actions. Default: 30
* `model` (string, optional) - AI model to use for agent reasoning. Supported models: `anthropic/claude-sonnet-4-5-20250929`, `google/gemini-2.5-computer-use-preview-10-2025`. Default: `anthropic/claude-sonnet-4-5-20250929`
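
To make the defaults concrete, the sketch below mirrors the documented option rules in plain TypeScript. `resolveActOptions` and `SUPPORTED_MODELS` are hypothetical names used for illustration; the SDK applies its own defaults internally, and whether it rejects unsupported values the way this helper does is an assumption.

```ts
// Hypothetical helper mirroring the documented defaults; not part of the Stably SDK.
const SUPPORTED_MODELS = [
  'anthropic/claude-sonnet-4-5-20250929',
  'google/gemini-2.5-computer-use-preview-10-2025',
] as const;

type SupportedModel = (typeof SUPPORTED_MODELS)[number];

interface ResolvedActOptions {
  maxCycles: number;
  model: SupportedModel;
}

// Type guard: narrows an arbitrary string to one of the documented model IDs.
function isSupportedModel(model: string): model is SupportedModel {
  return SUPPORTED_MODELS.some((supported) => supported === model);
}

function resolveActOptions(
  options: { maxCycles?: number; model?: string } = {},
): ResolvedActOptions {
  const maxCycles = options.maxCycles ?? 30; // documented default
  const model: string = options.model ?? SUPPORTED_MODELS[0]; // documented default model

  if (!Number.isInteger(maxCycles) || maxCycles < 1) {
    throw new RangeError(`maxCycles must be a positive integer, got ${maxCycles}`);
  }
  if (!isSupportedModel(model)) {
    throw new Error(`Unsupported model: ${model}`);
  }
  return { maxCycles, model };
}
```

Calling `resolveActOptions()` with no arguments yields `maxCycles: 30` and the Claude Sonnet model, matching the defaults listed above.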

## Common Use Cases

### Multi-Step Forms

Agents handle complex, multi-page forms without explicit step-by-step instructions:

```ts theme={null}
await agent.act(
  'Complete the onboarding form: name is "Alex Chen", company is "Acme Corp", team size 50-100',
  { page, maxCycles: 15 }
);
```

### Search and Filter Workflows

Let the agent navigate dynamic search results and apply filters:

```ts theme={null}
await agent.act(
  'Find studio headphones under $500, sort by highest rating, and add the top result to cart',
  { page, maxCycles: 20 }
);
```

### Cross-Page Navigation

Agents can traverse multiple pages to complete tasks:

```ts theme={null}
await page.goto('https://docs.example.com');

await agent.act(
  'Navigate to the API reference section and find the authentication endpoint documentation',
  { page, maxCycles: 10 }
);
```

### Conditional Workflows

Handle branching paths without explicit conditionals:

```ts theme={null}
await agent.act(
  'Subscribe to the newsletter if there is a popup, otherwise proceed to checkout',
  { page, maxCycles: 12 }
);
```

### Set Up / Tear Down

Use agents to automate test setup and cleanup workflows:

```ts theme={null}
import { test, expect } from '@stablyai/playwright-test';

test.describe('E-commerce checkout flow', () => {
  test.beforeEach(async ({ agent, page }) => {
    // Navigate to the starting page
    await page.goto('https://shop.example.com');
    
    // Set up test data: add products to cart
    await agent.act(
      'Search for "wireless mouse", add the first result to cart, then search for "USB cable" and add it to cart',
      { page, maxCycles: 20 }
    );
  });

  test.afterEach(async ({ agent, page }) => {
    // Clean up: clear cart and sign out
    await agent.act(
      'Navigate to cart, remove all items, then sign out',
      { page, maxCycles: 15 }
    );
  });

  test('apply discount code', async ({ agent, page }) => {
    // Test assumes cart is already populated from beforeEach
    await agent.act(
      'Go to checkout, enter discount code "SAVE20", and verify discount is applied',
      { page, maxCycles: 10 }
    );
    
    await expect(page.locator('.discount-applied')).toBeVisible();
  });

  test('update shipping address', async ({ agent, page }) => {
    await agent.act(
      'Go to checkout, change shipping address to 123 Main St, New York, NY 10001',
      { page, maxCycles: 15 }
    );
    
    await expect(page.locator('.shipping-address')).toContainText('123 Main St');
  });
});
```

This approach is especially useful when:

* Manual setup involves multiple steps across different pages
* Setup data varies between test runs (dynamic content, user accounts)
* You want to reduce test code duplication

## Configuration

Control agent behavior using `maxCycles`:

```ts theme={null}
// Set maxCycles to prevent runaway execution
await agent.act(
  'Search for JavaScript frameworks and bookmark the top 3',
  { page, maxCycles: 15 } // Agent stops after 15 cycles if task isn't complete
);
```

For complex tasks requiring many steps, increase `maxCycles`:

```ts theme={null}
await agent.act(
  'Find and compare pricing plans across 3 SaaS products, then fill out a trial signup for the cheapest',
  { page, maxCycles: 40 } // Higher limit for complex workflows
);
```

Guidance that has no dedicated option, such as how to handle missing values or optional fields, goes in the prompt itself:

```ts theme={null}
await agent.act(
  `Complete the medical insurance form with provided information. 
  Use conservative estimates when exact values are unavailable.
  Skip optional fields unless explicitly mentioned.`,
  { page }
);
```
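
When prompts carry both test data and behavioral rules, assembling them from structured input can keep them consistent across tests. The following is a minimal sketch; `buildPrompt` and `PromptSpec` are hypothetical names, not SDK features, and the "Use these values:" layout is just one possible convention.

```ts
// Hypothetical prompt builder; not part of the Stably SDK.
interface PromptSpec {
  task: string; // the high-level instruction
  values?: Record<string, string>; // field values the agent should enter
  rules?: string[]; // extra behavioral guidance, one sentence each
}

function buildPrompt({ task, values = {}, rules = [] }: PromptSpec): string {
  const lines: string[] = [task.trim()];

  const fields = Object.keys(values);
  if (fields.length > 0) {
    lines.push('Use these values:');
    for (const field of fields) {
      // JSON.stringify quotes the value so its boundaries are unambiguous
      lines.push(`- ${field}: ${JSON.stringify(values[field])}`);
    }
  }

  for (const rule of rules) {
    lines.push(rule.trim());
  }
  return lines.join('\n');
}

// Example: the resulting string would be passed as the first argument to agent.act().
const onboardingPrompt = buildPrompt({
  task: 'Complete the onboarding form',
  values: { name: 'Alex Chen', company: 'Acme Corp' },
  rules: ['Skip optional fields unless explicitly mentioned.'],
});
```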

## Multi-Page Capabilities

Unlike single-page actions, agents can navigate across pages and domains to complete tasks:

```ts theme={null}
test('research and compare products across vendors', async ({ agent, page }) => {
  await page.goto('https://search.example.com');

  await agent.act(
    `Search for "wireless keyboards", 
    visit the product pages of the top 3 results,
    and perform the checkout flow for the highest-rated product`,
    { page, maxCycles: 30 }
  );
});
```

The agent tracks context across page navigations and maintains task progress throughout the journey.

## Creating Agents from Context or Browser

While the `agent` fixture is the most convenient way to use agents in tests, you can also create agents manually from a `BrowserContext` or `Browser`:

```ts theme={null}
// From context
test('manual agent from context', async ({ context, page }) => {
  const agent = context.newAgent();
  await agent.act('Complete the form', { page });
});

// From browser
test('manual agent from browser', async ({ browser, page }) => {
  const agent = browser.newAgent();
  await agent.act('Complete the form', { page });
});
```

<Note>
  Most tests should use the `agent` fixture for simplicity. Manual agent creation is useful when you need to operate on multiple Playwright `BrowserContext`s.
</Note>

## Best Practices

### Start on the Right Page

Navigate to a relevant starting point before executing tasks:

```ts theme={null}
// ✅ Good: Start on the relevant page
await page.goto('https://github.com/browserbase/stagehand');
await agent.act(
  'Get the latest merged PR on this repo',
  { page, maxCycles: 10 }
);
```

```ts theme={null}
// ❌ Avoid: Starting from an unrelated page
await page.goto('https://github.com');
await agent.act(
  'Get the latest merged PR on browserbase/stagehand',
  { page, maxCycles: 10 }
); // Agent wastes cycles navigating
```

### Be Specific with Instructions

Provide detailed, unambiguous instructions for better results:

```ts theme={null}
// ✅ Specific instruction
await agent.act(
  'Find Italian restaurants in Brooklyn open after 10pm with outdoor seating, sorted by rating',
  { page, maxCycles: 20 }
);
```

```ts theme={null}
// ❌ Vague instruction
await agent.act(
  'Find restaurants',
  { page, maxCycles: 20 }
); // Too ambiguous
```

### Break Down Very Complex Tasks

For extremely complex workflows, break the task into sequential agent executions:

```ts theme={null}
// First task: search and filter
await agent.act(
  'Search for MacBook Pro and filter by 16-inch models under $2500',
  { page, maxCycles: 15 }
);

// Second task: checkout
await agent.act(
  'Add the top result to cart and proceed to checkout',
  { page, maxCycles: 15 }
);
```
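
When a workflow is split this way, it can help to stop as soon as one step fails, so later steps do not run against the wrong page state. The sketch below assumes `act()` resolves to `{ success: boolean }` as shown in the method signature. `runSteps`, `AgentLike`, and `Step` are hypothetical names, and the agent is typed structurally so the helper does not depend on SDK types.

```ts
// Hypothetical helper: run agent tasks in order and stop at the first failure.
interface AgentLike<TPage> {
  act(
    prompt: string,
    options: { page: TPage; maxCycles?: number },
  ): Promise<{ success: boolean }>;
}

interface Step {
  prompt: string;
  maxCycles?: number; // falls back to the agent's own default when omitted
}

async function runSteps<TPage>(
  agent: AgentLike<TPage>,
  page: TPage,
  steps: Step[],
): Promise<{ success: boolean; completed: number }> {
  let completed = 0;
  for (const step of steps) {
    const { success } = await agent.act(step.prompt, {
      page,
      maxCycles: step.maxCycles,
    });
    if (!success) {
      // Report how many steps finished so the test can say where it stopped.
      return { success: false, completed };
    }
    completed += 1;
  }
  return { success: true, completed };
}

// Possible usage inside a test, assuming the `agent` fixture matches AgentLike:
//   const result = await runSteps(agent, page, [
//     { prompt: 'Search for MacBook Pro and filter by 16-inch models under $2500', maxCycles: 15 },
//     { prompt: 'Add the top result to cart and proceed to checkout', maxCycles: 15 },
//   ]);
//   expect(result.success).toBe(true);
```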

### Combine with Assertions

Use Playwright assertions to verify agent outcomes:

```ts theme={null}
await agent.act(
  'Subscribe to the Pro plan',
  { page, maxCycles: 20 }
);

// Verify the outcome
await expect(page.locator('.success-message')).toBeVisible();
await expect(page).toHaveURL(/checkout\/success/);
```

## Troubleshooting

### Agent Stops Before Completing Task

**Problem**: Agent stops before finishing the requested task

**Solutions**:

* Increase `maxCycles` for complex workflows (default is 30)
* Break very complex tasks into smaller sequential executions
* Ensure the starting page is relevant to reduce wasted navigation cycles

```ts theme={null}
// Increase maxCycles for complex tasks
await agent.act(
  'Complete the 5-page registration form with all required fields',
  { page, maxCycles: 40 } // Increased limit
);

// Or break into smaller tasks
await agent.act(
  'Complete pages 1-2 of the registration form',
  { page, maxCycles: 15 }
);

await agent.act(
  'Complete pages 3-5 of the registration form',
  { page, maxCycles: 20 }
);
```

### Agent Performs Unexpected Actions

**Problem**: Agent clicks wrong elements or takes unintended paths

**Solutions**:

* Make instructions more specific about what to click or avoid
* Provide domain-specific context in the prompt
* Start from a more focused page to reduce ambiguity
* Switch to another model

```ts theme={null}
// Provide clearer context in the prompt
await agent.act(
  `Apply to the first full-time senior software engineer role listed.
  You are searching for professional software engineering jobs.
  Ignore internships, contract roles, and non-engineering positions.`,
  { page, maxCycles: 20 }
);
```

### Task Times Out or Takes Too Long

**Problem**: Agent execution is slower than expected

**Solutions**:

* Ensure the starting page is close to the target workflow
* Reduce the scope of the task or break it into smaller pieces
* Check network conditions—slow page loads affect agent performance
* Switch to a faster model (e.g. `google/gemini-2.5-computer-use-preview-10-2025`)

### Task Execution Issues

**Problem**: Task fails or doesn't complete as expected

**Solutions**:

* Check if the UI changed mid-execution (dynamic content, popups)
* Increase `maxCycles` if the agent ran out of cycles
* Review the error message for specific failure points

```ts theme={null}
try {
  await agent.act(
    'Complete checkout',
    { page, maxCycles: 15 }
  );
} catch (error) {
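  // `error` is typed `unknown` in strict TypeScript, so narrow it before reading `.message`
  const message = error instanceof Error ? error.message : String(error);
  console.error('Agent execution failed:', message);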
  // Example: "Could not find payment method selector after 12 cycles"
}
```
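
If failures are usually caused by an exhausted cycle budget, one option is to retry with a larger `maxCycles`. The sketch below is an assumption-laden illustration: `actWithEscalation`, `AgentLike`, and `ActOutcome` are hypothetical names, and the helper treats both a thrown error and `success: false` as a failed attempt because either may occur. Retrying is only appropriate for tasks that are safe to repeat, since each retry starts from whatever page state the previous attempt left behind.

```ts
// Hypothetical retry helper; not part of the Stably SDK.
interface AgentLike<TPage> {
  act(
    prompt: string,
    options: { page: TPage; maxCycles?: number },
  ): Promise<{ success: boolean }>;
}

interface ActOutcome {
  success: boolean;
  attempts: number;
  lastError?: unknown; // set only when the final attempt threw
}

async function actWithEscalation<TPage>(
  agent: AgentLike<TPage>,
  prompt: string,
  page: TPage,
  cycleBudgets: number[] = [15, 30], // one attempt per budget, in order
): Promise<ActOutcome> {
  let lastError: unknown;
  let attempts = 0;

  for (const maxCycles of cycleBudgets) {
    attempts += 1;
    try {
      const { success } = await agent.act(prompt, { page, maxCycles });
      if (success) {
        return { success: true, attempts };
      }
      lastError = undefined; // attempt finished without throwing, but did not succeed
    } catch (error) {
      lastError = error; // remember the failure and try the next budget
    }
  }
  return { success: false, attempts, lastError };
}
```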

## When to Use Agents

* **Use `agent.act()`** when:
  * The workflow spans multiple pages or domains
  * The exact path varies based on dynamic content
  * You want to describe intent rather than prescribe steps
  * The UI is exploratory (search, browse, filter, compare)
* **Use standard Playwright actions** when:
  * The flow is deterministic and well-defined
  * Performance is critical (agents are slower than direct actions)
  * You need precise control over each step
  * The page structure is stable and predictable

Agents complement Playwright's deterministic approach—use them where flexibility and autonomy provide value.

## References

* Stably Playwright SDK: `@stablyai/playwright-test`
* Related: [AI Assertions](/stably-sdk/ai-assertions)
* Inspiration: [Stagehand Agent Documentation](https://docs.stagehand.dev/basics/agent)