
# AI Extraction

> Visually interpret pages and return structured data using Stably's Playwright SDK.

<Snippet file="ai-rules/install-sdk-rules.mdx" />

Stably's visual extractors turn rendered UI into machine-readable outputs. Instead of scraping brittle selectors, describe the data you need and the SDK captures the screen, reasons about it, and returns plain text or a typed object.

<Note>
  **Why use extraction instead of DOM parsing?** Modern UIs stream content, render charts on canvas, and personalize layouts. `extract()` works on the final pixels, so it succeeds even when markup is noisy or inaccessible—ideal for dashboards, PDFs, and media-heavy experiences.
</Note>

## Installation

Install the Stably Playwright test integration, then import `test` and `expect` from it instead of from `@playwright/test`:

```ts theme={null}
import { test, expect } from '@stablyai/playwright-test';
```

The API surface stays the same as Playwright, but you gain AI helpers including `page.extract()` and `locator.extract()`.

## extract

Request natural-language or structured data from the currently visible page or a specific element. Extraction is fully visual—the SDK stabilizes the viewport, captures a screenshot, and routes it through Stably's multimodal model.

Both `page.extract()` and `locator.extract()` share the same signature—use the locator variant to scope extraction to a specific element.

### Basic Usage

```ts theme={null}
import { test } from '@stablyai/playwright-test';

test('summarize dashboard metrics', async ({ page }) => {
  await page.goto('/dashboard');

  const summary = await page.extract(
    'List the revenue, active users, and churn rate shown in the main metric cards.'
  );

  console.log(summary);
  // → "Revenue is $245,000 MTD, 18,432 active users, churn at 2.3%."
});
```

Called without a schema, `extract()` resolves to a `string`. Treat it like any text response: log it, snapshot it, or pass it into downstream logic.

You can also extract from a specific element using `locator.extract()`:

```ts theme={null}
const price = await page.locator('.product-card').first().extract(
  'What is the price shown?'
);
```
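
Because the locator variant scopes extraction to one element, the same prompt can be run against several elements in turn. The sketch below relies only on the string-returning `extract(prompt)` call described above; `ExtractTarget` and `extractFromEach` are hypothetical names introduced for illustration and are not part of the SDK.

```ts
// Hypothetical structural type: anything exposing the string-returning
// extract(prompt) call described in this section (a page or a locator).
interface ExtractTarget {
  extract(prompt: string): Promise<string>;
}

// Ask the same question of each target, one at a time, and collect the answers.
// Sequential calls keep the result order stable and avoid sending many AI
// requests at once.
async function extractFromEach(
  targets: readonly ExtractTarget[],
  prompt: string
): Promise<string[]> {
  const answers: string[] = [];
  for (const target of targets) {
    const answer = await target.extract(prompt);
    answers.push(answer.trim());
  }
  return answers;
}
```

In a test this might be called as `extractFromEach(await page.locator('.product-card').all(), 'What is the price shown?')`, assuming the locators returned by Playwright's `locator.all()` carry the `extract()` helper.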

### Structured Extraction with Zod

Provide a Zod schema to guarantee shape and types. The model fills the schema while the SDK validates the response locally before returning.

<Warning>
  Schema extraction requires **Zod v4**. Install it with `npm install zod@4` to enable structured extraction.
</Warning>

```ts theme={null}
import { z } from 'zod';
import { test, expect } from '@stablyai/playwright-test';

const MetricsSchema = z.object({
  revenue: z.string(),
  activeUsers: z.number(),
  churnRate: z.number(),
});

test('return typed metrics', async ({ page }) => {
  await page.goto('/dashboard');

  const metrics = await page.extract(
    'Return the revenue (string with currency), active user count, and churn rate percentage from the cards.',
    { schema: MetricsSchema }
  );

  expect(metrics.churnRate).toBeLessThan(5);
});
```

The return type is inferred from your schema, enabling full TypeScript support with autocompletion and compile-time checks.

### Method Signatures

```ts theme={null}
// Simple extraction (returns string)
await page.extract(prompt: string): Promise<string>;
await page.extract(prompt: string, options: { model?: AIModel }): Promise<string>;

// Structured extraction with schema
await page.extract<T extends z.AnyZodObject>(
  prompt: string,
  options: { schema: T; model?: AIModel }
): Promise<z.output<T>>;

// Also available on locators (same signatures and return types)
await locator.extract(prompt: string): Promise<string>;
await locator.extract<T extends z.AnyZodObject>(
  prompt: string,
  options: { schema: T; model?: AIModel }
): Promise<z.output<T>>;
```

**Options:**

* `schema` - Zod schema to validate and type the extracted data
* `model` - AI model to use (see [Model Selection](#model-selection))

### How It Works

1. **Visual capture** – Stably waits for the page to stabilize and captures the requested region (full-page if necessary).
2. **Prompt grounding** – Your instruction plus optional schema are sent to the vision-language model.
3. **Response** – The method resolves with either the raw string answer or the parsed object.

Expect responses in a few seconds—the same latency profile as other Stably AI calls.
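
The three steps above can be pictured as a small pipeline. This is a conceptual sketch only, not Stably's implementation: `capture`, `askModel`, `parse`, and `runExtract` are hypothetical stand-ins, injected as functions so the flow can be read and exercised with fakes.

```ts
// Conceptual model of the extract() flow. Every name here is hypothetical.
interface ExtractSteps<T> {
  // 1. Visual capture: screenshot of the stabilized region.
  capture: () => Promise<Uint8Array>;
  // 2. Prompt grounding: send image and instruction to a vision-language model.
  askModel: (image: Uint8Array, prompt: string) => Promise<string>;
  // 3. Response: optional parsing/validation of the raw answer.
  parse?: (raw: string) => T;
}

async function runExtract<T = string>(
  prompt: string,
  steps: ExtractSteps<T>
): Promise<T | string> {
  const image = await steps.capture();
  const raw = await steps.askModel(image, prompt);
  // Without a parser, the raw string answer is returned unchanged.
  return steps.parse ? steps.parse(raw) : raw;
}
```

The optional `parse` step mirrors the difference between the two call styles: with no schema you get the model's text back, and with a schema you get a checked object.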

## Common Use Cases

* **Verify financial reports** – Extract totals from revenue dashboards rendered as canvas charts.
* **Read generated PDFs** – Capture invoice numbers, totals, and due dates from embedded PDF viewers.
* **Collect UI copy** – Pull disclaimers or policy text to snapshot legal updates.
* **Bridge to assertions** – Feed extracted values into Playwright `expect` checks or your own business logic.

```ts theme={null}
import { z } from 'zod';
import { test, expect } from '@stablyai/playwright-test';

test('validate invoice totals visually', async ({ page }) => {
  await page.goto('/billing/invoices/123');

  const invoice = await page.extract(
    'Return invoice number, customer name, and total due (numerical) from the invoice viewer.',
    {
      schema: z.object({
        invoiceNumber: z.string(),
        customer: z.string(),
        totalDue: z.number(),
      }),
    }
  );

  expect(invoice.totalDue).toBeGreaterThan(0);
});
```

## Model Selection

Pass the `model` option to choose which AI model handles the extraction. If you omit it, the backend default is used.

```ts theme={null}
// Simple extraction with model
const summary = await page.extract('List all visible metrics', {
  model: 'google/gemini-3-flash-preview'
});

// Structured extraction with model
// (InvoiceSchema is a Zod object schema defined elsewhere in your test code)
const data = await page.extract('Extract invoice details', {
  model: 'google/gemini-3.1-pro-preview',
  schema: InvoiceSchema
});
```

| Model                           | Provider | Characteristics     |
| ------------------------------- | -------- | ------------------- |
| `google/gemini-3-flash-preview` | Google   | Fast, efficient     |
| `google/gemini-3.1-pro-preview` | Google   | Most capable        |
| `openai/o4-mini`                | OpenAI   | Efficient reasoning |
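
To keep model identifiers out of individual tests, the choice can be centralized. The sketch below is one possible approach: `KNOWN_MODELS` simply mirrors the table above, while `KnownModel`, `resolveModel`, and the idea of reading the identifier from configuration are assumptions rather than SDK features.

```ts
// Model identifiers copied from the table above. The SDK's own AIModel type
// may accept a different or larger set.
const KNOWN_MODELS = [
  'google/gemini-3-flash-preview',
  'google/gemini-3.1-pro-preview',
  'openai/o4-mini',
] as const;

type KnownModel = (typeof KNOWN_MODELS)[number];

// Hypothetical helper: validate a configured identifier. Returning undefined
// lets the caller omit `model`, so the backend default applies.
function resolveModel(configured?: string): KnownModel | undefined {
  if (configured === undefined || configured === '') {
    return undefined;
  }
  const match = KNOWN_MODELS.find((m) => m === configured);
  if (match === undefined) {
    throw new Error(
      `Unknown model "${configured}". Expected one of: ${KNOWN_MODELS.join(', ')}`
    );
  }
  return match;
}
```

Pass the result as the `model` option only when it is defined; failing fast on an unrecognized identifier surfaces typos before any extraction request is made.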

## Best Practices

* **Be explicit** – Mention the units, format, or section name you expect the model to read.
* **Focus the viewport** – Scroll to the relevant component or collapse secondary panels before calling `extract` to reduce noise.
* **Pre-stabilize** – Wait for charts, skeleton loaders, and animations to finish before calling `extract`.
* **Validate results** – Pair schema-based extraction with standard `expect` assertions or downstream checks.
* **Guard sensitive data** – Extraction reveals what is visually rendered; avoid prompts that might echo secrets.
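
The "Be explicit" guideline can be made routine by assembling prompts from a list of fields and their expected formats. `FieldSpec` and `buildPrompt` below are hypothetical helpers, shown as one way to keep prompts consistent; `extract()` itself only needs a plain string.

```ts
// Hypothetical description of one value to read from the screen.
interface FieldSpec {
  name: string;   // what to look for, e.g. "churn rate"
  format: string; // expected unit or format, e.g. "percentage as a number"
}

// Build a prompt that names the section and spells out each field's format.
function buildPrompt(section: string, fields: readonly FieldSpec[]): string {
  if (fields.length === 0) {
    throw new Error('At least one field is required');
  }
  const parts = fields.map((f) => `${f.name} (${f.format})`);
  return `From the ${section}, return: ${parts.join('; ')}.`;
}
```

For example, `buildPrompt('main metric cards', [{ name: 'revenue', format: 'string with currency' }])` yields `From the main metric cards, return: revenue (string with currency).`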

## Troubleshooting

**Result is incomplete**

* Ensure the requested elements are visible on screen: they should not require further scrolling or sit behind an inactive tab.
* Rephrase the prompt to reference distinctive labels or surrounding context.

**Schema validation fails**

* Inspect the thrown error to see the model's raw answer.
* Relax overly strict fields (e.g., accept strings and coerce to numbers) or clarify the prompt format.
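
One way to relax a strict field is to accept the displayed text as a string and convert it yourself after extraction. The helper below is a minimal sketch in plain TypeScript; `toNumber` is a hypothetical name, and it assumes US-style formatting (comma as thousands separator, dot as decimal point).

```ts
// Hypothetical helper: turn displayed text such as "$245,000" or "2.3%"
// into a number. Assumes "," separates thousands and "." marks decimals.
function toNumber(displayed: string): number {
  // Keep digits, decimal points, and minus signs; drop currency symbols,
  // thousands separators, percent signs, and any surrounding words.
  const cleaned = displayed.replace(/[^0-9.\-]/g, '');
  const value = Number(cleaned);
  if (cleaned === '' || Number.isNaN(value)) {
    throw new Error(`Cannot read a number from "${displayed}"`);
  }
  return value;
}
```

With this in place, a field can be declared as `z.string()` in the schema and converted afterwards: `toNumber('$245,000')` gives `245000` and `toNumber('2.3%')` gives `2.3`, while text with no digits throws instead of silently producing `NaN`.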

**Latency feels high**

* Limit the visible region before calling `extract` (collapse sidebars, scroll the target content into view).
* Batch related fields into a single schema-driven request to minimize round trips.

## References

* Stably Playwright SDK: `@stablyai/playwright-test`
* Zod schemas: [https://zod.dev](https://zod.dev)
* Related feature: [AI Assertions](/stably-sdk/ai-assertions)
