Skip to main content
Stably’s visual extractors turn rendered UI into machine-readable outputs. Instead of scraping brittle selectors, describe the data you need and the SDK captures the screen, reasons about it, and returns plain text or a typed object.
Why use extraction instead of DOM parsing? Modern UIs stream content, render charts on canvas, and personalize layouts. page.extract() works on the final pixels, so it succeeds even when markup is noisy or inaccessible—ideal for dashboards, PDFs, and media-heavy experiences.

Installation

Install the Stably Playwright test integration and import it in place of the Playwright test runner:
import { test, expect } from '@stablyai/playwright-test';
The API surface stays the same as Playwright, but the page fixture gains AI helpers such as page.extract().

page.extract

Request natural-language or structured data from the currently visible page. Extraction is fully visual—the SDK stabilizes the viewport, captures a screenshot, and routes it through Stably’s multimodal model.

Basic Usage

import { test } from '@stablyai/playwright-test';

test('summarize dashboard metrics', async ({ page }) => {
  await page.goto('/dashboard');

  const summary = await page.extract(
    'List the revenue, active users, and churn rate shown in the main metric cards.'
  );

  console.log(summary);
  // → "Revenue is $245,000 MTD, 18,432 active users, churn at 2.3%."
});
page.extract() resolves to a string. Treat it like any text response—log it, snapshot it, or pass it into downstream logic.

Structured Extraction with Zod

Provide a Zod schema to guarantee shape and types. The model fills the schema while the SDK validates the response locally before returning.
import { z } from 'zod';
import { test, expect } from '@stablyai/playwright-test';

const MetricsSchema = z.object({
  revenue: z.string(),
  activeUsers: z.number(),
  churnRate: z.number(),
});

test('return typed metrics', async ({ page }) => {
  await page.goto('/dashboard');

  const metrics = await page.extract(
    'Return the revenue (string with currency), active user count, and churn rate percentage from the cards.',
    { schema: MetricsSchema }
  );

  expect(metrics.churnRate).toBeLessThan(5);
});
The return type is inferred from your schema, enabling full TypeScript support with autocompletion and compile-time checks.

Method Signatures

await page.extract(prompt: string): Promise<string>;

await page.extract<T extends z.AnyZodObject>(
  prompt: string,
  options: { schema: T }
): Promise<z.output<T>>;

What Happens Under the Hood

  1. Visual capture – Stably waits for the page to stabilize and captures the requested region (full-page if necessary).
  2. Prompt grounding – Your instruction plus optional schema are sent to the vision-language model. 3 Response – The method resolves with either the raw string answer or the parsed object.
Expect responses in a few seconds—the same latency profile as other Stably AI calls.

Common Use Cases

  • Verify financial reports – Extract totals from revenue dashboards rendered as canvas charts.
  • Read generated PDFs – Capture invoice numbers, totals, and due dates from embedded PDF viewers.
  • Collect UI copy – Pull disclaimers or policy text to snapshot legal updates.
  • Bridge to assertions – Feed extracted values into Playwright expect checks or your own business logic.
import { z } from 'zod';
import { test, expect } from '@stablyai/playwright-test';

test('validate invoice totals visually', async ({ page }) => {
  await page.goto('/billing/invoices/123');

  const invoice = await page.extract(
    'Return invoice number, customer name, and total due (numerical) from the invoice viewer.',
    {
      schema: z.object({
        invoiceNumber: z.string(),
        customer: z.string(),
        totalDue: z.number(),
      }),
    }
  );

  expect(invoice.totalDue).toBeGreaterThan(0);
});

Best Practices

  • Be explicit – Mention the units, format, or section name you expect the model to read.
  • Focus the viewport – Scroll to the relevant component or collapse secondary panels before calling extract to reduce noise.
  • Pre-stabilize – Wait for charts, skeleton loaders, and animations to finish before calling extract.
  • Validate results – Pair schema-based extraction with standard expect assertions or downstream checks.
  • Guard sensitive data – Extraction reveals what is visually rendered; avoid prompts that might echo secrets.

Troubleshooting

Result is incomplete
  • Ensure the requested elements are visible (no additional scrolling or hidden tabs).
  • Rephrase the prompt to reference distinctive labels or surrounding context.
Schema validation fails
  • Inspect the thrown error to see the model’s raw answer.
  • Relax overly strict fields (e.g., accept strings and coerce to numbers) or clarify the prompt format.
Latency feels high
  • Limit the visible region before calling extract (collapse sidebars, scroll the target content into view).
  • Batch related fields into a single schema-driven request to minimize round trips.

References

I