# Web Scraping
Extract clean content from any web page, crawl entire sites, or search and scrape in one step — no Firecrawl account, no browser infrastructure, just API calls.
## Quick Example

```typescript
import { createFetch } from "@sapiom/fetch";

const sapiomFetch = createFetch({
  apiKey: process.env.SAPIOM_API_KEY,
  agentName: "my-agent",
});

// Scrape a page and get clean markdown
const response = await sapiomFetch(
  "https://firecrawl.services.sapiom.ai/v2/scrape",
  {
    method: "POST",
    headers: { "Content-Type": "application/json" },
    body: JSON.stringify({
      url: "https://example.com/blog/article",
      formats: ["markdown"],
    }),
  },
);

const data = await response.json();
console.log(data.data.markdown);
```

## How It Works
Sapiom routes scraping requests to Firecrawl, which handles browser rendering, content extraction, and anti-bot bypass. The SDK handles payment negotiation automatically — you pay per credit based on the operation.
The service supports several operations:
- Scrape — Extract content from a single URL as markdown, HTML, or structured JSON
- Map — Discover all URLs on a site by crawling its structure
- Search — Search the web and scrape the results in one step
- Crawl — Asynchronously crawl an entire site (up to 10,000 pages)
- Extract — Asynchronously extract structured data from URLs using AI
- Batch Scrape — Asynchronously scrape multiple URLs at once
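Every synchronous operation follows the same shape: POST a JSON body to an endpoint, read a JSON response. A minimal sketch of that pattern (`buildRequest` is an illustrative helper name, not part of `@sapiom/fetch`):

```typescript
// Sketch of the request shape shared by all synchronous operations.
const BASE = "https://firecrawl.services.sapiom.ai";

type JsonInit = { method: "POST"; headers: Record<string, string>; body: string };

function buildRequest(path: string, body: unknown): { url: string; init: JsonInit } {
  return {
    url: `${BASE}${path}`,
    init: {
      method: "POST",
      headers: { "Content-Type": "application/json" },
      body: JSON.stringify(body),
    },
  };
}

// Usage with the SDK's fetch (any fetch-compatible function works):
// const { url, init } = buildRequest("/v2/scrape", { url: "https://example.com" });
// const data = await (await sapiomFetch(url, init)).json();
```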
## Provider

Powered by Firecrawl, which provides reliable web scraping with JavaScript rendering, anti-bot bypass, and clean content extraction.
## API Reference

### Endpoints

Base URL: https://firecrawl.services.sapiom.ai
| Method | Path | Description | Pricing |
|---|---|---|---|
| POST | /v2/scrape | Scrape a single URL | 1+ credits |
| POST | /v2/map | Map site structure | 1 credit |
| POST | /v2/search | Search and scrape results | Dynamic |
| POST | /v2/crawl | Start async crawl | Dynamic (async) |
| GET | /v2/crawl/:id | Get crawl status | Free |
| DELETE | /v2/crawl/:id | Cancel crawl | Free |
| POST | /v2/extract | Start async extraction | Dynamic (async) |
| GET | /v2/extract/:id | Get extract status | Free |
| POST | /v2/batch/scrape | Start async batch scrape | Dynamic (async) |
| GET | /v2/batch/scrape/:id | Get batch status | Free |
### Scrape

Endpoint: POST https://firecrawl.services.sapiom.ai/v2/scrape
Extract content from a single URL. Returns immediately with the scraped content.
#### Request

| Parameter | Type | Required | Description |
|---|---|---|---|
| url | string | Yes | URL to scrape |
| formats | string[] | No | Output formats: markdown, html, json |
| proxy | string | No | Proxy type: enhanced, auto, or basic |
All other Firecrawl scrape parameters are also accepted and forwarded as-is.
```json
{
  "url": "https://example.com/blog/post",
  "formats": ["markdown"]
}
```

#### Response

```json
{
  "success": true,
  "data": {
    "markdown": "# Article Title\n\nArticle content...",
    "metadata": {
      "title": "Article Title",
      "description": "Article description",
      "sourceURL": "https://example.com/blog/post"
    }
  }
}
```

#### Pricing
| Configuration | Credits | Cost |
|---|---|---|
| Base scrape | 1 | $0.009 |
| + Enhanced proxy | +4 (5 total) | $0.045 |
| + JSON extraction | +4 (5 total) | $0.045 |
| Both addons | 9 | $0.081 |
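The table above follows a simple additive rule; a sketch of the calculation (the function names are mine, the credit amounts come from the table):

```typescript
// Estimate scrape credits: 1 base credit, +4 for the enhanced proxy,
// +4 for JSON extraction.
const CREDIT_USD = 0.009;

function scrapeCredits(
  opts: { enhancedProxy?: boolean; jsonFormat?: boolean } = {},
): number {
  let credits = 1;
  if (opts.enhancedProxy) credits += 4;
  if (opts.jsonFormat) credits += 4;
  return credits;
}

function creditsToUsd(credits: number): number {
  return credits * CREDIT_USD;
}
```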
### Map

Endpoint: POST https://firecrawl.services.sapiom.ai/v2/map
Discover all URLs on a site by crawling its link structure. Useful for building a sitemap before crawling.
#### Request

| Parameter | Type | Required | Description |
|---|---|---|---|
| url | string | Yes | URL to map |
```json
{
  "url": "https://example.com"
}
```

#### Pricing

Flat rate: 1 credit ($0.009).
### Search

Endpoint: POST https://firecrawl.services.sapiom.ai/v2/search
Search the web and scrape the results in one step. Returns search results with full page content.
#### Request

| Parameter | Type | Required | Description |
|---|---|---|---|
| query | string | Yes | Search query |
| limit | number | No | Max results (default: 5) |
| scrapeOptions | object | No | Per-result scrape options |
| scrapeOptions.proxy | string | No | Proxy type: enhanced, auto, basic |
| scrapeOptions.formats | string[] | No | Output formats: markdown, html, json |
```json
{
  "query": "TypeScript best practices 2026",
  "limit": 5
}
```

#### Pricing

Search pricing combines a base search fee plus per-result scraping:

```
searchCredits    = ceil(limit / 10) * 2
perResultCredits = 1 + (enhanced proxy ? 4 : 0) + (json format ? 4 : 0)
totalCredits     = searchCredits + (limit * perResultCredits)
```

| Example (limit=5) | Credits | Cost |
|---|---|---|
| Base search | 7 | $0.063 |
| + Enhanced proxy | 27 | $0.243 |
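The formula above translates directly into code; a sketch (the function name is mine):

```typescript
// Implements the documented search pricing formula:
// base search fee plus a per-result scrape charge.
function searchCredits(
  limit: number,
  opts: { enhancedProxy?: boolean; jsonFormat?: boolean } = {},
): number {
  const base = Math.ceil(limit / 10) * 2;
  const perResult = 1 + (opts.enhancedProxy ? 4 : 0) + (opts.jsonFormat ? 4 : 0);
  return base + limit * perResult;
}
```

For limit=5 this gives 2 + 5 × 1 = 7 credits, matching the base-search row in the table.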
### Crawl (Async)

Endpoint: POST https://firecrawl.services.sapiom.ai/v2/crawl
Start an asynchronous crawl job. Returns a job ID — poll the status endpoint to get results.
#### Request

| Parameter | Type | Required | Description |
|---|---|---|---|
| url | string | Yes | Starting URL to crawl |
| limit | number | No | Max pages to crawl (default: 10,000, max: 10,000) |
| scrapeOptions | object | No | Per-page scrape options |
| scrapeOptions.proxy | string | No | Proxy type |
| scrapeOptions.formats | string[] | No | Output formats |
```json
{
  "url": "https://docs.example.com",
  "limit": 50
}
```

#### Response

```json
{
  "id": "crawl-abc123",
  "status": "started"
}
```

#### Check Crawl Status
Endpoint: GET https://firecrawl.services.sapiom.ai/v2/crawl/{id} (free)
```json
{
  "status": "completed",
  "creditsUsed": 42,
  "data": [...]
}
```

#### Cancel Crawl

Endpoint: DELETE https://firecrawl.services.sapiom.ai/v2/crawl/{id} (free)
#### Pricing

You authorize the maximum cost upfront based on limit. You only pay for pages actually crawled (creditsUsed).
| Limit | Base Credits | Cost (max) |
|---|---|---|
| 10 | 10 | $0.09 |
| 50 | 50 | $0.45 |
| 100 | 100 | $0.90 |
Enhanced proxy and JSON format add +4 credits per page each.
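The upfront authorization can be estimated from the table and the addon rule; a sketch (the function name is mine):

```typescript
// Upper bound on crawl cost: each page costs 1 credit, plus 4 for the
// enhanced proxy and 4 for JSON format. Actual charges are based on the
// creditsUsed value reported by the status endpoint.
function crawlMaxCredits(
  limit: number,
  opts: { enhancedProxy?: boolean; jsonFormat?: boolean } = {},
): number {
  const perPage = 1 + (opts.enhancedProxy ? 4 : 0) + (opts.jsonFormat ? 4 : 0);
  return limit * perPage;
}
```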
### Extract (Async)

Endpoint: POST https://firecrawl.services.sapiom.ai/v2/extract
Start an asynchronous AI extraction job. Uses LLMs to extract structured data from URLs.
#### Request

| Parameter | Type | Required | Description |
|---|---|---|---|
| urls | string[] | Yes | URLs to extract from |
| prompt | string | Yes | Extraction prompt or schema |
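A request body for this endpoint might look like the following (the URL and prompt are illustrative):

```json
{
  "urls": ["https://example.com/pricing"],
  "prompt": "Extract the plan names and monthly prices"
}
```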
#### Check Extract Status

Endpoint: GET https://firecrawl.services.sapiom.ai/v2/extract/{id} (free)
#### Pricing

Pricing is token-based, estimated at ~334 credits per URL ($3.01). Final cost is based on actual tokens used.
### Batch Scrape (Async)

Endpoint: POST https://firecrawl.services.sapiom.ai/v2/batch/scrape
Scrape multiple URLs asynchronously. Returns a job ID.
#### Request

| Parameter | Type | Required | Description |
|---|---|---|---|
| urls | string[] | Yes | URLs to scrape |
| proxy | string | No | Proxy type |
| formats | string[] | No | Output formats |
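A request body for this endpoint might look like the following (the URLs are illustrative):

```json
{
  "urls": [
    "https://example.com/a",
    "https://example.com/b"
  ],
  "formats": ["markdown"]
}
```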
#### Check Batch Status

Endpoint: GET https://firecrawl.services.sapiom.ai/v2/batch/scrape/{id} (free)
#### Pricing

Same per-URL pricing as scrape: 1 credit base plus addons per URL. You only pay for URLs actually scraped.
## Error Codes

| Code | Description |
|---|---|
| 400 | Invalid request — check URL and parameters |
| 402 | Payment required — ensure you’re using the Sapiom SDK |
| 429 | Rate limit exceeded |
| 502 | Upstream Firecrawl error |
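A caller can branch on these codes: 429 and 502 are plausibly transient, while 400 and 402 indicate a problem with the request itself. A minimal retry sketch (the helper names and retry policy are my own, not part of the SDK):

```typescript
// Classify a response status per the error table above.
function isRetryable(status: number): boolean {
  return status === 429 || status === 502;
}

type MinimalResponse = { ok: boolean; status: number };

// Hypothetical wrapper: retry transient failures a few times with a
// fixed delay between attempts, and return the last response otherwise.
async function fetchWithRetry<T extends MinimalResponse>(
  doFetch: () => Promise<T>,
  maxAttempts = 3,
  delayMs = 1000,
): Promise<T> {
  let last!: T;
  for (let attempt = 0; attempt < maxAttempts; attempt++) {
    last = await doFetch();
    if (last.ok || !isRetryable(last.status)) return last;
    await new Promise((r) => setTimeout(r, delayMs));
  }
  return last;
}
```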
## Complete Example

```typescript
import { createFetch } from "@sapiom/fetch";

const sapiomFetch = createFetch({
  apiKey: process.env.SAPIOM_API_KEY,
  agentName: "my-agent",
});

const baseUrl = "https://firecrawl.services.sapiom.ai";

async function scrapeArticle(url: string) {
  // Scrape a single page
  const response = await sapiomFetch(`${baseUrl}/v2/scrape`, {
    method: "POST",
    headers: { "Content-Type": "application/json" },
    body: JSON.stringify({ url, formats: ["markdown"] }),
  });

  const data = await response.json();
  return data.data.markdown;
}

async function crawlDocs(siteUrl: string, maxPages: number) {
  // Start a crawl job
  const jobRes = await sapiomFetch(`${baseUrl}/v2/crawl`, {
    method: "POST",
    headers: { "Content-Type": "application/json" },
    body: JSON.stringify({ url: siteUrl, limit: maxPages }),
  });

  const job = await jobRes.json();
  console.log(`Crawl started: ${job.id}`);

  // Poll for completion, pausing between checks
  let result;
  do {
    await new Promise((r) => setTimeout(r, 2000));
    const statusRes = await sapiomFetch(`${baseUrl}/v2/crawl/${job.id}`);
    result = await statusRes.json();
  } while (result.status !== "completed" && result.status !== "failed");

  console.log(`Crawl used ${result.creditsUsed} credits`);
  return result.data;
}

// Usage
const article = await scrapeArticle("https://blog.example.com/post");
console.log("Article:", article.substring(0, 200));

const docs = await crawlDocs("https://docs.example.com", 20);
console.log(`Got ${docs.length} pages`);
```

## Pricing
All pricing is credit-based. 1 credit = $0.009.
| Operation | Base Credits | Addons |
|---|---|---|
| Scrape | 1 | Enhanced proxy (+4), JSON format (+4) |
| Map | 1 | None |
| Search | ceil(limit/10) * 2 + limit * 1 | Per-result: Enhanced proxy (+4), JSON (+4) |
| Crawl | limit * 1 | Per-page: Enhanced proxy (+4), JSON (+4) |
| Extract | ~334 per URL | None (token-based) |
| Batch Scrape | urls.length * 1 | Per-URL: Enhanced proxy (+4), JSON (+4) |
| Status checks | Free | N/A |
Async operations (crawl, extract, batch) authorize the maximum cost upfront but only charge for actual usage.