Quickstart

Create a session, navigate, and extract data in 3 API calls.

Prerequisites

  • A BabelWrap API key (free tier includes 500 actions / month)
  • One of: curl, Python 3.8+, or Node.js 18+
Tip: If you prefer MCP over REST, skip to the MCP Server guide instead. The MCP server wraps this same API so your agent can call it as native tools.

Step 1: Get an API Key

  1. Sign up for a free BabelWrap account.
  2. Open the API Keys page in your dashboard.
  3. Click Create Key and copy the key that starts with bw_.

Set it as an environment variable so the examples below can use it:

macOS / Linux:

export BABELWRAP_API_KEY="bw_your_api_key_here"

Windows (Command Prompt):

set BABELWRAP_API_KEY=bw_your_api_key_here
Base URL: All API requests go to https://api.babelwrap.com/v1. Every request (except /v1/health) must include the header Authorization: Bearer YOUR_KEY.

Step 2: Create a Session and Navigate

A session is an isolated browser context with its own cookies, storage, and viewport. Create one, then navigate it to a URL.

cURL:

# Create a new browser session
SID=$(curl -s -X POST https://api.babelwrap.com/v1/sessions \
  -H "Authorization: Bearer $BABELWRAP_API_KEY" \
  -H "Content-Type: application/json" \
  -d '{}' | jq -r .session_id)

echo "Session: $SID"

# Navigate to Hacker News
curl -s -X POST https://api.babelwrap.com/v1/sessions/$SID/navigate \
  -H "Authorization: Bearer $BABELWRAP_API_KEY" \
  -H "Content-Type: application/json" \
  -d '{"url": "https://news.ycombinator.com"}' | jq .

Python:

import httpx, os

API_KEY = os.environ["BABELWRAP_API_KEY"]
BASE    = "https://api.babelwrap.com/v1"
HEADERS = {"Authorization": f"Bearer {API_KEY}"}

# Create a new browser session
resp = httpx.post(f"{BASE}/sessions", headers=HEADERS, json={})
session_id = resp.json()["session_id"]
print("Session:", session_id)

# Navigate to Hacker News
resp = httpx.post(
    f"{BASE}/sessions/{session_id}/navigate",
    headers=HEADERS,
    json={"url": "https://news.ycombinator.com"},
)
print(resp.json()["snapshot"]["title"])

Node.js:

const API_KEY = process.env.BABELWRAP_API_KEY;
const BASE    = "https://api.babelwrap.com/v1";
const headers = {
  "Authorization": `Bearer ${API_KEY}`,
  "Content-Type":  "application/json",
};

// Create a new browser session
const { session_id } = await fetch(`${BASE}/sessions`, {
  method: "POST", headers, body: JSON.stringify({})
}).then(r => r.json());

console.log("Session:", session_id);

// Navigate to Hacker News
const nav = await fetch(`${BASE}/sessions/${session_id}/navigate`, {
  method: "POST", headers,
  body: JSON.stringify({ url: "https://news.ycombinator.com" })
}).then(r => r.json());

console.log(nav.snapshot.title);

The navigate response includes a snapshot of the page. Here is a trimmed example:

{
  "action_id": "act_abc123",
  "status": "completed",
  "snapshot": {
    "url": "https://news.ycombinator.com",
    "title": "Hacker News",
    "content": "Hacker News\nnew | past | comments | ask | show | jobs | submit ...",
    "actions": [
      { "id": "more-link", "label": "More", "type": "link" }
    ],
    "navigation": ["new", "past", "comments", "ask", "show", "jobs", "submit"],
    "forms": [],
    "inputs": [],
    "alerts": []
  }
}
Tip: The snapshot is designed to be fed directly to an LLM. It contains all the information an agent needs to decide what to do next, without parsing HTML.
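For instance, an agent loop might flatten the snapshot into a short prompt before handing it to a model. A minimal sketch (the field names match the example response above; the prompt layout itself is just an illustration, not part of the API):

```python
def snapshot_to_prompt(snapshot: dict) -> str:
    # Flatten a BabelWrap page snapshot into a compact, LLM-friendly prompt.
    lines = [
        f"Page: {snapshot['title']} ({snapshot['url']})",
        f"Content: {snapshot['content'][:500]}",  # truncate long pages
    ]
    if snapshot.get("actions"):
        labels = ", ".join(a["label"] for a in snapshot["actions"])
        lines.append(f"Available actions: {labels}")
    if snapshot.get("navigation"):
        lines.append("Navigation: " + " | ".join(snapshot["navigation"]))
    return "\n".join(lines)
```

Feed the returned string to your model along with an instruction like "pick the next action", and the model can answer with one of the listed action IDs.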

Step 3: Extract Data

The POST /sessions/:id/extract endpoint pulls structured data from the page using a natural language query. No CSS selectors needed.

cURL:

# Extract the top stories
curl -s -X POST https://api.babelwrap.com/v1/sessions/$SID/extract \
  -H "Authorization: Bearer $BABELWRAP_API_KEY" \
  -H "Content-Type: application/json" \
  -d '{"query": "all story titles and their point counts"}' | jq .data

Python:

# Extract the top stories
resp = httpx.post(
    f"{BASE}/sessions/{session_id}/extract",
    headers=HEADERS,
    json={"query": "all story titles and their point counts"},
)
stories = resp.json()["data"]

for story in stories:
    print(f"  {story['points']} - {story['title']}")

Node.js:

// Extract the top stories
const { data: stories } = await fetch(
  `${BASE}/sessions/${session_id}/extract`, {
    method: "POST", headers,
    body: JSON.stringify({ query: "all story titles and their point counts" })
}).then(r => r.json());

stories.forEach(s => console.log(`  ${s.points} - ${s.title}`));

The extract response returns structured JSON in a data array:

{
  "action_id": "act_def456",
  "status": "completed",
  "data": [
    { "title": "Show HN: I built a tool that turns PDFs into podcasts", "points": 312 },
    { "title": "SQLite is not a toy database", "points": 287 },
    { "title": "The art of finishing projects", "points": 245 },
    { "title": "Why we moved from Kubernetes back to bare metal", "points": 198 },
    { "title": "A visual guide to SSH tunnels", "points": 176 }
  ]
}
Note: The shape of data is inferred from your query. Ask for "all product names, prices, and ratings" and you will get objects with name, price, and rating fields.
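Because the shape is inferred, it is worth guarding against missing fields before using the rows. A minimal sketch (the title and points fields follow the example response above):

```python
def top_stories(data, n=3):
    # Keep only rows that have both expected fields, then rank by points.
    # Defensive filtering matters because the schema is inferred, not fixed.
    rows = [d for d in data if "title" in d and "points" in d]
    rows.sort(key=lambda d: d["points"], reverse=True)
    return [f"{d['points']} pts - {d['title']}" for d in rows[:n]]
```

Dropping malformed rows (rather than crashing on a KeyError) keeps the agent loop robust when the page layout surprises the extractor.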

Full Working Example

Here is a complete, copy-paste-ready Python script that ties all three steps together. It creates a session, navigates, extracts data, and cleans up when done.

#!/usr/bin/env python3
"""BabelWrap Quickstart - extract Hacker News top stories."""

import asyncio
import os

import httpx

API_KEY = os.environ["BABELWRAP_API_KEY"]
BASE    = "https://api.babelwrap.com/v1"
HEADERS = {"Authorization": f"Bearer {API_KEY}"}


async def main():
    async with httpx.AsyncClient(timeout=30.0) as client:
        # 1. Create a session
        resp = await client.post(
            f"{BASE}/sessions", headers=HEADERS, json={}
        )
        resp.raise_for_status()
        session_id = resp.json()["session_id"]
        print(f"Session created: {session_id}")

        try:
            # 2. Navigate to Hacker News
            resp = await client.post(
                f"{BASE}/sessions/{session_id}/navigate",
                headers=HEADERS,
                json={"url": "https://news.ycombinator.com"},
            )
            resp.raise_for_status()
            title = resp.json()["snapshot"]["title"]
            print(f"Navigated to: {title}")

            # 3. Extract top stories
            resp = await client.post(
                f"{BASE}/sessions/{session_id}/extract",
                headers=HEADERS,
                json={"query": "all story titles and their point counts"},
            )
            resp.raise_for_status()
            stories = resp.json()["data"]

            print(f"\nTop {len(stories)} stories:")
            for i, story in enumerate(stories, 1):
                print(f"  {i}. {story['title']} ({story['points']} pts)")

        finally:
            # Always close the session to free resources
            await client.delete(
                f"{BASE}/sessions/{session_id}", headers=HEADERS
            )
            print("\nSession closed.")


if __name__ == "__main__":
    asyncio.run(main())

Run it:

# Install httpx if needed
pip install httpx

# Run the script
python quickstart.py

Expected output:

Session created: ses_a1b2c3d4
Navigated to: Hacker News

Top 5 stories:
  1. Show HN: I built a tool that turns PDFs into podcasts (312 pts)
  2. SQLite is not a toy database (287 pts)
  3. The art of finishing projects (245 pts)
  4. Why we moved from Kubernetes back to bare metal (198 pts)
  5. A visual guide to SSH tunnels (176 pts)

Session closed.
Tip: Always wrap your session work in a try/finally block (or use a context manager) so the session is closed even if something fails. Sessions time out automatically after 5 minutes of inactivity, but closing them explicitly frees resources immediately.
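The same cleanup guarantee can be packaged once as a context manager. A minimal sketch, where open_session and close_session are stand-in callables for the POST /sessions and DELETE /sessions/:id calls in the script above (not part of the BabelWrap API):

```python
from contextlib import contextmanager

@contextmanager
def managed_session(open_session, close_session):
    # open_session() returns a session_id; close_session(session_id)
    # deletes it. The finally block runs even if the body raises, so
    # the session is always cleaned up.
    session_id = open_session()
    try:
        yield session_id
    finally:
        close_session(session_id)
```

With this in place, the body of your script becomes a simple `with managed_session(...) as session_id:` block, and no code path can leak a session.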

Next Steps

You have the basics down. Here is where to go next: