Connect Lobstr to Claude: Manage Crawlers and Automated Results
Learn how to connect Lobstr to Claude using a managed MCP server to orchestrate web scraping squids, tasks, and data exports directly from your AI agent.
To connect Lobstr to Claude so it can automate web data extraction, manage scraping crawlers, and orchestrate result exports, you need a Model Context Protocol (MCP) server. This server acts as the translation layer between Claude's tool calls and Lobstr's REST APIs. You can either spend weeks building and maintaining this infrastructure yourself, or use a managed integration platform like Truto to dynamically generate a secure, authenticated MCP server URL.
If your team uses ChatGPT, check out our guide on connecting Lobstr to ChatGPT or explore our broader architectural overview on connecting Lobstr to AI Agents.
Giving a Large Language Model (LLM) read and write access to an asynchronous, credit-based execution platform like Lobstr is an engineering challenge. You have to map highly variable crawler input schemas to MCP tool definitions, deal with asynchronous polling logic, and safely handle strict usage limits. Every time Lobstr adds a new crawler or updates an execution state, you have to update your server code, redeploy, and test the integration. This guide breaks down exactly how to use Truto to generate a secure, managed MCP server for Lobstr, connect it natively to Claude, and execute complex scraping workflows using natural language.
The Engineering Reality of the Lobstr API
A custom MCP server is a self-hosted integration layer. While the open MCP standard provides a predictable way for models to discover tools, the reality of implementing it against Lobstr's APIs is complex. Lobstr is not a standard REST CRUD app - it is a job orchestration and execution platform.
If you decide to build a custom MCP server for Lobstr, you own the entire API lifecycle. Here are the specific challenges you will face:
The Asynchronous Execution Hierarchy
Lobstr relies on a strict operational hierarchy: Crawler -> Squid -> Task -> Run -> Result. An LLM cannot simply ask Lobstr to "scrape this URL." It must first identify the right Crawler, instantiate a Squid (the job container), queue Tasks (the target URLs), initiate a Run, and then poll the Run until it completes. Exposing this raw hierarchy to an LLM usually results in the model trying to skip steps - like requesting results before a run is finished. Your MCP server must explicitly define schemas that guide the LLM through this multi-step asynchronous dance.
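One way to picture the ordering constraint is a small guard that refuses any step whose predecessor has not happened yet. This is a minimal sketch, not Truto's or Lobstr's implementation: the `Stage` and `LifecycleGuard` names are hypothetical, and the guard only models the Crawler -> Squid -> Task -> Run -> Result order described above.

```python
from enum import IntEnum


class Stage(IntEnum):
    """Lifecycle stages in the order described in the text (hypothetical names)."""
    CRAWLER = 1
    SQUID = 2
    TASK = 3
    RUN = 4
    RESULT = 5


class LifecycleGuard:
    """Tracks the furthest completed stage and rejects skipped steps."""

    def __init__(self) -> None:
        self.completed = 0  # no stage completed yet

    def advance(self, stage: Stage) -> None:
        # A stage is allowed only when every earlier stage is already complete.
        if stage > self.completed + 1:
            missing = Stage(self.completed + 1).name
            raise RuntimeError(f"Cannot do {stage.name} yet: {missing} is missing")
        self.completed = max(self.completed, int(stage))


guard = LifecycleGuard()
guard.advance(Stage.CRAWLER)
guard.advance(Stage.SQUID)
# guard.advance(Stage.RESULT) would raise here, because TASK has not happened.
```

A real server would also need to check that the run has actually finished before allowing the Result step; this sketch only covers step order.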
Highly Variable Parameter Schemas
Every Lobstr crawler has a unique configuration schema. A LinkedIn profile scraper requires completely different inputs than an Amazon product scraper. The params object in Lobstr API requests is entirely dynamic. Hardcoding an OpenAPI spec for Lobstr is virtually impossible because the schema drifts depending on the specific crawler you use. A resilient MCP server needs to allow the LLM to query get_single_lobstr_crawler_param_by_id dynamically to figure out the required inputs before it attempts to build a task.
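The "look up the schema first, then build the task" pattern can be sketched as a pre-flight check. The schema shape below is an assumption made for illustration (each field carrying a `required` flag); the actual response of get_single_lobstr_crawler_param_by_id may look different, and `missing_required` is a hypothetical helper.

```python
from typing import Any


def missing_required(schema: dict[str, Any], params: dict[str, Any]) -> list[str]:
    """Return the names of required fields that are absent from params.

    Assumes every schema entry looks like {"required": bool, ...}.
    """
    return sorted(
        name
        for name, spec in schema.items()
        if spec.get("required") and name not in params
    )


# Hypothetical schema, split into task and squid sections as the text describes.
crawler_params = {
    "task": {"url": {"required": True}},
    "squid": {"max_results": {"required": False}},
}

# An empty task payload is rejected before any API call is made.
print(missing_required(crawler_params["task"], {}))
print(missing_required(crawler_params["task"], {"url": "https://example.com"}))
```

Running the check before calling the task-creation tool keeps malformed payloads from reaching Lobstr at all.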
Strict Rate Limits and Polling
Because Lobstr runs are asynchronous, clients must poll the API to check execution status. Aggressive polling triggers Lobstr's rate limits.
A note on rate limits: Truto does not retry, throttle, or apply backoff on rate limit errors. When the upstream Lobstr API returns an HTTP 429 Too Many Requests, Truto passes that error directly to the caller. Truto normalizes the upstream rate limit info into standardized headers (ratelimit-limit, ratelimit-remaining, ratelimit-reset), following the IETF RateLimit header fields draft. The Claude client or calling agent is fully responsible for intercepting the 429, reading the ratelimit-reset header, and implementing its own backoff and retry logic.
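The caller-side retry logic might look like the sketch below. It rests on several assumptions: the caller can see the HTTP status and headers, header names arrive in lowercase, and ratelimit-reset holds a number of seconds to wait. `call_with_backoff` and `wait_seconds` are hypothetical helpers, not part of any Truto SDK.

```python
import time
from typing import Callable, Mapping, Tuple

# (status code, response headers, parsed body)
Response = Tuple[int, Mapping[str, str], object]


def wait_seconds(headers: Mapping[str, str], default: float = 5.0) -> float:
    """Read ratelimit-reset as seconds; fall back to default if absent or malformed."""
    try:
        return max(0.0, float(headers.get("ratelimit-reset", default)))
    except (TypeError, ValueError):
        return default


def call_with_backoff(
    call: Callable[[], Response],
    max_attempts: int = 5,
    sleep: Callable[[float], None] = time.sleep,
) -> object:
    """Invoke call(); on HTTP 429, wait as the header instructs and try again."""
    for _ in range(max_attempts):
        status, headers, body = call()
        if status != 429:
            return body
        sleep(wait_seconds(headers))
    raise RuntimeError("still rate limited after retries")
```

Passing `sleep` as a parameter keeps the helper testable without real delays.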
Instead of building this orchestration logic from scratch, you can use Truto. Truto exposes Lobstr's endpoints as meticulously documented, ready-to-use MCP tools, handling all the underlying HTTP boilerplate so Claude can focus on reasoning through the scraping lifecycle.
How to Generate a Lobstr MCP Server
Truto dynamically generates MCP tools based on the active API documentation for your Lobstr integration. Tools are generated on the fly during the tools/list JSON-RPC handshake - they are never cached or stale.
You can generate an MCP server for a connected Lobstr account using either the Truto UI or the API.
Method 1: Via the Truto UI
If you are setting this up for internal team use, the Truto dashboard is the fastest route.
- Navigate to the Integrated Accounts page in your Truto dashboard and select your connected Lobstr account.
- Click the MCP Servers tab.
- Click Create MCP Server.
- Configure your server filters (e.g., restrict to read methods or specific tags like runs and squids).
- Copy the generated MCP server URL (e.g., https://api.truto.one/mcp/a1b2c3d4...).
Method 2: Via the Truto API
If you are building an AI agent product and need to programmatically provision Lobstr MCP servers for your end-users, you use the Truto REST API. The endpoint validates the integration, provisions a secure token backed by a distributed KV store, and schedules any necessary expiration alarms.
Execute a POST request to /integrated-account/:id/mcp:
curl -X POST https://api.truto.one/integrated-account/<lobstr_account_id>/mcp \
-H "Authorization: Bearer <YOUR_TRUTO_API_KEY>" \
-H "Content-Type: application/json" \
-d '{
"name": "Lobstr Web Scraping Agent",
"config": {
"methods": ["read", "write", "custom"]
}
}'
The API returns a fully qualified, authenticated MCP server URL:
{
"id": "mcp_token_987654",
"name": "Lobstr Web Scraping Agent",
"config": { "methods": ["read", "write", "custom"] },
"expires_at": null,
"url": "https://api.truto.one/mcp/xyz789..."
}
Connecting the MCP Server to Claude
Once you have your Truto MCP URL, you need to register it with Claude. Claude clients reach remote MCP servers over an HTTP-based transport such as Server-Sent Events (SSE), while stdio is used for servers launched as local processes.
Method A: Via the Claude UI
If you are using the Claude desktop app or web interface (for Enterprise/Team plans), you can add the connector directly in the UI.
- Open Claude and navigate to Settings -> Integrations.
- Click Add MCP Server (or Add Custom Connector).
- Paste the Truto MCP URL you generated.
- Click Add.
Claude will immediately execute an initialize handshake, request the tools/list, and populate its context window with the available Lobstr capabilities.
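For readers curious what that handshake looks like on the wire, here is a rough sketch of the two JSON-RPC 2.0 requests and of reading tool names from a tools/list result. The protocol version string, client name, and the sample response are illustrative assumptions; a real client also sends further messages that are omitted here.

```python
import json

# First request: announce the client and negotiate capabilities.
initialize_request = {
    "jsonrpc": "2.0",
    "id": 1,
    "method": "initialize",
    "params": {
        "protocolVersion": "2024-11-05",  # assumed version string
        "capabilities": {},
        "clientInfo": {"name": "example-client", "version": "0.1.0"},
    },
}

# Second request: ask the server which tools it exposes.
list_tools_request = {"jsonrpc": "2.0", "id": 2, "method": "tools/list"}


def tool_names(list_tools_response: dict) -> list[str]:
    """Extract tool names from a tools/list JSON-RPC response."""
    return [tool["name"] for tool in list_tools_response["result"]["tools"]]


# Hypothetical, heavily trimmed response for illustration only.
sample_response = {
    "jsonrpc": "2.0",
    "id": 2,
    "result": {"tools": [{"name": "list_all_lobstr_crawlers"}]},
}

print(json.dumps(list_tools_request))
print(tool_names(sample_response))
```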
Method B: Via Manual Config File
If you are running Claude Desktop locally and prefer file-based configuration, you can update your claude_desktop_config.json file. Other MCP clients such as Cursor accept a similar mcpServers block in their own configuration files.
Because Truto provides a hosted SSE endpoint, you use the official MCP SSE transport module to connect:
{
"mcpServers": {
"lobstr_truto": {
"command": "npx",
"args": [
"-y",
"@modelcontextprotocol/server-sse",
"https://api.truto.one/mcp/xyz789..."
]
}
}
}
Restart Claude Desktop. The agent is now wired directly into your Lobstr environment.
Hero Tools for Lobstr Automation
Truto exposes the entirety of the Lobstr API, but providing the LLM with the right context is key. Below are the highest-leverage "hero tools" generated by Truto that enable Claude to orchestrate the full execution lifecycle.
list_all_lobstr_crawlers
Before Claude can scrape anything, it needs to know what tools are available on the platform. This tool lists all available Lobstr crawlers, returning their IDs, names, and credit costs per row.
"I need to scrape some LinkedIn profiles. Can you list the available crawlers in my Lobstr account and find one that handles LinkedIn, and tell me how much it costs per row?"
get_single_lobstr_crawler_param_by_id
Because crawler inputs are highly dynamic, Claude must call this tool to fetch the exact schema required for a specific crawler before building a Squid. It returns the configurable input parameters separated into task and squid configuration objects.
"I found the LinkedIn profile scraper crawler. Fetch its parameter schema so we know exactly what input fields and JSON structure it requires before we build the task."
create_a_lobstr_squid
A Squid is the execution container for a scraping job. This tool instantiates a new Squid in Lobstr for a specific crawler ID, preparing it to receive tasks.
"Create a new Lobstr squid named 'Q3 Competitor Tracking' using the LinkedIn crawler ID we just looked up."
create_a_lobstr_task
Tasks represent the actual work - usually target URLs or search queries. This tool allows Claude to batch-upload URLs into the Squid. It returns an array of queued tasks and indicates if any duplicates were skipped.
"Add these five target LinkedIn URLs as tasks to the 'Q3 Competitor Tracking' squid we just created. Make sure they are formatted according to the crawler's parameter schema."
create_a_lobstr_run
Once a Squid is loaded with tasks, this tool triggers the actual scraping engine. It returns a run ID (the run hash) and the initial execution status. Claude needs to hold onto this run ID for polling.
"Start a run for the 'Q3 Competitor Tracking' squid. Give me the run ID so we can monitor its progress."
get_single_lobstr_run_stat_by_id
Because scraping takes time, Claude uses this tool to check real-time statistics for an active run. It returns the percentage done, total tasks processed, duration, ETA, and a boolean is_done flag.
"Check the status of the run we just started. If it isn't finished, tell me the ETA and how many tasks have successfully processed so far."
list_all_lobstr_results
Once a run is complete (is_done: true), Claude calls this tool to retrieve the actual scraped data. It returns an array of result rows containing the data payload extracted by the crawler.
"The run is finished. Fetch all the results from the squid and summarize the key findings from the scraped LinkedIn profiles."
For the complete tool inventory, including delivery configurations, webhooks, and account credential management, see the Truto Lobstr Integration Page.
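Once list_all_lobstr_results has returned its array of result rows, turning them into a readable table is a small formatting step. The helper below is a hypothetical sketch that assumes each row is a flat dictionary and takes the column set from the first row.

```python
def to_markdown_table(rows: list[dict]) -> str:
    """Render a list of flat dicts as a Markdown table (columns from the first row)."""
    if not rows:
        return ""
    columns = list(rows[0].keys())
    lines = [
        "| " + " | ".join(columns) + " |",
        "| " + " | ".join("---" for _ in columns) + " |",
    ]
    for row in rows:
        # Escape pipes so cell content cannot break the table layout.
        cells = [str(row.get(col, "")).replace("|", "\\|") for col in columns]
        lines.append("| " + " | ".join(cells) + " |")
    return "\n".join(lines)


print(to_markdown_table([{"name": "Ada", "title": "CTO"}]))
```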
Workflows in Action
When Claude has access to these MCP tools, it stops being a mere chat interface and becomes an autonomous data operations engineer. Here is how Claude handles complex Lobstr workflows in practice.
1. The End-to-End Autonomous Scrape
Marketing and growth teams frequently need to run ad-hoc data enrichment. Instead of logging into a UI, clicking through menus, and manually downloading CSVs, a user can instruct Claude to handle the entire asynchronous pipeline.
"I need to scrape data for these 10 company URLs using the standard domain enrichment crawler. Set up the job in Lobstr, execute it, wait for it to finish, and then give me a table of the output data."
Here is how Claude executes this multi-step orchestration:
- Claude calls list_all_lobstr_crawlers to find the ID for the domain enrichment crawler.
- Claude calls get_single_lobstr_crawler_param_by_id to understand the exact JSON structure required for the URLs.
- Claude calls create_a_lobstr_squid to spin up the execution container.
- Claude calls create_a_lobstr_task, passing the 10 URLs mapped to the required schema.
- Claude calls create_a_lobstr_run to initiate the scrape and extracts the run_hash.
- Claude enters a polling loop, calling get_single_lobstr_run_stat_by_id. (If the API returns a 429 rate limit error due to aggressive polling, Claude reads the ratelimit-reset header and backs off.)
- Once is_done is true, Claude calls list_all_lobstr_results to retrieve the payload.
- Claude formats the raw JSON response into a clean Markdown table for the user.
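The same sequence can be written out as ordinary code to make the control flow explicit. This is a sketch under stated assumptions: `call_tool` stands in for whatever MCP client function invokes a tool by name, and the argument keys and response fields (`id`, `name`, `crawler`, `squid`, `tasks`) are hypothetical. Only the tool names and the is_done flag come from the steps above.

```python
import time
from typing import Any, Callable

ToolCall = Callable[[str, dict], Any]  # (tool name, arguments) -> parsed result


def scrape(
    call_tool: ToolCall,
    crawler_name: str,
    urls: list[str],
    poll_interval: float = 10.0,
    max_polls: int = 60,
    sleep: Callable[[float], None] = time.sleep,
) -> list[dict]:
    """Run the crawler -> squid -> task -> run -> results sequence."""
    crawlers = call_tool("list_all_lobstr_crawlers", {})
    crawler = next(c for c in crawlers if crawler_name.lower() in c["name"].lower())
    # Fetch the schema first; a fuller version would validate the tasks against it.
    call_tool("get_single_lobstr_crawler_param_by_id", {"id": crawler["id"]})
    squid = call_tool("create_a_lobstr_squid", {"crawler": crawler["id"]})
    tasks = [{"url": url} for url in urls]
    call_tool("create_a_lobstr_task", {"squid": squid["id"], "tasks": tasks})
    run = call_tool("create_a_lobstr_run", {"squid": squid["id"]})

    # Poll until the run reports completion, giving up after max_polls checks.
    for _ in range(max_polls):
        stat = call_tool("get_single_lobstr_run_stat_by_id", {"id": run["id"]})
        if stat["is_done"]:
            break
        sleep(poll_interval)
    else:
        raise TimeoutError("run did not finish in time")

    return call_tool("list_all_lobstr_results", {"squid": squid["id"]})
```

Bounding the loop with `max_polls` is a design choice of this sketch: it prevents an agent from polling forever if a run stalls.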
sequenceDiagram
participant User as User
participant Claude as Claude Desktop
participant MCP as Truto MCP Server
participant Lobstr as Lobstr API
User->>Claude: "Scrape these 10 URLs..."
Claude->>MCP: Call list_all_lobstr_crawlers
MCP->>Lobstr: GET /crawlers
Lobstr-->>MCP: [Crawler List]
MCP-->>Claude: Return Crawler ID
Claude->>MCP: Call create_a_lobstr_squid
MCP->>Lobstr: POST /squids
Lobstr-->>MCP: {squid_id}
MCP-->>Claude: Return Squid ID
Claude->>MCP: Call create_a_lobstr_task
MCP->>Lobstr: POST /squids/{id}/tasks
Lobstr-->>MCP: {queued_tasks}
MCP-->>Claude: Confirm tasks added
Claude->>MCP: Call create_a_lobstr_run
MCP->>Lobstr: POST /runs
Lobstr-->>MCP: {run_hash, status: "running"}
MCP-->>Claude: Return Run ID
loop Polling
Claude->>MCP: Call get_single_lobstr_run_stat_by_id
MCP->>Lobstr: GET /runs/{hash}/stats
Lobstr-->>MCP: {is_done: false, percent: 50}
MCP-->>Claude: Status update
end
Claude->>MCP: Call list_all_lobstr_results
MCP->>Lobstr: GET /squids/{id}/results
Lobstr-->>MCP: [Scraped Data]
MCP-->>Claude: Return JSON results
Claude-->>User: Markdown table of data
2. Credit Monitoring and Run Abort
Data Ops teams need to ensure that scraping jobs do not quietly drain account credits. Claude can audit active runs, check resource consumption, and kill runaway jobs.
"Check all my active Lobstr squids. If any run has consumed more than 500 credits but is less than 20% done, abort it immediately and tell me the run ID."
Claude processes this operational rule by bridging multiple endpoints:
- Claude calls list_all_lobstr_squids to get the user's active configurations.
- For each squid, Claude calls list_all_lobstr_runs to find active runs.
- Claude calls get_single_lobstr_run_stat_by_id to check the percent_done.
- Claude calls get_single_lobstr_run_credit_by_id to check the total_credits consumed.
- If Claude finds a run matching the criteria (e.g., 600 credits used, 15% done), it calls create_a_lobstr_run_abort, passing the run_hash.
- Claude reports back to the user with the aborted run details.
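The decision rule in that prompt reduces to a simple predicate. The sketch below assumes the stat and credit lookups have already been merged into one dictionary per run; the `run_hash`, `total_credits`, and `percent_done` keys mirror the names used above, but the merged shape itself is hypothetical.

```python
def should_abort(
    total_credits: float,
    percent_done: float,
    max_credits: float = 500,
    min_percent: float = 20,
) -> bool:
    """True when a run has spent more than max_credits while under min_percent done."""
    return total_credits > max_credits and percent_done < min_percent


def runs_to_abort(runs: list[dict]) -> list[str]:
    """Return the run hashes that match the abort rule."""
    return [
        run["run_hash"]
        for run in runs
        if should_abort(run["total_credits"], run["percent_done"])
    ]


active_runs = [
    {"run_hash": "abc", "total_credits": 600, "percent_done": 15},
    {"run_hash": "def", "total_credits": 600, "percent_done": 50},
]
print(runs_to_abort(active_runs))
```

Only the first run matches: it has used 600 credits at 15% done, the example given in the list above.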
Security and Access Control
Giving an AI agent access to an execution platform that burns financial credits requires strict guardrails. Truto's MCP architecture provides native access controls at the server level, ensuring the model cannot perform unauthorized actions regardless of the user's prompt.
- Method Filtering: You can enforce Read-Only architectures. By passing config: { methods: ["read"] } during MCP server creation, Truto will strip out all create, update, and delete tools. Claude can check run statuses and list results, but it physically cannot start new squids or spend credits.
- Tag Filtering: You can restrict the MCP server to specific functional domains. If you only want the agent to audit account health, you can filter tools to only include those tagged with accounts or billing, hiding the crawler and execution tools completely.
- Double Authentication: By enabling require_api_token_auth: true, the MCP server URL itself is no longer enough to execute tools. The connecting client (Claude) must pass a valid Truto API token in the header. This prevents unauthorized execution if the MCP URL is leaked in a config file.
- Auto-Expiring Servers: If you are provisioning an agent for a temporary scraping project, you can set an expires_at timestamp. Truto's underlying durable scheduling system will automatically purge the credentials and invalidate the server at the exact expiration time, leaving zero zombie access points.
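As an illustration, a request body for a read-only, time-limited server could be assembled like this. The placement of `expires_at` at the top level and its ISO 8601 format are assumptions inferred from the response shown earlier; tag filters and require_api_token_auth are left out because their exact position in the payload is not shown in this guide. `read_only_server_body` is a hypothetical helper.

```python
import json
from datetime import datetime, timedelta, timezone
from typing import Optional


def read_only_server_body(
    name: str, ttl_hours: int, now: Optional[datetime] = None
) -> dict:
    """Build a create-MCP-server body limited to read tools that expires after ttl_hours."""
    now = now or datetime.now(timezone.utc)
    return {
        "name": name,
        "config": {"methods": ["read"]},  # read-only: no create/update/delete tools
        # Assumed field placement and format; check the Truto API reference.
        "expires_at": (now + timedelta(hours=ttl_hours)).isoformat(),
    }


body = read_only_server_body("Lobstr Audit Agent", ttl_hours=24)
print(json.dumps(body, indent=2))
```

The resulting dictionary would be sent as the JSON body of the POST request to /integrated-account/:id/mcp shown in Method 2.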
Moving from Chat to Automation
Connecting Lobstr to Claude via an MCP server transitions your workflows from manual UI operations to intelligent, conversational automation. By utilizing Truto, you bypass the massive engineering overhead of translating Lobstr's asynchronous execution hierarchy and dynamic schemas into reliable LLM tools.
Instead of writing custom polling loops, tracking cursor pagination, and maintaining crawler schemas, your engineering team can focus on what matters: building superior AI agents that extract value from the web. Truto handles the API normalization; Claude handles the logic.
FAQ
- How do I connect Lobstr to Claude?
- You connect Lobstr to Claude by deploying a Model Context Protocol (MCP) server. Truto generates a managed, authenticated MCP URL for your Lobstr account, which you can paste into Claude Desktop's custom connectors or configure via the claude_desktop_config.json file.
- How does Truto handle Lobstr rate limits?
- Truto does not retry, throttle, or apply backoff on rate limit errors. When Lobstr returns an HTTP 429, Truto passes the error back to Claude and normalizes the rate limit info into standard headers (ratelimit-limit, ratelimit-remaining, ratelimit-reset). The client AI must handle its own retry logic.
- Can Claude wait for a Lobstr run to finish before getting results?
- Yes. Lobstr operates asynchronously. Claude will use the get_single_lobstr_run_stat_by_id tool to poll the status of an active run. Once the run's is_done flag returns true, Claude can call list_all_lobstr_results to fetch the scraped data.