# MCP Service Integration – Firecrawl Web Scraping and Deep Research
Provides powerful web scraping, search, and deep research capabilities via the MCP protocol.
The Firecrawl MCP Server accepts query text or URLs, uses Firecrawl's scraping engine to retrieve the most relevant web content, extracts text from a wide range of pages, and converts it to Markdown or HTML for downstream processing and generation by large language models (LLMs).
## Environment Variables
- `FIRECRAWL_API_KEY` (Required): Your Firecrawl API key.
- `FIRECRAWL_API_URL` (Optional): Custom API endpoint for self-hosted instances.
- `FIRECRAWL_RETRY_MAX_ATTEMPTS` (Optional): Maximum number of retry attempts (default: 3).
- `FIRECRAWL_RETRY_INITIAL_DELAY` (Optional): Delay before the first retry (default: 1000 ms).
- `FIRECRAWL_RETRY_MAX_DELAY` (Optional): Maximum delay between retries (default: 10000 ms).
- `FIRECRAWL_RETRY_BACKOFF_FACTOR` (Optional): Exponential backoff multiplier (default: 2).
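The retry variables combine as standard exponential backoff: each successive retry waits `initial × factor^(n−1)` milliseconds, capped at the maximum delay. A minimal sketch of the schedule they imply (the helper name is illustrative, not part of the server):

```python
def retry_delay_ms(attempt: int,
                   initial_ms: int = 1000,   # FIRECRAWL_RETRY_INITIAL_DELAY
                   factor: float = 2,        # FIRECRAWL_RETRY_BACKOFF_FACTOR
                   max_ms: int = 10000) -> int:
    """Delay before retry `attempt` (1-based), capped at max_ms."""
    return int(min(initial_ms * factor ** (attempt - 1), max_ms))

# With the defaults, the first retries wait 1000, 2000, 4000, 8000 ms,
# then the 10000 ms cap applies from the fifth retry onward.
print([retry_delay_ms(n) for n in range(1, 6)])  # [1000, 2000, 4000, 8000, 10000]
```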
## Usage Instructions
See the Firecrawl MCP Server reference documentation for full setup details.
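For MCP clients that launch servers from a JSON configuration (e.g. Claude Desktop's `claude_desktop_config.json`), an entry along these lines is typical; replace the placeholder with your own API key, and add the optional retry variables to `env` as needed:

```json
{
  "mcpServers": {
    "firecrawl": {
      "command": "npx",
      "args": ["-y", "firecrawl-mcp"],
      "env": {
        "FIRECRAWL_API_KEY": "YOUR-API-KEY"
      }
    }
  }
}
```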
## 🗺️ Feature List
| Tool Identifier | Function Description | Core Parameters |
|---|---|---|
| `firecrawl_scrape` | Scrape content from a single web page, with JavaScript rendering, and return cleaned text content. | `url` (page URL), `formats` (return formats, e.g. `["markdown"]`), `waitFor` (wait time in ms), `timeout` (timeout in ms), `mobile` (use mobile viewport) |
| `firecrawl_map` | Map a website to discover all indexed URLs. | `url` (website URL), `search` (optional search term), `ignoreSitemap` (ignore sitemap.xml), `includeSubdomains` (include subdomains), `limit` (maximum number of URLs) |
| `firecrawl_search` | Search the web for specified content and return matching results. | `query` (query text), `limit` (number of results to return), `lang` (language), `country` (country), `scrapeOptions` (scraping options) |
| `firecrawl_crawl` | Start an asynchronous crawl with multi-page extraction. | `url` (website URL), `excludePaths` (excluded paths), `includePaths` (included paths), `maxDepth` (maximum depth), `limit` (maximum number of pages), `allowExternalLinks` (allow external links), `deduplicateSimilarURLs` (deduplicate similar URLs) |
| `firecrawl_check_crawl_status` | Check the status of a crawl job. | `id` (crawl job ID) |
| `firecrawl_extract` | Extract structured data from web pages. | `urls` (list of page URLs), `prompt` (custom prompt for LLM extraction), `systemPrompt` (system prompt to guide the LLM), `schema` (JSON schema for structured data extraction), `allowExternalLinks` (allow extracting from external links), `enableWebSearch` (enable web search for additional context), `includeSubdomains` (include subdomains during extraction) |
| `firecrawl_deep_research` | Perform in-depth multi-source research and return a summary with sources. | `query` (query text), `maxDepth` (maximum depth), `timeLimit` (time limit in seconds), `maxUrls` (maximum number of URLs to analyze, default: 50) |
| `firecrawl_generate_llmstxt` | Generate an LLMs.txt file for a given domain. | `url` (base URL of the website to analyze), `maxUrls` (maximum number of URLs to include, default: 10), `showFullText` (include the content of llms-full.txt in the response) |
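As an illustration of the parameter columns above, an MCP `tools/call` for `firecrawl_scrape` might carry arguments like these (the URL and values are examples, not defaults):

```json
{
  "name": "firecrawl_scrape",
  "arguments": {
    "url": "https://example.com/article",
    "formats": ["markdown"],
    "waitFor": 2000,
    "timeout": 30000,
    "mobile": false
  }
}
```

The same shape applies to the other tools: put the tool identifier in `name` and the core parameters from the table in `arguments`.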
## Repository URL
https://github.com/mendableai/firecrawl-mcp-server