POST /ai-agents/agent-builder/knowledge-base/website/scrape
Scrape Website
curl --request POST \
  --url https://{appid}.api-{region}.cometchat.io/v3/ai-agents/agent-builder/knowledge-base/website/scrape \
  --header 'Content-Type: application/json' \
  --header 'apikey: <api-key>' \
  --data '
{
  "url": "https://example.com",
  "maxDepth": 3,
  "maxPages": 5
}
'
{
  "success": true,
  "message": "Website scrape completed successfully",
  "data": {
    "crawlId": "crawl-1701789123456",
    "crawlDuration": 120000,
    "status": "completed",
    "sitemap": {
      "found": true,
      "url": "https://docs.example.com/sitemap.xml",
      "totalUrls": 245,
      "urls": [
        "<string>"
      ],
      "lastModified": "2025-12-05T10:30:00Z"
    }
  }
}

Authorizations

apikey
string
header
required

API key with the fullAccess scope (i.e., the REST API Key from the Dashboard).
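
As a minimal sketch, the same request as the curl sample can be assembled in Python with the standard library. The function name `build_scrape_request` and the values `my-app-id`, `us`, and `my-rest-api-key` are placeholders introduced here for illustration; they are not taken from this reference. The sketch only builds the request object and does not send it.

```python
import json
import urllib.request


def build_scrape_request(app_id: str, region: str, api_key: str,
                         payload: dict) -> urllib.request.Request:
    """Build (but do not send) the POST request for the scrape endpoint."""
    url = (
        f"https://{app_id}.api-{region}.cometchat.io"
        "/v3/ai-agents/agent-builder/knowledge-base/website/scrape"
    )
    return urllib.request.Request(
        url,
        data=json.dumps(payload).encode("utf-8"),
        # The API key travels in the `apikey` header, as in the curl sample.
        headers={"Content-Type": "application/json", "apikey": api_key},
        method="POST",
    )


# Placeholder values: substitute your own app ID, region, and REST API key.
request = build_scrape_request(
    "my-app-id", "us", "my-rest-api-key", {"url": "https://example.com"}
)
# To actually send it: urllib.request.urlopen(request)
```

Note that `urllib.request` normalizes header names internally (for example `apikey` is stored as `Apikey`); HTTP header names are case-insensitive, so this does not change the request's meaning.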

Body

application/json

Website crawling configuration

url
string
required

Target website URL to crawl

Example:

"https://docs.example.com"

maxDepth
number
default:3

Maximum depth to crawl from the starting URL

Required range: 1 <= x <= 10
Example:

5

maxPages
number
default:100

Maximum number of pages to crawl

Required range: 1 <= x <= 10000
Example:

500

include
string[]

URL patterns to include in crawling (substring matching)

Example:
["docs/", "api/", "guides/"]
exclude
string[]

URL patterns to exclude from crawling (substring matching)

Example:
["login", "signup", "admin", "privacy"]
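
The sketch below illustrates what substring matching means for `include` and `exclude`. It is a local approximation, not the service's implementation: the helper name `url_is_selected` is hypothetical, and two behaviors are assumptions that this reference does not state, namely that `exclude` takes precedence over `include` and that an empty `include` list admits every URL.

```python
def url_is_selected(url, include=None, exclude=None):
    """Approximate substring-based include/exclude filtering for one URL."""
    # Assumption: an exclude match always wins over an include match.
    if exclude and any(pattern in url for pattern in exclude):
        return False
    # Assumption: with no include patterns, every URL is eligible.
    if include:
        return any(pattern in url for pattern in include)
    return True


include = ["docs/", "api/", "guides/"]
exclude = ["login", "signup", "admin", "privacy"]

kept = url_is_selected("https://docs.example.com/docs/intro", include, exclude)
dropped = url_is_selected("https://docs.example.com/docs/administration", include, exclude)
```

Because matching is by substring, a short pattern such as `admin` also matches `/docs/administration`, so `dropped` is `False` under these assumptions. More specific patterns (for example `/admin/`) reduce accidental matches.
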
fetchSitemap
boolean
default:false

Fetch and return sitemap URLs from the website

Example:

true

crawlerType
string
default:firecrawl

Crawler service to use, such as firecrawl or puppeteer

Example:

"firecrawl"
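
The body parameters above can be checked on the client before sending. The following is a minimal sketch, assuming a hypothetical helper `build_scrape_payload` that applies the documented defaults and ranges; whether the server rejects out-of-range values, and how, is not stated in this reference.

```python
def build_scrape_payload(url, max_depth=3, max_pages=100, include=None,
                         exclude=None, fetch_sitemap=False,
                         crawler_type="firecrawl"):
    """Build the JSON body, enforcing the documented ranges locally."""
    if not url:
        raise ValueError("url is required")
    if not 1 <= max_depth <= 10:
        raise ValueError("maxDepth must be between 1 and 10")
    if not 1 <= max_pages <= 10000:
        raise ValueError("maxPages must be between 1 and 10000")

    payload = {
        "url": url,
        "maxDepth": max_depth,
        "maxPages": max_pages,
        "fetchSitemap": fetch_sitemap,
        "crawlerType": crawler_type,
    }
    # Optional pattern lists are only sent when provided.
    if include:
        payload["include"] = list(include)
    if exclude:
        payload["exclude"] = list(exclude)
    return payload


payload = build_scrape_payload(
    "https://docs.example.com",
    max_depth=5,
    max_pages=500,
    include=["docs/", "api/", "guides/"],
    fetch_sitemap=True,
)
```

Validating locally gives an immediate, descriptive error instead of a failed request.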

Response

Website scraped successfully

success
boolean
Example:

true

message
string
Example:

"Website scrape completed successfully"

data
object
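
A minimal sketch of reading the response follows, using the sample response shown earlier as input. The helper name `summarize_scrape_response` is hypothetical, and treating `success: false` as a failure carrying a `message` is an assumption, since this reference documents only the success shape.

```python
import json

# Sample body copied from the example response in this reference.
SAMPLE_RESPONSE = """
{
  "success": true,
  "message": "Website scrape completed successfully",
  "data": {
    "crawlId": "crawl-1701789123456",
    "crawlDuration": 120000,
    "status": "completed",
    "sitemap": {
      "found": true,
      "url": "https://docs.example.com/sitemap.xml",
      "totalUrls": 245,
      "urls": ["<string>"],
      "lastModified": "2025-12-05T10:30:00Z"
    }
  }
}
"""


def summarize_scrape_response(body: str) -> dict:
    """Extract the main fields from a scrape response body."""
    parsed = json.loads(body)
    # Assumption: a falsy `success` indicates failure, with details in `message`.
    if not parsed.get("success"):
        raise RuntimeError(parsed.get("message", "scrape failed"))
    data = parsed.get("data") or {}
    sitemap = data.get("sitemap") or {}
    return {
        "crawl_id": data.get("crawlId"),
        "status": data.get("status"),
        "crawl_duration": data.get("crawlDuration"),
        # Only report sitemap URLs when a sitemap was actually found.
        "sitemap_urls": sitemap.get("urls", []) if sitemap.get("found") else [],
    }


summary = summarize_scrape_response(SAMPLE_RESPONSE)
```

The `sitemap` object is read defensively because it may be absent when `fetchSitemap` is left at its default of `false`; that behavior is an assumption rather than something this reference confirms.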