Crawler Report for docs.crawl4ai.com

Summary

Website Quality Score

Overall: 6.9 (Fair)
Performance: 10.0
SEO: 4.4
Security: 6.5
Accessibility: 5.0
Best Practices: 9.2
  • ⛔ Skipped URLs - 29 skipped URLs found.
  • ⛔ 404 CRITICAL - 6 non-existent pages found.
  • ⛔ 5 page(s) with multiple <h1> headings.
  • ⛔ 10 page(s) without <h1> heading.
  • ⛔ Security - 207 page(s) with critical finding(s).
  • ⚠️ The description '🚀🤖 Crawl4AI, Open-source LLM-Friendly Web Crawler & Scraper' exceeds the allowed 10% duplicity. 92% of pages have this same description.
  • ⚠️ 63 page(s) do not support Brotli compression.
  • ⚠️ No WebP image found on the website.
  • ⚠️ No AVIF image found on the website.
  • ⚠️ 66 page(s) with skipped heading levels.
  • ⚠️ 61 page(s) without form labels.
  • ⚠️ 63 page(s) without aria labels.
  • ⚠️ 62 page(s) without role attributes.
  • ⏩ Loaded robots.txt for domain 'docs.crawl4ai.com': status code 404, size 29 kB and took 578 ms.
  • ⏩ External URLs - 29 external URL(s) found.
  • ⏩ Redirects - 2 redirect(s) found.
  • ⏩ DNS IPv6: domain docs.crawl4ai.com does not support IPv6 (DNS server: 127.0.0.53).
  • ✅ SSL/TLS certificate is valid until Jun 8 12:36:45 2026 GMT. Issued by C = US, O = Let's Encrypt, CN = E8. Subject is CN = crawl4ai.com.
  • ✅ SSL/TLS certificate issued by 'C = US, O = Let's Encrypt, CN = E8'.
  • ✅ Performance OK - all non-media URLs are faster than 3 seconds.
  • ✅ HTTP headers - found 7 unique headers.
  • ✅ All 62 unique title(s) are within the allowed 10% duplicity. Highest duplicity title has 3%.
  • ✅ All pages have quoted attributes.
  • ✅ All pages have inline SVGs smaller than 5120 bytes.
  • ✅ All pages have inline SVGs with less than 5 duplicates.
  • ✅ All pages have valid inline SVGs or none.
  • ✅ All pages have DOM depth less than 30.
  • ✅ All pages have clickable (interactive) phone numbers.
  • ✅ All pages have valid HTML.
  • ✅ All pages have image alt attributes.
  • ✅ All pages have lang attribute.
  • ✅ DNS IPv4 OK: domain docs.crawl4ai.com resolved to 35.163.245.47 (DNS server: 127.0.0.53).

Visited URLs

Found 71 row(s).
URL | Status | Type | Time | Size | Cache
/ | 200 | HTML | 180 ms | 42 kB | ETag-only
/advanced/lazy-loading/ | 200 | HTML | 179 ms | 43 kB | ETag-only
/marketplace/ | 200 | HTML | 179 ms | 6 kB | ETag-only
/advanced/file-downloading/ | 200 | HTML | 179 ms | 49 kB | ETag-only
/advanced/network-console-capture/ | 200 | HTML | 179 ms | 62 kB | ETag-only
/core/simple-crawling/ | 200 | HTML | 250 ms | 54 kB | ETag-only
/core/self-hosting/ | 200 | HTML | 360 ms | 372 kB | ETag-only
/core/deep-crawling/ | 200 | HTML | 236 ms | 159 kB | ETag-only
/core/page-interaction/ | 200 | HTML | 224 ms | 86 kB | ETag-only
/extraction/clustring-strategies/ | 200 | HTML | 180 ms | 57 kB | ETag-only
/core/crawler-result/ | 200 | HTML | 203 ms | 80 kB | ETag-only
/core/markdown-generation/ | 200 | HTML | 180 ms | 98 kB | ETag-only
/core/content-selection/ | 200 | HTML | 180 ms | 115 kB | ETag-only
/advanced/adaptive-strategies/ | 200 | HTML | 179 ms | 77 kB | ETag-only
/blog/ | 200 | HTML | 179 ms | 36 kB | ETag-only
/api/arun/ | 200 | HTML | 179 ms | 65 kB | ETag-only
/api/strategies/ | 200 | HTML | 179 ms | 91 kB | ETag-only
/advanced/hooks-auth/ | 200 | HTML | 179 ms | 67 kB | ETag-only
/advanced/advanced-features/ | 200 | HTML | 180 ms | 95 kB | ETag-only
/core/c4a-script/ | 200 | HTML | 180 ms | 60 kB | ETag-only
/core/installation/ | 200 | HTML | 179 ms | 40 kB | ETag-only
/api/crawl-result/ | 200 | HTML | 180 ms | 95 kB | ETag-only
/advanced/proxy-security/ | 200 | HTML | 180 ms | 78 kB | ETag-only
/api/parameters/ | 200 | HTML | 180 ms | 105 kB | ETag-only
/core/browser-crawler-config/ | 200 | HTML | 180 ms | 87 kB | ETag-only
/core/cli/ | 200 | HTML | 179 ms | 61 kB | ETag-only
/advanced/session-management/ | 200 | HTML | 180 ms | 74 kB | ETag-only
/api/arun_many/ | 200 | HTML | 179 ms | 55 kB | ETag-only
/advanced/multi-url-crawling/ | 200 | HTML | 180 ms | 107 kB | ETag-only
/branding/ | 200 | HTML | 180 ms | 81 kB | ETag-only
/advanced/crawl-dispatcher/ | 200 | HTML | 179 ms | 31 kB | ETag-only
/extraction/llm-strategies/ | 200 | HTML | 180 ms | 73 kB | ETag-only
/core/ask-ai/ | 200 | HTML | 179 ms | 32 kB | ETag-only
/apps/llmtxt/ | 200 | HTML | 179 ms | 6 kB | ETag-only
/marketplace/admin/ | 200 | HTML | 179 ms | 10 kB | ETag-only
/core/url-seeding/ | 200 | HTML | 181 ms | 220 kB | ETag-only
/core/fit-markdown/ | 200 | HTML | 179 ms | 59 kB | ETag-only
/core/local-files/ | 200 | HTML | 179 ms | 60 kB | ETag-only
/advanced/virtual-scroll/ | 200 | HTML | 179 ms | 69 kB | ETag-only
/advanced/ssl-certificate/ | 200 | HTML | 179 ms | 49 kB | ETag-only
/extraction/no-llm-strategies/ | 200 | HTML | 181 ms | 163 kB | ETag-only
/extraction/chunking/ | 200 | HTML | 179 ms | 55 kB | ETag-only
/api/async-webcrawler/ | 200 | HTML | 180 ms | 75 kB | ETag-only
/core/adaptive-crawling/ | 200 | HTML | 179 ms | 69 kB | ETag-only
/api/c4a-script-reference/ | 200 | HTML | 180 ms | 79 kB | ETag-only
/advanced/anti-bot-and-fallback/ | 200 | HTML | 180 ms | 67 kB | ETag-only
/core/link-media/ | 200 | HTML | 180 ms | 142 kB | ETag-only
/apps/c4a-script/ | 200 | HTML | 179 ms | 9 kB | ETag-only
/CONTRIBUTING/ | 200 | HTML | 179 ms | 38 kB | ETag-only
/advanced/undetected-browser/ | 200 | HTML | 180 ms | 78 kB | ETag-only
/advanced/identity-based-crawling/ | 200 | HTML | 180 ms | 80 kB | ETag-only
/core/examples/ | 200 | HTML | 179 ms | 47 kB | ETag-only
/apps/ | 200 | HTML | 179 ms | 38 kB | ETag-only
/stats/ | 200 | HTML | 179 ms | 48 kB | ETag-only
/advanced/pdf-parsing/ | 200 | HTML | 180 ms | 65 kB | ETag-only
/core/cache-modes/ | 200 | HTML | 179 ms | 39 kB | ETag-only
/core/quickstart/ | 200 | HTML | 180 ms | 101 kB | ETag-only
/blog/articles/llm-context-revolution/ | 200 | HTML | 179 ms | 47 kB | ETag-only
/blog/articles/adaptive-crawling-revolution/ | 200 | HTML | 179 ms | 58 kB | ETag-only
/c4a-script/demo | 404 | HTML | 179 ms | 29 kB | ETag-only
/examples/c4a_script/ | 404 | HTML | 179 ms | 29 kB | ETag-only
/examples/c4a_script/tutorial/ | 404 | HTML | 179 ms | 29 kB | ETag-only
/docs/md_v2/apps/ | 404 | HTML | 179 ms | 29 kB | ETag-only
/docs/md_v2/assets/ | 404 | HTML | 179 ms | 29 kB | ETag-only
/api/parameters | 301 | Redirect | 179 ms | 147 B | None
/blog/articles/llm-context-revolution | 301 | Redirect | 179 ms | 191 B | None
/api/adaptive-crawler/ | 200 | HTML | 180 ms | 54 kB | ETag-only
/api/examples/c4a_script/tutorial/ | 404 | HTML | 179 ms | 29 kB | ETag-only
/apps/crawl4ai-assistant/ | 200 | HTML | 179 ms | 51 kB | ETag-only
/core/llmtxt/ | 200 | HTML | 179 ms | 31 kB | ETag-only
/api/digest/ | 200 | HTML | 179 ms | 49 kB | ETag-only

Best practices

Analysis name | OK | Notice | Warning | Critical
DOM depth (> 30) | 69 | 0 | 0 | 0
Heading structure610685
Title uniqueness (> 10%) | 62 | 0 | 0 | 0
Description uniqueness (> 10%) | 1 | 0 | 1 | 0
Brotli support | 0 | 0 | 63 | 0
WebP support | 0 | 0 | 1 | 0
AVIF support | 0 | 0 | 1 | 0

Large inline SVGs

No problems found.


Duplicate inline SVGs

No problems found.


Invalid inline SVGs

No problems found.


Missing quotes on attributes

No problems found.


DOM depth

No problems found.


Heading structure

Found 10 row(s).
Severity | Occurs | Detail | Affected URLs (max 5)
critical | 11 | Multiple <h1> headings found. | URL 1, URL 2, URL 3, URL 4, URL 5
critical | 10 | No <h1> tag found in the HTML content. | URL 1, URL 2, URL 3, URL 4, URL 5
warning | 47 | Heading structure is skipping levels: found an <h5> after an <h2>. | URL 1, URL 2, URL 3, URL 4, URL 5
warning | 8 | Heading structure is skipping levels: found an <h5> without a previous higher heading. | URL 1, URL 2, URL 3, URL 4, URL 5
warning | 6 | Heading structure is skipping levels: found an <h5> after an <h3>. | URL 1, URL 2, URL 3, URL 4, URL 5
warning | 3 | Heading structure is skipping levels: found an <h4> after an <h2>. | URL 1, URL 2, URL 3
warning | 3 | Heading structure is skipping levels: found an <h4> after an <h1>. | URL 1, URL 2, URL 3
warning | 2 | Heading structure is skipping levels: found an <h3> after an <h1>. | URL 1, URL 2
warning | 2 | Heading structure is skipping levels: found an <h2> without a previous higher heading. | URL 1, URL 2
warning | 2 | Heading structure is skipping levels: found an <h5> after an <h1>. | URL 1, URL 2
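
The skipped-level warnings in this table can be reproduced locally with a small check. The following is a minimal sketch, not the crawler's actual implementation: it assumes a heading may be at most one level deeper than the heading before it, and that a page should open with an <h1>. The function name find_skipped_levels is hypothetical.

```python
def find_skipped_levels(levels):
    """Return one message per heading that skips a level.

    `levels` is the sequence of heading levels in document order,
    e.g. [1, 2, 5] for <h1>, <h2>, <h5>.
    Assumed rule (not taken from the crawler): a heading may be at most
    one level deeper than the previous heading, and the first heading
    on the page should be an <h1>.
    """
    problems = []
    previous = None
    for level in levels:
        if previous is None:
            if level != 1:
                problems.append(
                    f"found an <h{level}> without a previous higher heading"
                )
        elif level > previous + 1:
            problems.append(f"found an <h{level}> after an <h{previous}>")
        previous = level
    return problems


if __name__ == "__main__":
    # <h1>, <h2>, then a jump straight to <h5>
    for message in find_skipped_levels([1, 2, 5]):
        print(message)
```

Moving back up the hierarchy (for example an <h2> after an <h4>) is not reported under this assumption, because only jumps to a deeper level can skip one.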

Non-clickable phone numbers

No problems found.


Title uniqueness

No problems found.


Description uniqueness

No problems found.

Accessibility

Analysis name | OK | Notice | Warning | Critical
Missing aria labels10298014
Missing html lang attribute | 1 | 0 | 0 | 0
Missing form labels | 0 | 0 | 7 | 0
Missing image alt attributes | 17 | 0 | 0 | 0
Missing roles | 0 | 0 | 8 | 0

Valid HTML

No problems found.


Missing image alt attributes

No problems found.


Missing form labels

Severity | Occurs | Detail | Affected URLs (max 5)
warning | 58 | <input class="form-*" id="mkdocs-search-query" *** > | URL 1, URL 2, URL 3, URL 4, URL 5
warning | 1 | <input id="password" *** > | /marketplace/admin/
warning | 1 | <input class="search-*" id="apps-search" *** > | /marketplace/admin/
warning | 1 | <input id="edit-value" *** > | /apps/c4a-script/
warning | 1 | <input id="edit-selector" *** > | /apps/c4a-script/
warning | 1 | <input id="search-input" *** > | /marketplace/
warning | 1 | <input class="search-*" id="articles-search" *** > | /marketplace/admin/
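
A check of this kind can be approximated with the Python standard library. This is a minimal sketch under stated assumptions, not the crawler's own logic: it treats an <input> as labelled only when some <label for="..."> names its id, ignores hidden inputs, and does not handle inputs wrapped inside a <label>. The names LabelAudit and unlabelled_inputs are hypothetical.

```python
from html.parser import HTMLParser


class LabelAudit(HTMLParser):
    """Record <input> ids and the ids referenced by <label for="...">."""

    def __init__(self):
        super().__init__()
        self.input_ids = []         # id of each visible <input>, in document order
        self.label_targets = set()  # ids named by a <label for="...">

    def handle_starttag(self, tag, attrs):
        attributes = dict(attrs)
        if tag == "label" and attributes.get("for"):
            self.label_targets.add(attributes["for"])
        elif tag == "input" and attributes.get("type") != "hidden":
            # An input without an id is recorded as None and can never
            # be matched by a <label for="...">.
            self.input_ids.append(attributes.get("id"))


def unlabelled_inputs(html):
    """Return the ids of inputs that no <label for="..."> points at."""
    audit = LabelAudit()
    audit.feed(html)
    return [i for i in audit.input_ids if i not in audit.label_targets]


if __name__ == "__main__":
    sample = (
        '<label for="userName">Name</label><input id="userName" name="name">'
        '<input class="form-control" id="mkdocs-search-query">'
    )
    print(unlabelled_inputs(sample))
```

The 58 occurrences of mkdocs-search-query come from a single shared search field, so a label added once in the theme template would likely clear most of this table.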

Missing aria labels

Found 186 row(s).
Severity | Occurs | Detail | Affected URLs (max 5)
critical | 1 | <select class="filter-*" id="apps-filter"> | /marketplace/admin/
critical | 1 | <input class="search-*" id="apps-search" *** > | /marketplace/admin/
critical | 1 | <select id="useCase" name="useCase"> | /apps/crawl4ai-assistant/
critical | 1 | <textarea id="c4a-editor" *** > | /apps/c4a-script/
critical | 1 | <input id="userName" name="name" *** > | /apps/crawl4ai-assistant/
critical | 1 | <input id="userEmail" name="email" *** > | /apps/crawl4ai-assistant/
critical | 1 | <input id="edit-value" *** > | /apps/c4a-script/
critical | 1 | <select id="edit-command-type" *** > | /apps/c4a-script/
critical | 1 | <select class="mini-*" id="type-filter"> | /marketplace/
critical | 1 | <input id="search-input" *** > | /marketplace/
critical | 1 | <input id="userCompany" name="company" *** > | /apps/crawl4ai-assistant/
critical | 1 | <input id="edit-selector" *** > | /apps/c4a-script/
critical | 1 | <input id="password" *** > | /marketplace/admin/
critical | 1 | <input class="search-*" id="articles-search" *** > | /marketplace/admin/
critical | 1 | <select id="edit-direction"> | /apps/c4a-script/
warning | 3427 | <a class="terminal-*" *** > | URL 1, URL 2, URL 3, URL 4, URL 5
warning | 957 | <a id="__codelineno-0-***" name="__codelineno-0-***" *** > | URL 1, URL 2, URL 3, URL 4, URL 5
warning | 860 | <a id="__codelineno-1-***" name="__codelineno-1-***" *** > | URL 1, URL 2, URL 3, URL 4, URL 5
warning | 745 | <a ***> | URL 1, URL 2, URL 3, URL 4, URL 5
warning | 717 | <a id="__codelineno-3-***" name="__codelineno-3-***" *** > | URL 1, URL 2, URL 3, URL 4, URL 5
warning | 686 | <a id="__codelineno-4-***" name="__codelineno-4-***" *** > | URL 1, URL 2, URL 3, URL 4, URL 5
warning | 637 | <a id="__codelineno-5-***" name="__codelineno-5-***" *** > | URL 1, URL 2, URL 3, URL 4, URL 5
warning | 607 | <a id="__codelineno-2-***" name="__codelineno-2-***" *** > | URL 1, URL 2, URL 3, URL 4, URL 5
warning | 579 | <a id="__codelineno-8-***" name="__codelineno-8-***" *** > | URL 1, URL 2, URL 3, URL 4, URL 5
warning | 544 | <a id="__codelineno-6-***" name="__codelineno-6-***" *** > | URL 1, URL 2, URL 3, URL 4, URL 5
warning | 400 | <a class="menu-*" *** > | URL 1, URL 2, URL 3, URL 4, URL 5
warning | 398 | <a id="__codelineno-11-***" name="__codelineno-11-***" *** > | URL 1, URL 2, URL 3, URL 4, URL 5
warning | 336 | <a id="__codelineno-7-***" name="__codelineno-7-***" *** > | URL 1, URL 2, URL 3, URL 4, URL 5
warning | 299 | <a id="__codelineno-10-***" name="__codelineno-10-***" *** > | URL 1, URL 2, URL 3, URL 4, URL 5
warning | 286 | <a id="__codelineno-9-***" name="__codelineno-9-***" *** > | URL 1, URL 2, URL 3, URL 4, URL 5
warning | 266 | <a id="__codelineno-24-***" name="__codelineno-24-***" *** > | URL 1, URL 2, URL 3, URL 4, URL 5
warning | 201 | <a id="__codelineno-13-***" name="__codelineno-13-***" *** > | URL 1, URL 2, URL 3, URL 4, URL 5
warning | 196 | <a id="__codelineno-12-***" name="__codelineno-12-***" *** > | URL 1, URL 2, URL 3, URL 4, URL 5
warning | 193 | <a id="__codelineno-14-***" name="__codelineno-14-***" *** > | URL 1, URL 2, URL 3, URL 4, URL 5
warning | 165 | <a id="__codelineno-15-***" name="__codelineno-15-***" *** > | URL 1, URL 2, URL 3, URL 4, URL 5
warning | 131 | <a id="__codelineno-16-***" name="__codelineno-16-***" *** > | URL 1, URL 2, URL 3, URL 4, URL 5
warning | 116 | <a id="__codelineno-19-***" name="__codelineno-19-***" *** > | URL 1, URL 2, URL 3, URL 4, URL 5
warning | 112 | <a id="__codelineno-22-***" name="__codelineno-22-***" *** > | URL 1, URL 2, URL 3, URL 4, URL 5
warning | 91 | <a id="__codelineno-17-***" name="__codelineno-17-***" *** > | URL 1, URL 2, URL 3, URL 4, URL 5
warning | 89 | <a id="__codelineno-18-***" name="__codelineno-18-***" *** > | URL 1, URL 2, URL 3, URL 4, URL 5
warning | 82 | <a id="__codelineno-71-***" name="__codelineno-71-***" *** > | URL 1, URL 2
warning | 78 | <a id="__codelineno-20-***" name="__codelineno-20-***" *** > | URL 1, URL 2, URL 3, URL 4, URL 5
warning | 74 | <a id="__codelineno-21-***" name="__codelineno-21-***" *** > | URL 1, URL 2, URL 3, URL 4, URL 5
warning | 69 | <a id="__codelineno-38-***" name="__codelineno-38-***" *** > | URL 1, URL 2
warning | 68 | <a id="__codelineno-23-***" name="__codelineno-23-***" *** > | URL 1, URL 2, URL 3, URL 4, URL 5
warning | 67 | <a id="__codelineno-43-***" name="__codelineno-43-***" *** > | URL 1, URL 2
warning | 66 | <a id="__codelineno-44-***" name="__codelineno-44-***" *** > | URL 1, URL 2
warning | 63 | <a id="__codelineno-105-***" name="__codelineno-105-***" *** > | /core/self-hosting/
warning | 62 | <a id="__codelineno-57-***" name="__codelineno-57-***" *** > | URL 1, URL 2
warning | 58 | <a class="no-*" *** > | URL 1, URL 2, URL 3, URL 4, URL 5
warning | 58 | <button class="close btn btn-* btn-*" *** > | URL 1, URL 2, URL 3, URL 4, URL 5
warning | 55 | <a id="__codelineno-32-***" name="__codelineno-32-***" *** > | URL 1, URL 2, URL 3
warning | 52 | <a id="__codelineno-77-***" name="__codelineno-77-***" *** > | /core/self-hosting/
warning | 50 | <a id="__codelineno-33-***" name="__codelineno-33-***" *** > | URL 1, URL 2, URL 3
warning | 46 | <a id="__codelineno-26-***" name="__codelineno-26-***" *** > | URL 1, URL 2, URL 3, URL 4
warning | 44 | <a id="__codelineno-72-***" name="__codelineno-72-***" *** > | URL 1, URL 2
warning | 42 | <a id="__codelineno-36-***" name="__codelineno-36-***" *** > | URL 1, URL 2
warning | 40 | <a id="__codelineno-31-***" name="__codelineno-31-***" *** > | URL 1, URL 2, URL 3
warning | 37 | <a id="__codelineno-58-***" name="__codelineno-58-***" *** > | URL 1, URL 2
warning | 37 | <a id="__codelineno-35-***" name="__codelineno-35-***" *** > | URL 1, URL 2
warning | 35 | <a id="__codelineno-75-***" name="__codelineno-75-***" *** > | /core/self-hosting/
warning | 31 | <a id="__codelineno-42-***" name="__codelineno-42-***" *** > | URL 1, URL 2
warning | 31 | <a id="__codelineno-37-***" name="__codelineno-37-***" *** > | URL 1, URL 2
warning | 31 | <a id="__codelineno-59-***" name="__codelineno-59-***" *** > | URL 1, URL 2
warning | 30 | <a id="__codelineno-29-***" name="__codelineno-29-***" *** > | URL 1, URL 2, URL 3
warning | 29 | <a id="__codelineno-34-***" name="__codelineno-34-***" *** > | URL 1, URL 2, URL 3
warning | 29 | <a id="__codelineno-91-***" name="__codelineno-91-***" *** > | /core/self-hosting/
warning | 28 | <a id="__codelineno-68-***" name="__codelineno-68-***" *** > | URL 1, URL 2
warning | 28 | <a id="__codelineno-41-***" name="__codelineno-41-***" *** > | URL 1, URL 2
warning | 27 | <a id="__codelineno-76-***" name="__codelineno-76-***" *** > | /core/self-hosting/
warning | 27 | <a id="__codelineno-83-***" name="__codelineno-83-***" *** > | /core/self-hosting/
warning | 24 | <a id="__codelineno-30-***" name="__codelineno-30-***" *** > | URL 1, URL 2, URL 3
warning | 23 | <a id="__codelineno-73-***" name="__codelineno-73-***" *** > | /core/self-hosting/
warning | 23 | <a id="__codelineno-25-***" name="__codelineno-25-***" *** > | URL 1, URL 2, URL 3, URL 4
warning | 22 | <a id="__codelineno-101-***" name="__codelineno-101-***" *** > | /core/self-hosting/
warning | 20 | <a id="__codelineno-46-***" name="__codelineno-46-***" *** > | URL 1, URL 2
warning | 19 | <a id="__codelineno-80-***" name="__codelineno-80-***" *** > | /core/self-hosting/
warning | 19 | <a id="__codelineno-55-***" name="__codelineno-55-***" *** > | URL 1, URL 2
warning | 18 | <a id="__codelineno-100-***" name="__codelineno-100-***" *** > | /core/self-hosting/
warning | 17 | <a id="__codelineno-61-***" name="__codelineno-61-***" *** > | URL 1, URL 2
warning | 17 | <a id="__codelineno-102-***" name="__codelineno-102-***" *** > | /core/self-hosting/
warning | 17 | <a id="__codelineno-27-***" name="__codelineno-27-***" *** > | URL 1, URL 2, URL 3
warning | 17 | <a id="__codelineno-70-***" name="__codelineno-70-***" *** > | URL 1, URL 2
warning | 17 | <a id="__codelineno-49-***" name="__codelineno-49-***" *** > | URL 1, URL 2
warning | 16 | <a id="__codelineno-92-***" name="__codelineno-92-***" *** > | /core/self-hosting/
warning | 16 | <a id="__codelineno-85-***" name="__codelineno-85-***" *** > | /core/self-hosting/
warning | 15 | <a id="__codelineno-63-***" name="__codelineno-63-***" *** > | URL 1, URL 2
warning | 14 | <a id="__codelineno-69-***" name="__codelineno-69-***" *** > | URL 1, URL 2
warning | 13 | <a id="__codelineno-28-***" name="__codelineno-28-***" *** > | URL 1, URL 2, URL 3
warning | 13 | <a id="__codelineno-54-***" name="__codelineno-54-***" *** > | URL 1, URL 2
warning | 12 | <a id="__codelineno-67-***" name="__codelineno-67-***" *** > | URL 1, URL 2
warning | 12 | <a id="__codelineno-40-***" name="__codelineno-40-***" *** > | URL 1, URL 2
warning | 11 | <button class="copy-*" *** > | URL 1, URL 2
warning | 11 | <a id="__codelineno-53-***" name="__codelineno-53-***" *** > | URL 1, URL 2
warning | 11 | <a id="__codelineno-60-***" name="__codelineno-60-***" *** > | URL 1, URL 2
warning | 11 | <a id="__codelineno-74-***" name="__codelineno-74-***" *** > | /core/self-hosting/
warning | 10 | <a id="__codelineno-39-***" name="__codelineno-39-***" *** > | URL 1, URL 2
warning | 10 | <a id="__codelineno-107-***" name="__codelineno-107-***" *** > | /core/self-hosting/
warning | 10 | <a id="__codelineno-64-***" name="__codelineno-64-***" *** > | URL 1, URL 2
warning | 9 | <a id="__codelineno-56-***" name="__codelineno-56-***" *** > | URL 1, URL 2
You have reached the limit of 100 rows as a protection against very large output or exhausted memory.

Missing roles

Found 14 row(s).
Severity | Occurs | Detail | Affected URLs (max 5)
warning | 113 | <nav> | URL 1, URL 2, URL 3, URL 4, URL 5
warning | 58 | <main id="terminal-mkdocs-main-content"> | URL 1, URL 2, URL 3, URL 4, URL 5
warning | 58 | <nav class="terminal-*"> | URL 1, URL 2, URL 3, URL 4, URL 5
warning | 58 | <aside id="terminal-mkdocs-side-panel"> | URL 1, URL 2, URL 3, URL 4, URL 5
warning | 58 | <header class="terminal-*"> | URL 1, URL 2, URL 3, URL 4, URL 5
warning | 2 | <nav class="nav-*"> | URL 1, URL 2
warning | 1 | <aside class="admin-*"> | /marketplace/admin/
warning | 1 | <header class="admin-*"> | /marketplace/admin/
warning | 1 | <header class="marketplace-*"> | /marketplace/
warning | 1 | <main class="magazine-*"> | /marketplace/
warning | 1 | <footer class="marketplace-*"> | /marketplace/
warning | 1 | <footer class="footer"> | /apps/crawl4ai-assistant/
warning | 1 | <nav class="sidebar-*"> | /marketplace/admin/
warning | 1 | <main class="admin-*"> | /marketplace/admin/

Missing html lang attribute

No problems found.

Security

Header | OK | Notice | Warning | Critical | Recommendation
Strict-Transport-Security | 0 | 0 | 0 | 69 | Strict-Transport-Security header is not set. It enforces secure connections and protects against MITM attacks.
Content-Security-Policy | 0 | 0 | 0 | 69 | Content-Security-Policy header is not set. It restricts resources the page can load and prevents XSS attacks.
Server | 0 | 0 | 0 | 69 | Server header is set to 'nginx/1.24.0 (Ubuntu)'. It is better not to reveal the technologies used and especially their versions.
X-Frame-Options | 0 | 0 | 69 | 0 | X-Frame-Options header is not set. It prevents clickjacking attacks when set to 'deny' or 'sameorigin'.
X-Content-Type-Options | 0 | 0 | 69 | 0 | X-Content-Type-Options header is not set. It stops MIME type sniffing and mitigates content type attacks.
Referrer-Policy | 0 | 0 | 69 | 0 | Referrer-Policy header is not set. It controls referrer header sharing and enhances privacy and security.
Feature-Policy | 0 | 0 | 69 | 0 | Feature-Policy header is not set. It allows enabling/disabling browser APIs and features for security. Not important if Permissions-Policy is set.
Permissions-Policy | 0 | 0 | 69 | 0 | Permissions-Policy header is not set. It allows enabling/disabling browser APIs and features for security.
X-XSS-Protection | 69 | 0 | 0 | 0 |

Security headers

Severity | Occurs | Detail | Affected URLs (max 5)
critical | 69 | Strict-Transport-Security header is not set. It enforces secure connections and protects against MITM attacks. | URL 1, URL 2, URL 3, URL 4, URL 5
critical | 69 | Content-Security-Policy header is not set. It restricts resources the page can load and prevents XSS attacks. | URL 1, URL 2, URL 3, URL 4, URL 5
critical | 69 | Server header is set to 'nginx/1.24.0 (Ubuntu)'. It is better not to reveal the technologies used and especially their versions. | URL 1, URL 2, URL 3, URL 4, URL 5
warning | 69 | X-Frame-Options header is not set. It prevents clickjacking attacks when set to 'deny' or 'sameorigin'. | URL 1, URL 2, URL 3, URL 4, URL 5
warning | 69 | Permissions-Policy header is not set. It allows enabling/disabling browser APIs and features for security. | URL 1, URL 2, URL 3, URL 4, URL 5
warning | 69 | X-Content-Type-Options header is not set. It stops MIME type sniffing and mitigates content type attacks. | URL 1, URL 2, URL 3, URL 4, URL 5
warning | 69 | Referrer-Policy header is not set. It controls referrer header sharing and enhances privacy and security. | URL 1, URL 2, URL 3, URL 4, URL 5
warning | 69 | Feature-Policy header is not set. It allows enabling/disabling browser APIs and features for security. Not important if Permissions-Policy is set. | URL 1, URL 2, URL 3, URL 4, URL 5
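
These header findings can be re-checked against any captured response once the server configuration changes. The following is a rough sketch, assuming the response headers are already available as a plain mapping. The function name audit_headers and the RECOMMENDED_HEADERS list are illustrative choices, not part of the crawler. Feature-Policy is left out because the report treats it as unimportant when Permissions-Policy is set, and the version check is a simple heuristic that flags any digit in the Server value.

```python
# Headers the report flags as missing. Feature-Policy is omitted on purpose:
# the report notes it is not important if Permissions-Policy is set.
RECOMMENDED_HEADERS = [
    "Strict-Transport-Security",
    "Content-Security-Policy",
    "X-Frame-Options",
    "X-Content-Type-Options",
    "Referrer-Policy",
    "Permissions-Policy",
]


def audit_headers(headers):
    """Return (missing, leaks_version) for a mapping of response headers.

    `missing` lists the recommended headers that are absent, compared
    case-insensitively. `leaks_version` is True when the Server header
    contains a digit, a heuristic for an exposed version number.
    """
    present = {name.lower(): value for name, value in headers.items()}
    missing = [h for h in RECOMMENDED_HEADERS if h.lower() not in present]
    server = present.get("server", "")
    leaks_version = any(ch.isdigit() for ch in server)
    return missing, leaks_version


if __name__ == "__main__":
    # Header value copied from the report; the rest of the response is assumed.
    missing, leaks = audit_headers({"Server": "nginx/1.24.0 (Ubuntu)"})
    print("Missing:", ", ".join(missing))
    print("Server header exposes a version:", leaks)
```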

TOP non-unique titles

Count | Title
2 | Browser, Crawler & LLM Config - Crawl4AI Documentation (v0.8.x)

TOP non-unique descriptions

Count | Description
58 | 🚀🤖 Crawl4AI, Open-source LLM-Friendly Web Crawler & Scraper
5 | (no description)
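
The 92% duplicity figure in the summary is consistent with 58 of the 63 pages in the SEO metadata table sharing one description. A minimal sketch of that arithmetic follows, assuming the share is the count of the most common non-empty description divided by the number of pages. The function name top_duplicate_share is hypothetical and the crawler's exact formula is not shown in this report.

```python
from collections import Counter


def top_duplicate_share(descriptions):
    """Return (description, count, share) for the most common non-empty value.

    `share` is the count divided by the total number of pages, including
    pages whose description is empty.
    """
    counts = Counter(d for d in descriptions if d)
    description, count = counts.most_common(1)[0]
    return description, count, count / len(descriptions)


if __name__ == "__main__":
    shared = "🚀🤖 Crawl4AI, Open-source LLM-Friendly Web Crawler & Scraper"
    # 58 pages with the shared description and 5 pages with none.
    pages = [shared] * 58 + [""] * 5
    description, count, share = top_duplicate_share(pages)
    print(f"{count} of {len(pages)} pages ({share:.0%}) share one description")
```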

SEO metadata

Found 63 row(s).
URL | Indexing | Title | H1 | Description | Keywords
/ | Allowed | Home - Crawl4AI Documentation (v0.8.x) | 🚀🤖 Crawl4AI: Open-Source LLM-Friendly Web Crawler & Scraper | 🚀🤖 Crawl4AI, Open-source LLM-Friendly Web Crawler & Scraper |
/CONTRIBUTING/ | Allowed | Contributing Guide - Crawl4AI Documentation (v0.8.x) | Contributing to Crawl4AI | 🚀🤖 Crawl4AI, Open-source LLM-Friendly Web Crawler & Scraper |
/advanced/adaptive-strategies/ | Allowed | Adaptive Strategies - Crawl4AI Documentation (v0.8.x) | Advanced Adaptive Strategies | 🚀🤖 Crawl4AI, Open-source LLM-Friendly Web Crawler & Scraper |
/advanced/advanced-features/ | Allowed | Overview - Crawl4AI Documentation (v0.8.x) | Overview of Some Important Advanced Features | 🚀🤖 Crawl4AI, Open-source LLM-Friendly Web Crawler & Scraper |
/advanced/anti-bot-and-fallback/ | Allowed | Anti-Bot & Fallback - Crawl4AI Documentation (v0.8.x) | Anti-Bot Detection & Fallback | 🚀🤖 Crawl4AI, Open-source LLM-Friendly Web Crawler & Scraper |
/advanced/crawl-dispatcher/ | Allowed | Crawl Dispatcher - Crawl4AI Documentation (v0.8.x) | Crawl Dispatcher | 🚀🤖 Crawl4AI, Open-source LLM-Friendly Web Crawler & Scraper |
/advanced/file-downloading/ | Allowed | File Downloading - Crawl4AI Documentation (v0.8.x) | Download Handling in Crawl4AI | 🚀🤖 Crawl4AI, Open-source LLM-Friendly Web Crawler & Scraper |
/advanced/hooks-auth/ | Allowed | Hooks & Auth - Crawl4AI Documentation (v0.8.x) | Hooks & Auth in AsyncWebCrawler | 🚀🤖 Crawl4AI, Open-source LLM-Friendly Web Crawler & Scraper |
/advanced/identity-based-crawling/ | Allowed | Identity Based Crawling - Crawl4AI Documentation (v0.8.x) | Preserve Your Identity with Crawl4AI | 🚀🤖 Crawl4AI, Open-source LLM-Friendly Web Crawler & Scraper |
/advanced/lazy-loading/ | Allowed | Lazy Loading - Crawl4AI Documentation (v0.8.x) | Missing H1 | 🚀🤖 Crawl4AI, Open-source LLM-Friendly Web Crawler & Scraper |
/advanced/multi-url-crawling/ | Allowed | Multi-URL Crawling - Crawl4AI Documentation (v0.8.x) | Advanced Multi-URL Crawling with Dispatchers | 🚀🤖 Crawl4AI, Open-source LLM-Friendly Web Crawler & Scraper |
/advanced/network-console-capture/ | Allowed | Network & Console Capture - Crawl4AI Documentation (v0.8.x) | Network Requests & Console Message Capturing | 🚀🤖 Crawl4AI, Open-source LLM-Friendly Web Crawler & Scraper |
/advanced/pdf-parsing/ | Allowed | PDF Parsing - Crawl4AI Documentation (v0.8.x) | PDF Processing Strategies | 🚀🤖 Crawl4AI, Open-source LLM-Friendly Web Crawler & Scraper |
/advanced/proxy-security/ | Allowed | Proxy & Security - Crawl4AI Documentation (v0.8.x) | Proxy & Security | 🚀🤖 Crawl4AI, Open-source LLM-Friendly Web Crawler & Scraper |
/advanced/session-management/ | Allowed | Session Management - Crawl4AI Documentation (v0.8.x) | Session Management | 🚀🤖 Crawl4AI, Open-source LLM-Friendly Web Crawler & Scraper |
/advanced/ssl-certificate/ | Allowed | SSL Certificate - Crawl4AI Documentation (v0.8.x) | SSLCertificate Reference | 🚀🤖 Crawl4AI, Open-source LLM-Friendly Web Crawler & Scraper |
/advanced/undetected-browser/ | Allowed | Undetected Browser - Crawl4AI Documentation (v0.8.x) | Undetected Browser Mode | 🚀🤖 Crawl4AI, Open-source LLM-Friendly Web Crawler & Scraper |
/advanced/virtual-scroll/ | Allowed | Virtual Scroll - Crawl4AI Documentation (v0.8.x) | Virtual Scroll | 🚀🤖 Crawl4AI, Open-source LLM-Friendly Web Crawler & Scraper |
/api/adaptive-crawler/ | Allowed | AdaptiveCrawler - Crawl4AI Documentation (v0.8.x) | AdaptiveCrawler | 🚀🤖 Crawl4AI, Open-source LLM-Friendly Web Crawler & Scraper |
/api/arun/ | Allowed | arun() - Crawl4AI Documentation (v0.8.x) | arun() Parameter Guide (New Approach) | 🚀🤖 Crawl4AI, Open-source LLM-Friendly Web Crawler & Scraper |
/api/arun_many/ | Allowed | arun_many() - Crawl4AI Documentation (v0.8.x) | arun_many(...) Reference | 🚀🤖 Crawl4AI, Open-source LLM-Friendly Web Crawler & Scraper |
/api/async-webcrawler/ | Allowed | AsyncWebCrawler - Crawl4AI Documentation (v0.8.x) | AsyncWebCrawler | 🚀🤖 Crawl4AI, Open-source LLM-Friendly Web Crawler & Scraper |
/api/c4a-script-reference/ | Allowed | C4A-Script Reference - Crawl4AI Documentation (v0.8.x) | C4A-Script API Reference | 🚀🤖 Crawl4AI, Open-source LLM-Friendly Web Crawler & Scraper |
/api/crawl-result/ | Allowed | CrawlResult - Crawl4AI Documentation (v0.8.x) | CrawlResult Reference | 🚀🤖 Crawl4AI, Open-source LLM-Friendly Web Crawler & Scraper |
/api/digest/ | Allowed | digest() - Crawl4AI Documentation (v0.8.x) | digest() | 🚀🤖 Crawl4AI, Open-source LLM-Friendly Web Crawler & Scraper |
/api/parameters/ | Allowed | Browser, Crawler & LLM Config - Crawl4AI Documentation (v0.8.x) | 1. BrowserConfig – Controlling the Browser | 🚀🤖 Crawl4AI, Open-source LLM-Friendly Web Crawler & Scraper |
/api/strategies/ | Allowed | Strategies - Crawl4AI Documentation (v0.8.x) | Extraction & Chunking Strategies API | 🚀🤖 Crawl4AI, Open-source LLM-Friendly Web Crawler & Scraper |
/apps/ | Allowed | Demo Apps - Crawl4AI Documentation (v0.8.x) | 🚀 Crawl4AI Interactive Apps | 🚀🤖 Crawl4AI, Open-source LLM-Friendly Web Crawler & Scraper |
/apps/c4a-script/ | Allowed | C4A-Script Interactive Tutorial | Crawl4AI | Missing H1 | |
/apps/crawl4ai-assistant/ | Allowed | Crawl4AI Assistant - Chrome Extension for Visual Web Scraping | Crawl4AI Assistant | |
/apps/llmtxt/ | Allowed | Crawl4AI LLM Context Builder | Crawl4AI LLM Context Builder | |
/blog/ | Allowed | Blog Home - Crawl4AI Documentation (v0.8.x) | Crawl4AI Blog | 🚀🤖 Crawl4AI, Open-source LLM-Friendly Web Crawler & Scraper |
/blog/articles/adaptive-crawling-revolution/ | Allowed | Adaptive Crawling: Building Dynamic Knowledge That Grows on Demand - Crawl4AI Documentation (v0.8.x) | Adaptive Crawling: Building Dynamic Knowledge That Grows on Demand | 🚀🤖 Crawl4AI, Open-source LLM-Friendly Web Crawler & Scraper |
/blog/articles/llm-context-revolution/ | Allowed | The LLM Context Protocol: Why Your AI Assistant Needs Memory, Reasoning, and Examples - Crawl4AI Documentation (v0.8.x) | The LLM Context Protocol: Why Your AI Assistant Needs Memory, Reasoning, and Examples | 🚀🤖 Crawl4AI, Open-source LLM-Friendly Web Crawler & Scraper |
/branding/ | Allowed | Brand Book - Crawl4AI Documentation (v0.8.x) | 🎨 Crawl4AI Brand Book | 🚀🤖 Crawl4AI, Open-source LLM-Friendly Web Crawler & Scraper |
/core/adaptive-crawling/ | Allowed | Adaptive Crawling - Crawl4AI Documentation (v0.8.x) | Adaptive Web Crawling | 🚀🤖 Crawl4AI, Open-source LLM-Friendly Web Crawler & Scraper |
/core/ask-ai/ | Allowed | Ask AI - Crawl4AI Documentation (v0.8.x) | Missing H1 | 🚀🤖 Crawl4AI, Open-source LLM-Friendly Web Crawler & Scraper |
/core/browser-crawler-config/ | Allowed | Browser, Crawler & LLM Config - Crawl4AI Documentation (v0.8.x) | Browser, Crawler & LLM Configuration (Quick Overview) | 🚀🤖 Crawl4AI, Open-source LLM-Friendly Web Crawler & Scraper |
/core/c4a-script/ | Allowed | C4A-Script - Crawl4AI Documentation (v0.8.x) | C4A-Script: Visual Web Automation Made Simple | 🚀🤖 Crawl4AI, Open-source LLM-Friendly Web Crawler & Scraper |
/core/cache-modes/ | Allowed | Cache Modes - Crawl4AI Documentation (v0.8.x) | Crawl4AI Cache System and Migration Guide | 🚀🤖 Crawl4AI, Open-source LLM-Friendly Web Crawler & Scraper |
/core/cli/ | Allowed | Command Line Interface - Crawl4AI Documentation (v0.8.x) | Crawl4AI CLI Guide | 🚀🤖 Crawl4AI, Open-source LLM-Friendly Web Crawler & Scraper |
/core/content-selection/ | Allowed | Content Selection - Crawl4AI Documentation (v0.8.x) | Content Selection | 🚀🤖 Crawl4AI, Open-source LLM-Friendly Web Crawler & Scraper |
/core/crawler-result/ | Allowed | Crawler Result - Crawl4AI Documentation (v0.8.x) | Crawl Result and Output | 🚀🤖 Crawl4AI, Open-source LLM-Friendly Web Crawler & Scraper |
/core/deep-crawling/ | Allowed | Deep Crawling - Crawl4AI Documentation (v0.8.x) | Deep Crawling | 🚀🤖 Crawl4AI, Open-source LLM-Friendly Web Crawler & Scraper |
/core/examples/ | Allowed | Code Examples - Crawl4AI Documentation (v0.8.x) | Code Examples | 🚀🤖 Crawl4AI, Open-source LLM-Friendly Web Crawler & Scraper |
/core/fit-markdown/ | Allowed | Fit Markdown - Crawl4AI Documentation (v0.8.x) | Fit Markdown with Pruning & BM25 | 🚀🤖 Crawl4AI, Open-source LLM-Friendly Web Crawler & Scraper |
/core/installation/ | Allowed | Installation - Crawl4AI Documentation (v0.8.x) | Installation & Setup (2023 Edition) | 🚀🤖 Crawl4AI, Open-source LLM-Friendly Web Crawler & Scraper |
/core/link-media/ | Allowed | Link & Media - Crawl4AI Documentation (v0.8.x) | Link & Media | 🚀🤖 Crawl4AI, Open-source LLM-Friendly Web Crawler & Scraper |
/core/llmtxt/ | Allowed | Llmtxt - Crawl4AI Documentation (v0.8.x) | Missing H1 | 🚀🤖 Crawl4AI, Open-source LLM-Friendly Web Crawler & Scraper |
/core/local-files/ | Allowed | Local Files & Raw HTML - Crawl4AI Documentation (v0.8.x) | Prefix-Based Input Handling in Crawl4AI | 🚀🤖 Crawl4AI, Open-source LLM-Friendly Web Crawler & Scraper |
/core/markdown-generation/ | Allowed | Markdown Generation - Crawl4AI Documentation (v0.8.x) | Markdown Generation Basics | 🚀🤖 Crawl4AI, Open-source LLM-Friendly Web Crawler & Scraper |
/core/page-interaction/ | Allowed | Page Interaction - Crawl4AI Documentation (v0.8.x) | Page Interaction | 🚀🤖 Crawl4AI, Open-source LLM-Friendly Web Crawler & Scraper |
/core/quickstart/ | Allowed | Quick Start - Crawl4AI Documentation (v0.8.x) | Getting Started with Crawl4AI | 🚀🤖 Crawl4AI, Open-source LLM-Friendly Web Crawler & Scraper |
/core/self-hosting/ | Allowed | Self-Hosting Guide - Crawl4AI Documentation (v0.8.x) | Self-Hosting Crawl4AI 🚀 | 🚀🤖 Crawl4AI, Open-source LLM-Friendly Web Crawler & Scraper |
/core/simple-crawling/ | Allowed | Simple Crawling - Crawl4AI Documentation (v0.8.x) | Simple Crawling | 🚀🤖 Crawl4AI, Open-source LLM-Friendly Web Crawler & Scraper |
/core/url-seeding/ | Allowed | URL Seeding - Crawl4AI Documentation (v0.8.x) | URL Seeding: The Smart Way to Crawl at Scale | 🚀🤖 Crawl4AI, Open-source LLM-Friendly Web Crawler & Scraper |
/extraction/chunking/ | Allowed | Chunking - Crawl4AI Documentation (v0.8.x) | Chunking Strategies | 🚀🤖 Crawl4AI, Open-source LLM-Friendly Web Crawler & Scraper |
/extraction/clustring-strategies/ | Allowed | Clustering Strategies - Crawl4AI Documentation (v0.8.x) | Cosine Strategy | 🚀🤖 Crawl4AI, Open-source LLM-Friendly Web Crawler & Scraper |
/extraction/llm-strategies/ | Allowed | LLM Strategies - Crawl4AI Documentation (v0.8.x) | Extracting JSON (LLM) | 🚀🤖 Crawl4AI, Open-source LLM-Friendly Web Crawler & Scraper |
/extraction/no-llm-strategies/ | Allowed | LLM-Free Strategies - Crawl4AI Documentation (v0.8.x) | Extracting JSON (No LLM) | 🚀🤖 Crawl4AI, Open-source LLM-Friendly Web Crawler & Scraper |
/marketplace/ | Allowed | Marketplace - Crawl4AI | [ Marketplace ] | |
/marketplace/admin/ | Allowed | Admin Dashboard - Crawl4AI Marketplace | [ Admin Access ] | |
/stats/ | Allowed | Growth - Crawl4AI Documentation (v0.8.x) | Growth | 🚀🤖 Crawl4AI, Open-source LLM-Friendly Web Crawler & Scraper |

OpenGraph metadata

No URLs with OpenGraph data (og:* or twitter:* meta tags).


Heading structure

Found 63 row(s).
Heading structure | Count | Errors | URL
  • <h1> Extracting JSON (No LLM) [#extracting-json-no-llm]
    • <h2> 1. Intro to Schema-Based Extraction [#1-intro-to-schema-based-extraction]
    • <h2> 2. Simple Example: Crypto Prices [#2-simple-example-crypto-prices]
      • <h3> XPath Example with raw:// HTML [#xpath-example-with-raw-html]
    • <h2> 3. Advanced Schema & Nested Structures [#3-advanced-schema-nested-structures]
      • <h3> Sample E-Commerce HTML [#sample-e-commerce-html]
      • <h3> Running the Extraction [#running-the-extraction]
    • <h2> 4. RegexExtractionStrategy - Fast Pattern-Based Extraction [#4-regexextractionstrategy-fast-pattern-based-extraction]
      • <h3> Key Features [#key-features]
      • <h3> Simple Example: Extracting Common Entities [#simple-example-extracting-common-entities]
      • <h3> Available Built-in Patterns [#available-built-in-patterns]
      • <h3> Custom Pattern Example [#custom-pattern-example]
      • <h3> LLM-Assisted Pattern Generation [#llm-assisted-pattern-generation]
      • <h3> Extraction Results Format [#extraction-results-format]
    • <h2> 5. Why "No LLM" Is Often Better [#5-why-no-llm-is-often-better]
    • <h2> 6. Base Element Attributes & Additional Fields [#6-base-element-attributes-additional-fields]
    • <h2> 7. Putting It All Together: Larger Example [#7-putting-it-all-together-larger-example]
    • <h2> 8. Extracting Sibling Data with source [#sibling-data]
      • <h3> Syntax [#syntax]
      • <h3> Example: Hacker News [#example-hacker-news]
    • <h2> 9. Tips & Best Practices [#9-tips-best-practices]
    • <h2> 10. Schema Generation Utility [#10-schema-generation-utility]
      • <h3> Using the Schema Generator [#using-the-schema-generator]
      • <h3> Schema Validation [#schema-validation]
      • <h3> Token Usage Tracking [#token-usage-tracking]
      • <h3> LLM Provider Options [#llm-provider-options]
      • <h3> Benefits of Schema Generation [#benefits-of-schema-generation]
      • <h3> Best Practices [#best-practices]
      • <h3> Multi-Sample Schema Generation [#multi-sample-schema-generation]
    • <h2> HTML Sample 2 (Product B): [#html-sample-2-product-b]
    • <h2> HTML Sample 3 (Product C): [#html-sample-3-product-c]
  • <h1> Provide instructions for stable selectors [#provide-instructions-for-stable-selectors]
  • <h1> Generate schema with multi-sample awareness [#generate-schema-with-multi-sample-awareness]
  • <h1> The generated schema will use stable selectors like: [#the-generated-schema-will-use-stable-selectors-like]
  • <h1> a[href*="/m/"] instead of tr:nth-child(6) td a [#ahrefm-instead-of-trnth-child6-td-a]
    • <h2> 11. Conclusion [#11-conclusion]
Count: 36 | Errors: 5 | URL: /extraction/no-llm-strategies/
  • <h2> Handling Lazy-Loaded Images [#handling-lazy-loaded-images]
    • <h3> Example: Ensuring Lazy Images Appear [#example-ensuring-lazy-images-appear]
  • <h2> Combining with Other Link & Media Filters [#combining-with-other-link-media-filters]
  • <h2> Tips & Troubleshooting [#tips-troubleshooting]
Count: 4 | Errors: 4 | URL: /advanced/lazy-loading/
  • <h2> Welcome to C4A-Script Tutorial!
  • <h2> C4A-Script Editor
    • <h3> Recording Timeline
  • <h2> Playground
Count: 4 | Errors: 4 | URL: /apps/c4a-script/
  • <h1> Crawl4AI Assistant
    • <h3> You don't need Puppeteer. You need Crawl4AI Cloud.
    • <h3> Click2Crawl
    • <h3> Script Builder (Alpha)
    • <h3> Markdown Extraction (New!)
    • <h2> Quick Start
    • <h2> Explore Our Tools
      • <h3> Click2Crawl
      • <h3> Script Builder
      • <h3> Markdown Extraction
      • <h3> 🎯 Click2Crawl
      • <h3> 🔴 Script Builder
      • <h3> 📝 Markdown Extraction
    • <h2> See the Generated Code & Extracted Data
    • <h2> Crawl4AI Cloud
      • <h3> 🚀 Join C4AI Cloud Waiting List
      • <h3> Data Uploaded Successfully!
    • <h2> More Features Coming Soon
      • <h3> Direct Data Download
      • <h3> Smart Field Detection
Count: 20 | Errors: 4 | URL: /apps/crawl4ai-assistant/
  • <h1> 1. BrowserConfig – Controlling the Browser [#1-browserconfig-controlling-the-browser]
    • <h2> 1.1 Parameter Highlights [#11-parameter-highlights]
  • <h1> 2. CrawlerRunConfig – Controlling Each Crawl [#2-crawlerrunconfig-controlling-each-crawl]
    • <h2> 2.1 Parameter Highlights [#21-parameter-highlights]
      • <h3> A) Content Processing [#a-content-processing]
      • <h3> B) Browser Location and Identity [#b-browser-location-and-identity]
      • <h3> C) Caching & Session [#c-caching-session]
      • <h3> D) Page Navigation & Timing [#d-page-navigation-timing]
      • <h3> E) Page Interaction [#e-page-interaction]
      • <h3> F) Media Handling [#f-media-handling]
      • <h3> G) Link/Domain Handling [#g-linkdomain-handling]
      • <h3> H) Debug, Logging & Network Monitoring [#h-debug-logging-network-monitoring]
      • <h3> I) Connection & HTTP Parameters [#i-connection-http-parameters]
      • <h3> J) Virtual Scroll Configuration [#j-virtual-scroll-configuration]
      • <h3> K) URL Matching Configuration [#k-url-matching-configuration]
      • <h3> L) Advanced Crawling Features [#l-advanced-crawling-features]
    • <h2> 2.2 Helper Methods [#22-helper-methods]
      • <h3> Class-Level Defaults (set_defaults / get_defaults / reset_defaults) [#class-level-defaults-set_defaults-get_defaults-reset_defaults]
    • <h2> 2.3 Example Usage [#23-example-usage]
    • <h2> 2.4 Compliance & Ethics [#24-compliance-ethics]
  • <h1> 3. LLMConfig - Setting up LLM providers [#3-llmconfig-setting-up-llm-providers]
    • <h2> 3.1 Parameters [#31-parameters]
    • <h2> 3.2 Example Usage [#32-example-usage]
    • <h2> 4. Putting It All Together [#4-putting-it-all-together]
Count: 24 | Errors: 3 | URL: /api/parameters/
  • <h1> 🎨 Crawl4AI Brand Book [#crawl4ai-brand-book]
  • <h1> Crawl4AI Brand Guidelines
    • <h2> 📖 About This Guide [#about-this-guide]
    • <h2> Color System
      • <h3> Primary Colors [#primary-colors]
      • <h3> Background Colors [#background-colors]
      • <h3> Text Colors [#text-colors]
      • <h3> Semantic Colors [#semantic-colors]
    • <h2> Typography
      • <h3> Font Family [#font-family]
      • <h3> Type Scale [#type-scale]
  • <h1> The Quick Brown Fox Jumps Over
    • <h2> Advanced Web Scraping Features
      • <h3> Installation and Setup Guide
    • <h2> Components
      • <h3> Buttons [#buttons]
      • <h3> Primary Button
      • <h3> Secondary Button
      • <h3> Accent Button
      • <h3> Ghost Button
      • <h3> Badges & Status Indicators [#badges-status-indicators]
      • <h3> Status Badges
      • <h3> Cards [#cards]
      • <h3> 🎨 C4A-Script Editor
      • <h3> 🧠 LLM Context Builder
      • <h3> Terminal Window [#terminal-window]
    • <h2> Spacing & Layout
      • <h3> Spacing System [#spacing-system]
      • <h3> Layout Patterns [#layout-patterns]
    • <h2> Usage Guidelines
      • <h3> When to Use Each Style [#when-to-use-each-style]
      • <h3> Do's and Don'ts [#dos-and-donts]
    • <h2> Accessibility
      • <h3> Color Contrast [#color-contrast]
      • <h3> Focus States [#focus-states]
      • <h3> Motion [#motion]
    • <h2> CSS Variables
    • <h2> Resources
      • <h3> Download Assets [#download-assets]
      • <h3> Reference Files [#reference-files]
      • <h3> Questions? [#questions]
      • <h3> 🎨 Keep It Terminal
Count: 42 | Errors: 3 | URL: /branding/
  • <h1> Prefix-Based Input Handling in Crawl4AI [#prefix-based-input-handling-in-crawl4ai]
    • <h2> Crawling a Web URL [#crawling-a-web-url]
    • <h2> Crawling a Local HTML File [#crawling-a-local-html-file]
    • <h2> Crawling Raw HTML Content [#crawling-raw-html-content]
  • <h1> Complete Example [#complete-example]
  • <h1> Conclusion [#conclusion]
Count: 6 | Errors: 3 | URL: /core/local-files/
  • <h1> Chunking Strategies [#chunking-strategies]
    • <h3> Why Use Chunking? [#why-use-chunking]
    • <h3> Methods of Chunking [#methods-of-chunking]
    • <h3> Combining Chunking with Cosine Similarity [#combining-chunking-with-cosine-similarity]
Count: 4 | Errors: 3 | URL: /extraction/chunking/
  • <h1> [ Admin Access ]
  • <h1> [ Admin Dashboard ]
    • <h2> Dashboard Overview
      • <h3> Quick Actions
    • <h2> Apps Management
    • <h2> Articles Management
    • <h2> Categories Management
    • <h2> Sponsors Management
    • <h2> Add/Edit [#modal-title]
Count: 9 | Errors: 2 | URL: /marketplace/admin/
  • <h1> 🚀🤖 Crawl4AI: Open-Source LLM-Friendly Web Crawler & Scraper [#crawl4ai-open-source-llm-friendly-web-crawler-scraper]
    • <h2> 🆕 AI Assistant Skill Now Available! [#ai-assistant-skill-now-available]
      • <h3> 🤖 Crawl4AI Skill for Claude & AI Assistants
    • <h2> 🎯 New: Adaptive Web Crawling [#new-adaptive-web-crawling]
    • <h2> Quick Start [#quick-start]
    • <h2> Video Tutorial [#video-tutorial]
    • <h2> What Does Crawl4AI Do? [#what-does-crawl4ai-do]
    • <h2> Documentation Structure [#documentation-structure]
    • <h2> How You Can Support [#how-you-can-support]
    • <h2> Quick Links [#quick-links]
Count: 10 | Errors: 0 | URL: /
  • <h1> [ Marketplace ]
    • <h2> > Latest Apps
    • <h2> > Latest Articles
    • <h2> # Trending
      • <h3> + Submit Your Tool
    • <h2> > More Apps
      • <h3> About Marketplace
      • <h3> Become a Sponsor
Count: 8 | Errors: 0 | URL: /marketplace/
  • <h1> Download Handling in Crawl4AI [#download-handling-in-crawl4ai]
    • <h2> Enabling Downloads [#enabling-downloads]
    • <h2> Specifying Download Location [#specifying-download-location]
    • <h2> Triggering Downloads [#triggering-downloads]
    • <h2> Accessing Downloaded Files [#accessing-downloaded-files]
    • <h2> Example: Downloading Multiple Files [#example-downloading-multiple-files]
    • <h2> Important Considerations [#important-considerations]
Count: 7 | Errors: 0 | URL: /advanced/file-downloading/
  • <h1> Network Requests & Console Message Capturing [#network-requests-console-message-capturing]
    • <h2> Configuration [#configuration]
    • <h2> Example Usage [#example-usage]
    • <h2> Captured Data Structure [#captured-data-structure]
      • <h3> Network Requests [#network-requests]
      • <h3> Console Messages [#console-messages]
    • <h2> Key Benefits [#key-benefits]
    • <h2> Use Cases [#use-cases]
Count: 8 | Errors: 0 | URL: /advanced/network-console-capture/
  • <h1> Simple Crawling [#simple-crawling]
    • <h2> Basic Usage [#basic-usage]
    • <h2> Understanding the Response [#understanding-the-response]
    • <h2> Adding Basic Options [#adding-basic-options]
    • <h2> Handling Errors [#handling-errors]
    • <h2> Logging and Debugging [#logging-and-debugging]
    • <h2> Complete Example [#complete-example]
Count: 7 | Errors: 0 | URL: /core/simple-crawling/
  • <h1> Self-Hosting Crawl4AI 🚀 [#self-hosting-crawl4ai]
    • <h2> Why Self-Host? [#why-self-host]
    • <h2> Table of Contents [#table-of-contents]
    • <h2> Prerequisites [#prerequisites]
    • <h2> Installation [#installation]
      • <h3> Option 1: Using Pre-built Docker Hub Images (Recommended) [#option-1-using-pre-built-docker-hub-images-recommended]
      • <h3> Option 2: Using Docker Compose [#option-2-using-docker-compose]
      • <h3> Option 3: Manual Local Build & Run [#option-3-manual-local-build-run]
    • <h2> MCP (Model Context Protocol) Support [#mcp-model-context-protocol-support]
      • <h3> What is MCP? [#what-is-mcp]
      • <h3> Connecting via MCP [#connecting-via-mcp]
      • <h3> Using with Claude Code [#using-with-claude-code]
      • <h3> Available MCP Tools [#available-mcp-tools]
      • <h3> Testing MCP Connections [#testing-mcp-connections]
      • <h3> MCP Schemas [#mcp-schemas]
    • <h2> Additional API Endpoints [#additional-api-endpoints]
      • <h3> HTML Extraction Endpoint [#html-extraction-endpoint]
      • <h3> Screenshot Endpoint [#screenshot-endpoint]
      • <h3> PDF Export Endpoint [#pdf-export-endpoint]
      • <h3> JavaScript Execution Endpoint [#javascript-execution-endpoint]
    • <h2> User-Provided Hooks API [#user-provided-hooks-api]
      • <h3> Hook Information Endpoint [#hook-information-endpoint]
      • <h3> Available Hook Points [#available-hook-points]
      • <h3> Using Hooks in Requests [#using-hooks-in-requests]
      • <h3> Hook Examples with Real URLs [#hook-examples-with-real-urls]
      • <h3> Security Best Practices [#security-best-practices]
      • <h3> Hook Response Information [#hook-response-information]
      • <h3> Error Handling [#error-handling]
      • <h3> Complete Example: Safe Multi-Hook Crawling [#complete-example-safe-multi-hook-crawling]
      • <h3> Hooks Utility: Function-Based Approach (Python) [#hooks-utility-function-based-approach-python]
    • <h2> Job Queue & Webhook API [#job-queue-webhook-api]
      • <h3> Why Use the Job Queue API? [#why-use-the-job-queue-api]
      • <h3> Available Endpoints [#available-endpoints]
      • <h3> Webhook Configuration [#webhook-configuration]
      • <h3> Usage Examples [#usage-examples]
      • <h3> Webhook Best Practices [#webhook-best-practices]
      • <h3> Use Cases [#use-cases]
      • <h3> Troubleshooting [#troubleshooting]
    • <h2> Dockerfile Parameters [#dockerfile-parameters]
      • <h3> Build Arguments Explained [#build-arguments-explained]
      • <h3> Build Best Practices [#build-best-practices]
    • <h2> Using the API [#using-the-api]
      • <h3> Playground Interface [#playground-interface]
      • <h3> Python SDK [#python-sdk]
      • <h3> Second Approach: Direct API Calls [#second-approach-direct-api-calls]
      • <h3> LLM Configuration Examples [#llm-configuration-examples]
      • <h3> REST API Examples [#rest-api-examples]
    • <h2> Real-time Monitoring & Operations [#real-time-monitoring-operations]
      • <h3> Monitoring Dashboard [#monitoring-dashboard]
      • <h3> Monitor API Endpoints [#monitor-api-endpoints]
      • <h3> WebSocket Streaming [#websocket-streaming]
      • <h3> Control Actions [#control-actions]
      • <h3> Production Integration [#production-integration]
      • <h3> Quick Health Check [#quick-health-check]
    • <h2> Server Configuration [#server-configuration]
      • <h3> Understanding config.yml [#understanding-configyml]
      • <h3> Customizing Your Configuration [#customizing-your-configuration]
      • <h3> Configuration Recommendations [#configuration-recommendations]
    • <h2> Getting Help [#getting-help]
    • <h2> Summary [#summary]
Count: 60 | Errors: 0 | URL: /core/self-hosting/
  • <h1> Deep Crawling [#deep-crawling]
    • <h2> 1. Quick Example [#1-quick-example]
    • <h2> 2. Understanding Deep Crawling Strategy Options [#2-understanding-deep-crawling-strategy-options]
      • <h3> 2.1 BFSDeepCrawlStrategy (Breadth-First Search) [#21-bfsdeepcrawlstrategy-breadth-first-search]
      • <h3> 2.2 DFSDeepCrawlStrategy (Depth-First Search) [#22-dfsdeepcrawlstrategy-depth-first-search]
      • <h3> 2.3 BestFirstCrawlingStrategy (⭐️ - Recommended Deep crawl strategy) [#23-bestfirstcrawlingstrategy-recommended-deep-crawl-strategy]
    • <h2> 3. Streaming vs. Non-Streaming Results [#3-streaming-vs-non-streaming-results]
      • <h3> 3.1 Non-Streaming Mode (Default) [#31-non-streaming-mode-default]
      • <h3> 3.2 Streaming Mode [#32-streaming-mode]
    • <h2> 4. Filtering Content with Filter Chains [#4-filtering-content-with-filter-chains]
      • <h3> 4.1 Basic URL Pattern Filter [#41-basic-url-pattern-filter]
      • <h3> 4.2 Combining Multiple Filters [#42-combining-multiple-filters]
      • <h3> 4.3 Available Filter Types [#43-available-filter-types]
    • <h2> 5. Using Scorers for Prioritized Crawling [#5-using-scorers-for-prioritized-crawling]
      • <h3> 5.1 KeywordRelevanceScorer [#51-keywordrelevancescorer]
    • <h2> 6. Advanced Filtering Techniques [#6-advanced-filtering-techniques]
      • <h3> 6.1 SEO Filter for Quality Assessment [#61-seo-filter-for-quality-assessment]
      • <h3> 6.2 Content Relevance Filter [#62-content-relevance-filter]
    • <h2> 7. Building a Complete Advanced Crawler [#7-building-a-complete-advanced-crawler]
    • <h2> 8. Limiting and Controlling Crawl Size [#8-limiting-and-controlling-crawl-size]
      • <h3> 8.1 Using max_pages [#81-using-max_pages]
      • <h3> 8.2 Using score_threshold [#82-using-score_threshold]
    • <h2> 9. Common Pitfalls & Tips [#9-common-pitfalls-tips]
    • <h2> 10. Crash Recovery for Long-Running Crawls [#10-crash-recovery-for-long-running-crawls]
      • <h3> 10.1 Enabling State Persistence [#101-enabling-state-persistence]
      • <h3> 10.2 State Structure [#102-state-structure]
      • <h3> 10.3 Resuming from a Checkpoint [#103-resuming-from-a-checkpoint]
      • <h3> 10.4 Manual State Export [#104-manual-state-export]
      • <h3> 10.5 Complete Example: Redis-Based Recovery [#105-complete-example-redis-based-recovery]
      • <h3> 10.6 Zero Overhead [#106-zero-overhead]
    • <h2> 11. Cancellation Support for Deep Crawls [#11-cancellation-support-for-deep-crawls]
      • <h3> 11.1 Two Ways to Cancel [#111-two-ways-to-cancel]
      • <h3> 11.2 Checking Cancellation Status [#112-checking-cancellation-status]
      • <h3> 11.3 State Notifications Include Cancelled Flag [#113-state-notifications-include-cancelled-flag]
      • <h3> 11.4 Key Behaviors [#114-key-behaviors]
      • <h3> 11.5 Complete Example: Cloud Platform Job Cancellation [#115-complete-example-cloud-platform-job-cancellation]
      • <h3> 11.6 Supported Strategies [#116-supported-strategies]
    • <h2> 12. Prefetch Mode for Fast URL Discovery [#12-prefetch-mode-for-fast-url-discovery]
      • <h3> 12.1 Enabling Prefetch Mode [#121-enabling-prefetch-mode]
      • <h3> 12.2 What Gets Skipped [#122-what-gets-skipped]
      • <h3> 12.3 Performance Benefit [#123-performance-benefit]
      • <h3> 12.4 Two-Phase Crawling Pattern [#124-two-phase-crawling-pattern]
      • <h3> 12.5 Use Cases [#125-use-cases]
    • <h2> 13. Summary & Next Steps [#13-summary-next-steps]
Count: 44 | Errors: 0 | URL: /core/deep-crawling/
  • <h1> Page Interaction [#page-interaction]
    • <h2> 1. JavaScript Execution [#1-javascript-execution]
      • <h3> Basic Execution [#basic-execution]
      • <h3> Execution Order [#execution-order]
    • <h2> 2. Wait Conditions [#2-wait-conditions]
      • <h3> 2.1 CSS-Based Waiting [#21-css-based-waiting]
      • <h3> 2.2 JavaScript-Based Waiting [#22-javascript-based-waiting]
    • <h2> 3. Handling Dynamic Content [#3-handling-dynamic-content]
      • <h3> 3.1 Load More Example (Hacker News “More” Link) [#31-load-more-example-hacker-news-more-link]
      • <h3> 3.2 Form Interaction [#32-form-interaction]
    • <h2> 4. Timing Control [#4-timing-control]
    • <h2> 5. Multi-Step Interaction Example [#5-multi-step-interaction-example]
    • <h2> 6. Combine Interaction with Extraction [#6-combine-interaction-with-extraction]
    • <h2> 7. Shadow DOM Flattening [#7-shadow-dom-flattening]
    • <h2> 8. Relevant CrawlerRunConfig Parameters [#8-relevant-crawlerrunconfig-parameters]
    • <h2> 9. Conclusion [#9-conclusion]
    • <h2> 10. Virtual Scrolling [#10-virtual-scrolling]
      • <h3> Virtual Scroll vs JavaScript Scrolling [#virtual-scroll-vs-javascript-scrolling]
Count: 18 | Errors: 0 | URL: /core/page-interaction/
  • <h1> Cosine Strategy [#cosine-strategy]
    • <h2> How It Works [#how-it-works]
    • <h2> Basic Usage [#basic-usage]
    • <h2> Configuration Options [#configuration-options]
      • <h3> Core Parameters [#core-parameters]
      • <h3> Parameter Details [#parameter-details]
    • <h2> Use Cases [#use-cases]
      • <h3> 1. Article Content Extraction [#1-article-content-extraction]
      • <h3> 2. Product Review Analysis [#2-product-review-analysis]
      • <h3> 3. Technical Documentation [#3-technical-documentation]
    • <h2> Advanced Features [#advanced-features]
      • <h3> Custom Clustering [#custom-clustering]
      • <h3> Content Filtering Pipeline [#content-filtering-pipeline]
    • <h2> Best Practices [#best-practices]
    • <h2> Error Handling [#error-handling]
Count: 15 | Errors: 0 | URL: /extraction/clustring-strategies/
  • <h1> Crawl Result and Output [#crawl-result-and-output]
    • <h2> 1. The CrawlResult Model [#1-the-crawlresult-model]
      • <h3> Table: Key Fields in CrawlResult [#table-key-fields-in-crawlresult]
    • <h2> 2. HTML Variants [#2-html-variants]
      • <h3> html: Raw HTML [#html-raw-html]
      • <h3> cleaned_html: Sanitized [#cleaned_html-sanitized]
    • <h2> 3. Markdown Generation [#3-markdown-generation]
      • <h3> 3.1 markdown [#31-markdown]
      • <h3> 3.2 Basic Example with a Markdown Generator [#32-basic-example-with-a-markdown-generator]
    • <h2> 4. Structured Extraction: extracted_content [#4-structured-extraction-extracted_content]
      • <h3> Example: CSS Extraction with raw:// HTML [#example-css-extraction-with-raw-html]
    • <h2> 5. More Fields: Links, Media, Tables and More [#5-more-fields-links-media-tables-and-more]
      • <h3> 5.1 links [#51-links]
      • <h3> 5.2 media [#52-media]
      • <h3> 5.3 tables [#53-tables]
      • <h3> Accessing Table data: [#accessing-table-data]
      • <h3> Configuring Table Extraction: [#configuring-table-extraction]
      • <h3> Table Extraction Tips [#table-extraction-tips]
      • <h3> 5.4 screenshot, pdf, and mhtml [#54-screenshot-pdf-and-mhtml]
      • <h3> 5.5 ssl_certificate [#55-ssl_certificate]
    • <h2> 6. Accessing These Fields [#6-accessing-these-fields]
    • <h2> 7. Next Steps [#7-next-steps]
Count: 22 | Errors: 0 | URL: /core/crawler-result/
  • <h1> Markdown Generation Basics [#markdown-generation-basics]
    • <h2> 1. Quick Example [#1-quick-example]
    • <h2> 2. How Markdown Generation Works [#2-how-markdown-generation-works]
      • <h3> 2.1 HTML-to-Text Conversion (Forked & Modified) [#21-html-to-text-conversion-forked-modified]
      • <h3> 2.2 Link Citations & References [#22-link-citations-references]
      • <h3> 2.3 Optional Content Filters [#23-optional-content-filters]
    • <h2> 3. Configuring the Default Markdown Generator [#3-configuring-the-default-markdown-generator]
    • <h2> 4. Selecting the HTML Source for Markdown Generation [#4-selecting-the-html-source-for-markdown-generation]
      • <h3> HTML Source Options [#html-source-options]
      • <h3> When to Use Each Option [#when-to-use-each-option]
    • <h2> 5. Content Filters [#5-content-filters]
      • <h3> 5.1 BM25ContentFilter [#51-bm25contentfilter]
      • <h3> 5.2 PruningContentFilter [#52-pruningcontentfilter]
      • <h3> 5.3 LLMContentFilter [#53-llmcontentfilter]
    • <h2> 6. Using Fit Markdown [#6-using-fit-markdown]
    • <h2> 7. The MarkdownGenerationResult Object [#7-the-markdowngenerationresult-object]
    • <h2> 8. Combining Filters (BM25 + Pruning) in Two Passes [#8-combining-filters-bm25-pruning-in-two-passes]
      • <h3> Two-Pass Example [#two-pass-example]
      • <h3> What’s Happening? [#whats-happening]
      • <h3> Tips & Variations [#tips-variations]
      • <h3> One-Pass Combination? [#one-pass-combination]
    • <h2> 9. Common Pitfalls & Tips [#9-common-pitfalls-tips]
    • <h2> 10. Summary & Next Steps [#10-summary-next-steps]
Count: 23 | Errors: 0 | URL: /core/markdown-generation/
  • <h1> Content Selection [#content-selection]
    • <h2> 1. CSS-Based Selection [#1-css-based-selection]
      • <h3> 1.1 Using css_selector [#11-using-css_selector]
      • <h3> 1.2 Using target_elements [#12-using-target_elements]
    • <h2> 2. Content Filtering & Exclusions [#2-content-filtering-exclusions]
      • <h3> 2.1 Basic Overview [#21-basic-overview]
      • <h3> 2.2 Example Usage [#22-example-usage]
    • <h2> 3. Handling Iframes [#3-handling-iframes]
    • <h2> 3.1 Flattening Shadow DOM [#31-flattening-shadow-dom]
    • <h2> 4. Structured Extraction Examples [#4-structured-extraction-examples]
      • <h3> 4.1 Pattern-Based with JsonCssExtractionStrategy [#41-pattern-based-with-jsoncssextractionstrategy]
      • <h3> 4.2 LLM-Based Extraction [#42-llm-based-extraction]
    • <h2> 5. Comprehensive Example [#5-comprehensive-example]
    • <h2> 6. Scraping Modes [#6-scraping-modes]
      • <h3> Performance Considerations [#performance-considerations]
      • <h3> Backward Compatibility [#backward-compatibility]
    • <h2> 7. Combining CSS Selection Methods [#7-combining-css-selection-methods]
    • <h2> 8. Conclusion [#8-conclusion]
Count: 18 | Errors: 0 | URL: /core/content-selection/
  • <h1> Advanced Adaptive Strategies [#advanced-adaptive-strategies]
    • <h2> Overview [#overview]
    • <h2> The Three-Layer Scoring System [#the-three-layer-scoring-system]
      • <h3> 1. Coverage Score [#1-coverage-score]
      • <h3> 2. Consistency Score [#2-consistency-score]
      • <h3> 3. Saturation Score [#3-saturation-score]
    • <h2> Link Ranking Algorithm [#link-ranking-algorithm]
      • <h3> Expected Information Gain [#expected-information-gain]
    • <h2> Domain-Specific Configurations [#domain-specific-configurations]
      • <h3> Technical Documentation [#technical-documentation]
      • <h3> News & Articles [#news-articles]
      • <h3> E-commerce [#e-commerce]
      • <h3> Research & Academic [#research-academic]
    • <h2> Performance Optimization [#performance-optimization]
      • <h3> Memory Management [#memory-management]
      • <h3> Parallel Processing [#parallel-processing]
    • <h2> Debugging & Analysis [#debugging-analysis]
      • <h3> Enable Verbose Logging [#enable-verbose-logging]
      • <h3> Analyze Crawl Patterns [#analyze-crawl-patterns]
      • <h3> Export for Analysis [#export-for-analysis]
    • <h2> Custom Strategies [#custom-strategies]
      • <h3> Implementing a Custom Strategy [#implementing-a-custom-strategy]
      • <h3> Combining Strategies [#combining-strategies]
    • <h2> Best Practices [#best-practices]
      • <h3> 1. Start Conservative [#1-start-conservative]
      • <h3> 2. Monitor Resource Usage [#2-monitor-resource-usage]
      • <h3> 3. Use Domain Knowledge [#3-use-domain-knowledge]
      • <h3> 4. Validate Results [#4-validate-results]
    • <h2> Next Steps [#next-steps]
Count: 29 | Errors: 0 | URL: /advanced/adaptive-strategies/
  • <h1> Crawl4AI Blog [#crawl4ai-blog]
    • <h2> Featured Articles [#featured-articles]
      • <h3> When to Stop Crawling: The Art of Knowing "Enough" [#when-to-stop-crawling-the-art-of-knowing-enough]
      • <h3> The LLM Context Protocol: Why Your AI Assistant Needs Memory, Reasoning, and Examples [#the-llm-context-protocol-why-your-ai-assistant-needs-memory-reasoning-and-examples]
    • <h2> Latest Release [#latest-release]
      • <h3> Crawl4AI v0.8.0 – Crash Recovery & Prefetch Mode [#crawl4ai-v080-crash-recovery-prefetch-mode]
    • <h2> Recent Releases [#recent-releases]
      • <h3> Crawl4AI v0.7.8 – Stability & Bug Fix Release [#crawl4ai-v078-stability-bug-fix-release]
      • <h3> Crawl4AI v0.7.7 – The Self-Hosting & Monitoring Update [#crawl4ai-v077-the-self-hosting-monitoring-update]
      • <h3> Crawl4AI v0.7.6 – The Webhook Infrastructure Update [#crawl4ai-v076-the-webhook-infrastructure-update]
    • <h2> Older Releases [#older-releases]
    • <h2> Project History [#project-history]
    • <h2> Stay Updated [#stay-updated]
Count: 13 | Errors: 0 | URL: /blog/
  • <h1> arun() Parameter Guide (New Approach) [#arun-parameter-guide-new-approach]
    • <h2> 1. Core Usage [#1-core-usage]
    • <h2> 2. Cache Control [#2-cache-control]
    • <h2> 3. Content Processing & Selection [#3-content-processing-selection]
      • <h3> 3.1 Text Processing [#31-text-processing]
      • <h3> 3.2 Content Selection [#32-content-selection]
      • <h3> 3.3 Link Handling [#33-link-handling]
      • <h3> 3.4 Media Filtering [#34-media-filtering]
    • <h2> 4. Page Navigation & Timing [#4-page-navigation-timing]
      • <h3> 4.1 Basic Browser Flow [#41-basic-browser-flow]
      • <h3> 4.2 JavaScript Execution [#42-javascript-execution]
      • <h3> 4.3 Anti-Bot [#43-anti-bot]
    • <h2> 5. Session Management [#5-session-management]
    • <h2> 6. Screenshot, PDF & Media Options [#6-screenshot-pdf-media-options]
    • <h2> 7. Extraction Strategy [#7-extraction-strategy]
    • <h2> 8. Comprehensive Example [#8-comprehensive-example]
    • <h2> 9. Best Practices [#9-best-practices]
    • <h2> 10. Conclusion [#10-conclusion]
Count: 18 | Errors: 0 | URL: /api/arun/
  • <h1> Extraction & Chunking Strategies API [#extraction-chunking-strategies-api]
    • <h2> Extraction Strategies [#extraction-strategies]
      • <h3> LLMExtractionStrategy [#llmextractionstrategy]
      • <h3> RegexExtractionStrategy [#regexextractionstrategy]
      • <h3> CosineStrategy [#cosinestrategy]
      • <h3> JsonCssExtractionStrategy [#jsoncssextractionstrategy]
    • <h2> Chunking Strategies [#chunking-strategies]
      • <h3> RegexChunking [#regexchunking]
      • <h3> SlidingWindowChunking [#slidingwindowchunking]
      • <h3> OverlappingWindowChunking [#overlappingwindowchunking]
    • <h2> Usage Examples [#usage-examples]
      • <h3> LLM Extraction [#llm-extraction]
      • <h3> Regex Extraction [#regex-extraction]
      • <h3> CSS Extraction [#css-extraction]
      • <h3> Content Chunking [#content-chunking]
    • <h2> Best Practices [#best-practices]
Count: 16 | Errors: 0 | URL: /api/strategies/
  • <h1> Hooks & Auth in AsyncWebCrawler [#hooks-auth-in-asyncwebcrawler]
    • <h2> Example: Using Hooks in AsyncWebCrawler [#example-using-hooks-in-asyncwebcrawler]
    • <h2> Hook Lifecycle Summary [#hook-lifecycle-summary]
    • <h2> When to Handle Authentication [#when-to-handle-authentication]
    • <h2> Additional Considerations [#additional-considerations]
    • <h2> Conclusion [#conclusion]
Count: 6 | Errors: 0 | URL: /advanced/hooks-auth/
  • <h1> Overview of Some Important Advanced Features [#overview-of-some-important-advanced-features]
    • <h2> 1. Proxy Usage [#1-proxy-usage]
    • <h2> 2. Capturing PDFs & Screenshots [#2-capturing-pdfs-screenshots]
    • <h2> 3. Handling SSL Certificates [#3-handling-ssl-certificates]
    • <h2> 4. Custom Headers [#4-custom-headers]
    • <h2> 5. Session Persistence & Local Storage [#5-session-persistence-local-storage]
      • <h3> 5.1 storage_state [#51-storage_state]
      • <h3> 5.2 Exporting & Reusing State [#52-exporting-reusing-state]
    • <h2> 6. Robots.txt Compliance [#6-robotstxt-compliance]
    • <h2> Putting It All Together [#putting-it-all-together]
    • <h2> 7. Anti-Bot Features (Stealth Mode & Undetected Browser) [#7-anti-bot-features-stealth-mode-undetected-browser]
      • <h3> 7.1 Stealth Mode [#71-stealth-mode]
      • <h3> 7.2 Undetected Browser [#72-undetected-browser]
      • <h3> 7.3 Combining Both [#73-combining-both]
      • <h3> Choosing the Right Approach [#choosing-the-right-approach]
    • <h2> Conclusion & Next Steps [#conclusion-next-steps]
Count: 16 | Errors: 0 | URL: /advanced/advanced-features/
  • <h1> C4A-Script: Visual Web Automation Made Simple [#c4a-script-visual-web-automation-made-simple]
    • <h2> What is C4A-Script? [#what-is-c4a-script]
      • <h3> Why C4A-Script? [#why-c4a-script]
    • <h2> Getting Started: Your First Script [#getting-started-your-first-script]
    • <h2> Interactive Tutorial & Live Demo [#interactive-tutorial-live-demo]
      • <h3> Running the Tutorial Locally [#running-the-tutorial-locally]
    • <h2> Core Concepts [#core-concepts]
      • <h3> Commands and Syntax [#commands-and-syntax]
      • <h3> Selectors: Finding Elements [#selectors-finding-elements]
      • <h3> Variables and Dynamic Content [#variables-and-dynamic-content]
    • <h2> Command Categories [#command-categories]
      • <h3> 🧭 Navigation Commands [#navigation-commands]
      • <h3> ⏱️ Wait Commands [#wait-commands]
      • <h3> 🖱️ Mouse Commands [#mouse-commands]
      • <h3> ⌨️ Keyboard Commands [#keyboard-commands]
      • <h3> 🔀 Control Flow [#control-flow]
      • <h3> 💾 Variables & Advanced [#variables-advanced]
    • <h2> Real-World Examples [#real-world-examples]
      • <h3> Example 1: Login Flow [#example-1-login-flow]
      • <h3> Example 2: E-commerce Shopping [#example-2-e-commerce-shopping]
      • <h3> Example 3: Form Automation with Conditions [#example-3-form-automation-with-conditions]
    • <h2> Visual Programming with Blockly [#visual-programming-with-blockly]
      • <h3> Features: [#features]
    • <h2> Advanced Features [#advanced-features]
      • <h3> Recording Mode [#recording-mode]
      • <h3> Error Handling and Debugging [#error-handling-and-debugging]
      • <h3> Integration with Crawl4AI [#integration-with-crawl4ai]
    • <h2> Best Practices [#best-practices]
      • <h3> 1. Always Wait for Elements [#1-always-wait-for-elements]
      • <h3> 2. Use Descriptive Comments [#2-use-descriptive-comments]
      • <h3> 3. Handle Variable Conditions [#3-handle-variable-conditions]
      • <h3> 4. Use Variables for Reusability [#4-use-variables-for-reusability]
    • <h2> Getting Help [#getting-help]
    • <h2> What's Next? [#whats-next]
Count: 34 | Errors: 0 | URL: /core/c4a-script/
  • <h1> Installation & Setup (2023 Edition) [#installation-setup-2023-edition]
    • <h2> 1. Basic Installation [#1-basic-installation]
    • <h2> 2. Initial Setup & Diagnostics [#2-initial-setup-diagnostics]
      • <h3> 2.1 Run the Setup Command [#21-run-the-setup-command]
      • <h3> 2.2 Diagnostics [#22-diagnostics]
    • <h2> 3. Verifying Installation: A Simple Crawl (Skip this step if you already run crawl4ai-doctor) [#3-verifying-installation-a-simple-crawl-skip-this-step-if-you-already-run-crawl4ai-doctor]
    • <h2> 4. Advanced Installation (Optional) [#4-advanced-installation-optional]
      • <h3> 4.1 Torch, Transformers, or All [#41-torch-transformers-or-all]
    • <h2> 5. Docker (Experimental) [#5-docker-experimental]
    • <h2> 6. Local Server Mode (Legacy) [#6-local-server-mode-legacy]
    • <h2> Summary [#summary]
11  0  /core/installation/
  • <h1> CrawlResult Reference [#crawlresult-reference]
    • <h2> 1. Basic Crawl Info [#1-basic-crawl-info]
      • <h3> 1.1 url (str) [#11-url-str]
      • <h3> 1.2 success (bool) [#12-success-bool]
      • <h3> 1.3 status_code (Optional[int]) [#13-status_code-optionalint]
      • <h3> 1.4 redirected_status_code (Optional[int]) [#14-redirected_status_code-optionalint]
      • <h3> 1.5 error_message (Optional[str]) [#15-error_message-optionalstr]
      • <h3> 1.5 session_id (Optional[str]) [#15-session_id-optionalstr]
      • <h3> 1.6 response_headers (Optional[dict]) [#16-response_headers-optionaldict]
      • <h3> 1.7 ssl_certificate (Optional[SSLCertificate]) [#17-ssl_certificate-optionalsslcertificate]
    • <h2> 2. Raw / Cleaned Content [#2-raw-cleaned-content]
      • <h3> 2.1 html (str) [#21-html-str]
      • <h3> 2.2 cleaned_html (Optional[str]) [#22-cleaned_html-optionalstr]
    • <h2> 3. Markdown Fields [#3-markdown-fields]
      • <h3> 3.1 The Markdown Generation Approach [#31-the-markdown-generation-approach]
      • <h3> 3.2 markdown (Optional[Union[str, MarkdownGenerationResult]]) [#32-markdown-optionalunionstr-markdowngenerationresult]
    • <h2> 4. Media & Links [#4-media-links]
      • <h3> 4.1 media (Dict[str, List[Dict]]) [#41-media-dictstr-listdict]
      • <h3> 4.2 links (Dict[str, List[Dict]]) [#42-links-dictstr-listdict]
    • <h2> 5. Additional Fields [#5-additional-fields]
      • <h3> 5.1 extracted_content (Optional[str]) [#51-extracted_content-optionalstr]
      • <h3> 5.2 downloaded_files (Optional[List[str]]) [#52-downloaded_files-optionalliststr]
      • <h3> 5.3 screenshot (Optional[str]) [#53-screenshot-optionalstr]
      • <h3> 5.4 pdf (Optional[bytes]) [#54-pdf-optionalbytes]
      • <h3> 5.5 mhtml (Optional[str]) [#55-mhtml-optionalstr]
      • <h3> 5.6 metadata (Optional[dict]) [#56-metadata-optionaldict]
    • <h2> 6. dispatch_result (optional) [#6-dispatch_result-optional]
    • <h2> 7. Network Requests & Console Messages [#7-network-requests-console-messages]
      • <h3> 7.1 network_requests (Optional[List[Dict[str, Any]]]) [#71-network_requests-optionallistdictstr-any]
      • <h3> 7.2 console_messages (Optional[List[Dict[str, Any]]]) [#72-console_messages-optionallistdictstr-any]
    • <h2> 8. Example: Accessing Everything [#8-example-accessing-everything]
    • <h2> 9. Key Points & Future [#9-key-points-future]
32  0  /api/crawl-result/
  • <h1> Proxy & Security [#proxy-security]
    • <h2> Understanding Proxy Configuration [#understanding-proxy-configuration]
    • <h2> Basic Proxy Setup [#basic-proxy-setup]
    • <h2> Supported Proxy Formats [#supported-proxy-formats]
    • <h2> Authenticated Proxies [#authenticated-proxies]
    • <h2> Environment Variable Configuration [#environment-variable-configuration]
    • <h2> Rotating Proxies [#rotating-proxies]
      • <h3> Proxy Rotation (recommended) [#proxy-rotation-recommended]
    • <h2> SSL Certificate Analysis [#ssl-certificate-analysis]
      • <h3> Per-Request SSL Certificate Analysis [#per-request-ssl-certificate-analysis]
    • <h2> Security Best Practices [#security-best-practices]
      • <h3> 1. Proxy Rotation for Anonymity [#1-proxy-rotation-for-anonymity]
      • <h3> 2. SSL Certificate Verification [#2-ssl-certificate-verification]
      • <h3> 3. Environment Variable Security [#3-environment-variable-security]
      • <h3> 4. SOCKS5 for Enhanced Security [#4-socks5-for-enhanced-security]
    • <h2> Migration from Deprecated proxy Parameter [#migration-from-deprecated-proxy-parameter]
      • <h3> Safe Logging of Proxies [#safe-logging-of-proxies]
    • <h2> Troubleshooting [#troubleshooting]
      • <h3> Common Issues [#common-issues]
    • <h2> See Also [#see-also]
20  0  /advanced/proxy-security/
  • <h1> Browser, Crawler & LLM Configuration (Quick Overview) [#browser-crawler-llm-configuration-quick-overview]
    • <h2> 1. BrowserConfig Essentials [#1-browserconfig-essentials]
      • <h3> Key Fields to Note [#key-fields-to-note]
      • <h3> Helper Methods [#helper-methods]
      • <h3> Class-Level Defaults [#class-level-defaults]
    • <h2> 2. CrawlerRunConfig Essentials [#2-crawlerrunconfig-essentials]
      • <h3> Key Fields to Note [#key-fields-to-note_1]
      • <h3> Helper Methods [#helper-methods_1]
    • <h2> 3. LLMConfig Essentials [#3-llmconfig-essentials]
      • <h3> Key fields to note [#key-fields-to-note_2]
    • <h2> 4. Putting It All Together [#4-putting-it-all-together]
    • <h2> 5. Next Steps [#5-next-steps]
    • <h2> 6. Conclusion [#6-conclusion]
13  0  /core/browser-crawler-config/
  • <h1> Crawl4AI CLI Guide [#crawl4ai-cli-guide]
    • <h2> Table of Contents [#table-of-contents]
    • <h2> Installation [#installation]
    • <h2> Basic Usage [#basic-usage]
    • <h2> Quick Example of Advanced Usage [#quick-example-of-advanced-usage]
    • <h2> Configuration [#configuration]
      • <h3> Browser Configuration [#browser-configuration]
      • <h3> Crawler Configuration [#crawler-configuration]
      • <h3> Extraction Configuration [#extraction-configuration]
    • <h2> Advanced Features [#advanced-features]
      • <h3> LLM Q&A [#llm-qa]
      • <h3> Structured Data Extraction [#structured-data-extraction]
      • <h3> Content Filtering [#content-filtering]
    • <h2> Output Formats [#output-formats]
    • <h2> Complete Examples [#complete-examples]
    • <h2> Best Practices & Tips [#best-practices-tips]
    • <h2> Recap [#recap]
17  0  /core/cli/
  • <h1> Session Management [#session-management]
    • <h2> Example 1: Basic Session-Based Crawling [#example-1-basic-session-based-crawling]
    • <h2> Advanced Technique 1: Custom Execution Hooks [#advanced-technique-1-custom-execution-hooks]
    • <h2> Advanced Technique 2: Integrated JavaScript Execution and Waiting [#advanced-technique-2-integrated-javascript-execution-and-waiting]
4  0  /advanced/session-management/
  • <h1> arun_many(...) Reference [#arun_many-reference]
    • <h2> Function Signature [#function-signature]
    • <h2> Differences from arun() [#differences-from-arun]
      • <h3> Basic Example (Batch Mode) [#basic-example-batch-mode]
      • <h3> Streaming Example [#streaming-example]
      • <h3> With a Custom Dispatcher [#with-a-custom-dispatcher]
      • <h3> URL-Specific Configurations [#url-specific-configurations]
      • <h3> Return Value [#return-value]
    • <h2> Dispatcher Reference [#dispatcher-reference]
    • <h2> Common Pitfalls [#common-pitfalls]
    • <h2> Conclusion [#conclusion]
11  0  /api/arun_many/
  • <h1> Advanced Multi-URL Crawling with Dispatchers [#advanced-multi-url-crawling-with-dispatchers]
    • <h2> 1. Introduction [#1-introduction]
    • <h2> 2. Core Components [#2-core-components]
      • <h3> 2.1 Rate Limiter [#21-rate-limiter]
      • <h3> 2.2 Crawler Monitor [#22-crawler-monitor]
    • <h2> 3. Available Dispatchers [#3-available-dispatchers]
      • <h3> 3.1 MemoryAdaptiveDispatcher (Default) [#31-memoryadaptivedispatcher-default]
      • <h3> 3.2 SemaphoreDispatcher [#32-semaphoredispatcher]
    • <h2> 4. Usage Examples [#4-usage-examples]
      • <h3> 4.1 Batch Processing (Default) [#41-batch-processing-default]
      • <h3> 4.2 Streaming Mode [#42-streaming-mode]
      • <h3> 4.3 Semaphore-based Crawling [#43-semaphore-based-crawling]
      • <h3> 4.4 Robots.txt Consideration [#44-robotstxt-consideration]
    • <h2> 5. Dispatch Results [#5-dispatch-results]
    • <h2> 6. URL-Specific Configurations [#6-url-specific-configurations]
      • <h3> 6.1 Basic URL Pattern Matching [#61-basic-url-pattern-matching]
      • <h3> 6.2 Advanced Pattern Matching [#62-advanced-pattern-matching]
      • <h3> 6.3 Practical Example: News Site Crawler [#63-practical-example-news-site-crawler]
      • <h3> 6.4 Best Practices [#64-best-practices]
    • <h2> 7. Summary [#7-summary]
20  0  /advanced/multi-url-crawling/
  • <h1> Crawl Dispatcher [#crawl-dispatcher]
1  0  /advanced/crawl-dispatcher/
  • <h1> Extracting JSON (LLM) [#extracting-json-llm]
    • <h2> 1. Why Use an LLM? [#1-why-use-an-llm]
    • <h2> 2. Provider-Agnostic via LiteLLM [#2-provider-agnostic-via-litellm]
    • <h2> 3. How LLM Extraction Works [#3-how-llm-extraction-works]
      • <h3> 3.1 Flow [#31-flow]
      • <h3> 3.2 extraction_type [#32-extraction_type]
    • <h2> 4. Key Parameters [#4-key-parameters]
    • <h2> 5. Putting It in CrawlerRunConfig [#5-putting-it-in-crawlerrunconfig]
    • <h2> 6. Chunking Details [#6-chunking-details]
      • <h3> 6.1 chunk_token_threshold [#61-chunk_token_threshold]
      • <h3> 6.2 overlap_rate [#62-overlap_rate]
      • <h3> 6.3 Performance & Parallelism [#63-performance-parallelism]
    • <h2> 7. Input Format [#7-input-format]
    • <h2> 8. Token Usage & Show Usage [#8-token-usage-show-usage]
    • <h2> 9. Example: Building a Knowledge Graph [#9-example-building-a-knowledge-graph]
    • <h2> 10. Best Practices & Caveats [#10-best-practices-caveats]
    • <h2> 11. Conclusion [#11-conclusion]
17  0  /extraction/llm-strategies/
    0  0  /core/ask-ai/
    • <h1> Crawl4AI LLM Context Builder
      • <h2> 🧠 A New Approach to LLM Context
        • <h3> 💡 The Solution: Multi-Dimensional, Modular Contexts
      • <h2> Select Components & Context Types
      • <h2> Available Context Files
    5  0  /apps/llmtxt/
    • <h1> URL Seeding: The Smart Way to Crawl at Scale [#url-seeding-the-smart-way-to-crawl-at-scale]
      • <h2> Why URL Seeding? [#why-url-seeding]
        • <h3> Deep Crawling: Real-Time Discovery [#deep-crawling-real-time-discovery]
        • <h3> URL Seeding: Bulk Discovery [#url-seeding-bulk-discovery]
        • <h3> The Trade-offs [#the-trade-offs]
        • <h3> When to Use Each [#when-to-use-each]
      • <h2> Your First URL Seeding Adventure [#your-first-url-seeding-adventure]
      • <h2> Understanding the URL Seeder [#understanding-the-url-seeder]
        • <h3> Basic Usage [#basic-usage]
        • <h3> Configuration Magic: SeedingConfig [#configuration-magic-seedingconfig]
        • <h3> URL Validation: Live Checking [#url-validation-live-checking]
        • <h3> The Power of Metadata: Head Extraction [#the-power-of-metadata-head-extraction]
        • <h3> Smart URL-Based Filtering (No Head Extraction) [#smart-url-based-filtering-no-head-extraction]
        • <h3> Understanding Results [#understanding-results]
      • <h2> Smart Filtering with BM25 Scoring [#smart-filtering-with-bm25-scoring]
        • <h3> Introduction to Relevance Scoring [#introduction-to-relevance-scoring]
        • <h3> Query-Based Discovery [#query-based-discovery]
        • <h3> Real Examples [#real-examples]
      • <h2> Scaling Up: Multiple Domains [#scaling-up-multiple-domains]
        • <h3> The many_urls Method [#the-many_urls-method]
        • <h3> Cross-Domain Examples [#cross-domain-examples]
      • <h2> Advanced Integration Patterns [#advanced-integration-patterns]
        • <h3> Building a Research Assistant [#building-a-research-assistant]
        • <h3> Performance Optimization Tips [#performance-optimization-tips]
      • <h2> Best Practices & Tips [#best-practices-tips]
        • <h3> Cache Management [#cache-management]
        • <h3> Pattern Matching Strategies [#pattern-matching-strategies]
        • <h3> Rate Limiting Considerations [#rate-limiting-considerations]
      • <h2> Quick Reference [#quick-reference]
        • <h3> Common Patterns [#common-patterns]
        • <h3> Troubleshooting Guide [#troubleshooting-guide]
        • <h3> Performance Benchmarks [#performance-benchmarks]
      • <h2> Conclusion [#conclusion]
        • <h3> Smart URL Filtering [#smart-url-filtering]
        • <h3> Key Features Summary [#key-features-summary]
    35  0  /core/url-seeding/
    • <h1> Fit Markdown with Pruning & BM25 [#fit-markdown-with-pruning-bm25]
      • <h2> 1. How “Fit Markdown” Works [#1-how-fit-markdown-works]
        • <h3> 1.1 The content_filter [#11-the-content_filter]
        • <h3> 1.2 Common Filters [#12-common-filters]
      • <h2> 2. PruningContentFilter [#2-pruningcontentfilter]
        • <h3> 2.1 Usage Example [#21-usage-example]
        • <h3> 2.2 Key Parameters [#22-key-parameters]
      • <h2> 3. BM25ContentFilter [#3-bm25contentfilter]
        • <h3> 3.1 Usage Example [#31-usage-example]
        • <h3> 3.2 Parameters [#32-parameters]
      • <h2> 4. Accessing the “Fit” Output [#4-accessing-the-fit-output]
      • <h2> 5. Code Patterns Recap [#5-code-patterns-recap]
        • <h3> 5.1 Pruning [#51-pruning]
        • <h3> 5.2 BM25 [#52-bm25]
      • <h2> 6. Combining with “word_count_threshold” & Exclusions [#6-combining-with-word_count_threshold-exclusions]
      • <h2> 7. Custom Filters [#7-custom-filters]
      • <h2> 8. Final Thoughts [#8-final-thoughts]
    17  0  /core/fit-markdown/
    • <h1> Virtual Scroll [#virtual-scroll]
      • <h2> Understanding Virtual Scroll [#understanding-virtual-scroll]
        • <h3> The Problem [#the-problem]
        • <h3> Three Scrolling Scenarios [#three-scrolling-scenarios]
      • <h2> Basic Usage [#basic-usage]
      • <h2> Configuration Parameters [#configuration-parameters]
        • <h3> VirtualScrollConfig [#virtualscrollconfig]
        • <h3> Scroll By Options [#scroll-by-options]
      • <h2> Real-World Examples [#real-world-examples]
        • <h3> Twitter-like Timeline [#twitter-like-timeline]
        • <h3> Instagram Grid [#instagram-grid]
        • <h3> Mixed Content (News Feed) [#mixed-content-news-feed]
      • <h2> Virtual Scroll vs scan_full_page [#virtual-scroll-vs-scan_full_page]
        • <h3> When to Use Which? [#when-to-use-which]
      • <h2> Combining with Extraction [#combining-with-extraction]
      • <h2> Performance Tips [#performance-tips]
      • <h2> How It Works Internally [#how-it-works-internally]
      • <h2> Error Handling [#error-handling]
      • <h2> Complete Example [#complete-example]
    19  0  /advanced/virtual-scroll/
    • <h1> SSLCertificate Reference [#sslcertificate-reference]
      • <h2> 1. Overview [#1-overview]
        • <h3> Typical Use Case [#typical-use-case]
      • <h2> 2. Construction & Fetching [#2-construction-fetching]
        • <h3> 2.1 from_url(url, timeout=10) [#21-from_urlurl-timeout10]
        • <h3> 2.2 from_file(file_path) [#22-from_filefile_path]
        • <h3> 2.3 from_binary(binary_data) [#23-from_binarybinary_data]
      • <h2> 3. Common Properties [#3-common-properties]
      • <h2> 4. Export Methods [#4-export-methods]
        • <h3> 4.1 to_json(filepath=None) → Optional[str] [#41-to_jsonfilepathnone-optionalstr]
        • <h3> 4.2 to_pem(filepath=None) → Optional[str] [#42-to_pemfilepathnone-optionalstr]
        • <h3> 4.3 to_der(filepath=None) → Optional[bytes] [#43-to_derfilepathnone-optionalbytes]
        • <h3> 4.4 (Optional) export_as_text() [#44-optional-export_as_text]
      • <h2> 5. Example Usage in Crawl4AI [#5-example-usage-in-crawl4ai]
      • <h2> 6. Notes & Best Practices [#6-notes-best-practices]
        • <h3> Summary [#summary]
    16  0  /advanced/ssl-certificate/
    • <h1> AsyncWebCrawler [#asyncwebcrawler]
      • <h2> 1. Constructor Overview [#1-constructor-overview]
      • <h2> 2. Lifecycle: Start/Close or Context Manager [#2-lifecycle-startclose-or-context-manager]
        • <h3> 2.1 Context Manager (Recommended) [#21-context-manager-recommended]
        • <h3> 2.2 Manual Start & Close [#22-manual-start-close]
      • <h2> 3. Primary Method: arun() [#3-primary-method-arun]
        • <h3> 3.1 New Approach [#31-new-approach]
        • <h3> 3.2 Legacy Parameters Still Accepted [#32-legacy-parameters-still-accepted]
      • <h2> 4. Batch Processing: arun_many() [#4-batch-processing-arun_many]
        • <h3> 4.1 Resource-Aware Crawling [#41-resource-aware-crawling]
        • <h3> 4.2 Example Usage [#42-example-usage]
      • <h2> 7. Best Practices & Migration Notes [#7-best-practices-migration-notes]
      • <h2> 8. Summary [#8-summary]
    13  0  /api/async-webcrawler/
    • <h1> Adaptive Web Crawling [#adaptive-web-crawling]
      • <h2> Introduction [#introduction]
      • <h2> Key Concepts [#key-concepts]
        • <h3> The Problem It Solves [#the-problem-it-solves]
        • <h3> How It Works [#how-it-works]
      • <h2> Quick Start [#quick-start]
        • <h3> Basic Usage [#basic-usage]
        • <h3> Configuration Options [#configuration-options]
      • <h2> Crawling Strategies [#crawling-strategies]
        • <h3> Statistical Strategy (Default) [#statistical-strategy-default]
        • <h3> Embedding Strategy [#embedding-strategy]
        • <h3> Strategy Comparison [#strategy-comparison]
        • <h3> Embedding Strategy Configuration [#embedding-strategy-configuration]
        • <h3> Handling Irrelevant Queries [#handling-irrelevant-queries]
      • <h2> When to Use Adaptive Crawling [#when-to-use-adaptive-crawling]
        • <h3> Perfect For: [#perfect-for]
        • <h3> Not Recommended For: [#not-recommended-for]
      • <h2> Understanding the Output [#understanding-the-output]
        • <h3> Confidence Score [#confidence-score]
        • <h3> Statistics Display [#statistics-display]
      • <h2> Persistence and Resumption [#persistence-and-resumption]
        • <h3> Saving Progress [#saving-progress]
        • <h3> Resuming a Crawl [#resuming-a-crawl]
        • <h3> Exporting Knowledge Base [#exporting-knowledge-base]
      • <h2> Best Practices [#best-practices]
        • <h3> 1. Query Formulation [#1-query-formulation]
        • <h3> 2. Threshold Tuning [#2-threshold-tuning]
        • <h3> 3. Performance Optimization [#3-performance-optimization]
        • <h3> 4. Link Selection [#4-link-selection]
      • <h2> Examples [#examples]
        • <h3> Research Assistant [#research-assistant]
        • <h3> Knowledge Base Builder [#knowledge-base-builder]
        • <h3> API Documentation Crawler [#api-documentation-crawler]
      • <h2> Next Steps [#next-steps]
      • <h2> FAQ [#faq]
    35  0  /core/adaptive-crawling/
    • <h1> C4A-Script API Reference [#c4a-script-api-reference]
      • <h2> Command Categories [#command-categories]
        • <h3> 🧭 Navigation Commands [#navigation-commands]
        • <h3> ⏱️ Wait Commands [#wait-commands]
        • <h3> 🖱️ Mouse Commands [#mouse-commands]
        • <h3> ⌨️ Keyboard Commands [#keyboard-commands]
        • <h3> 🔀 Control Flow Commands [#control-flow-commands]
        • <h3> 💾 Variables and Data [#variables-and-data]
        • <h3> 📝 Comments and Documentation [#comments-and-documentation]
        • <h3> 🔧 Procedures (Advanced) [#procedures-advanced]
      • <h2> Error Handling Best Practices [#error-handling-best-practices]
        • <h3> 1. Always Use Waits [#1-always-use-waits]
        • <h3> 2. Handle Optional Elements [#2-handle-optional-elements]
        • <h3> 3. Use Descriptive Variables [#3-use-descriptive-variables]
        • <h3> 4. Add Debugging Information [#4-add-debugging-information]
      • <h2> Common Patterns [#common-patterns]
        • <h3> Login Flow [#login-flow]
        • <h3> Infinite Scroll [#infinite-scroll]
        • <h3> Form Validation [#form-validation]
        • <h3> Multi-step Process [#multi-step-process]
      • <h2> Integration with Crawl4AI [#integration-with-crawl4ai]
    21  0  /api/c4a-script-reference/
    • <h1> Anti-Bot Detection & Fallback [#anti-bot-detection-fallback]
      • <h2> How Detection Works [#how-detection-works]
      • <h2> Configuration Options [#configuration-options]
      • <h2> Escalation Chain [#escalation-chain]
      • <h2> Crawl Stats [#crawl-stats]
      • <h2> Usage Examples [#usage-examples]
        • <h3> Simple Retry (No Proxy) [#simple-retry-no-proxy]
        • <h3> Single Proxy [#single-proxy]
        • <h3> Direct-First, Then Proxies [#direct-first-then-proxies]
        • <h3> Proxy List (Escalation) [#proxy-list-escalation]
        • <h3> Fallback Fetch Function [#fallback-fetch-function]
        • <h3> Full Escalation (All Features Combined) [#full-escalation-all-features-combined]
      • <h2> Tips [#tips]
      • <h2> See Also [#see-also]
    14  0  /advanced/anti-bot-and-fallback/
    • <h1> Link & Media [#link-media]
      • <h2> 1. Link Extraction [#1-link-extraction]
        • <h3> 1.1 result.links [#11-resultlinks]
      • <h2> 2. Advanced Link Head Extraction & Scoring [#2-advanced-link-head-extraction-scoring]
        • <h3> 2.1 Why Link Head Extraction? [#21-why-link-head-extraction]
        • <h3> 2.2 Complete Working Example [#22-complete-working-example]
        • <h3> 2.3 Configuration Deep Dive [#23-configuration-deep-dive]
        • <h3> 2.4 Understanding the Three Score Types [#24-understanding-the-three-score-types]
        • <h3> 2.5 Practical Use Cases [#25-practical-use-cases]
        • <h3> 2.6 Performance Tips [#26-performance-tips]
        • <h3> 2.7 Troubleshooting [#27-troubleshooting]
      • <h2> 3. Domain Filtering [#3-domain-filtering]
        • <h3> 3.1 Example: Excluding External & Social Media Links [#31-example-excluding-external-social-media-links]
        • <h3> 3.2 Example: Excluding Specific Domains [#32-example-excluding-specific-domains]
      • <h2> 4. Media Extraction [#4-media-extraction]
        • <h3> 4.1 Accessing result.media [#41-accessing-resultmedia]
        • <h3> 4.2 Excluding External Images [#42-excluding-external-images]
        • <h3> 4.3 Additional Media Config [#43-additional-media-config]
      • <h2> 5. Putting It All Together: Link & Media Filtering [#5-putting-it-all-together-link-media-filtering]
      • <h2> 6. Common Pitfalls & Tips [#6-common-pitfalls-tips]
    20  0  /core/link-media/
    • <h1> Contributing to Crawl4AI [#contributing-to-crawl4ai]
      • <h2> Core Branches [#core-branches]
      • <h2> Contributor Workflow [#contributor-workflow]
      • <h2> Lead Maintainer's Workflow (For Reference) [#lead-maintainers-workflow-for-reference]
      • <h2> Release Process (High-Level Overview) [#release-process-high-level-overview]
      • <h2> Benefits of This Approach [#benefits-of-this-approach]
      • <h2> Checklist for Contributors [#checklist-for-contributors]
      • <h2> Common Issues [#common-issues]
      • <h2> Communication [#communication]
    9  0  /CONTRIBUTING/
    • <h1> Undetected Browser Mode [#undetected-browser-mode]
      • <h2> Overview [#overview]
      • <h2> Anti-Bot Features Comparison [#anti-bot-features-comparison]
      • <h2> When to Use Each Approach [#when-to-use-each-approach]
        • <h3> Use Regular Browser + Stealth Mode When: [#use-regular-browser-stealth-mode-when]
        • <h3> Use Undetected Browser When: [#use-undetected-browser-when]
        • <h3> Best Practice: Progressive Enhancement [#best-practice-progressive-enhancement]
      • <h2> Stealth Mode [#stealth-mode]
        • <h3> What Stealth Mode Does: [#what-stealth-mode-does]
      • <h2> Undetected Browser Mode [#undetected-browser-mode_1]
        • <h3> Key Features [#key-features]
        • <h3> Quick Start [#quick-start]
      • <h2> Combining Both Features [#combining-both-features]
      • <h2> Examples [#examples]
        • <h3> Example 1: Basic Stealth Mode [#example-1-basic-stealth-mode]
        • <h3> Example 2: Undetected Browser Mode [#example-2-undetected-browser-mode]
      • <h2> Browser Adapter Pattern [#browser-adapter-pattern]
      • <h2> Best Practices [#best-practices]
      • <h2> Advanced Usage Tips [#advanced-usage-tips]
        • <h3> Progressive Detection Handling [#progressive-detection-handling]
      • <h2> Installation [#installation]
      • <h2> Limitations [#limitations]
      • <h2> Troubleshooting [#troubleshooting]
        • <h3> Browser Not Found [#browser-not-found]
        • <h3> Detection Still Occurring [#detection-still-occurring]
        • <h3> Performance Issues [#performance-issues]
      • <h2> Future Plans [#future-plans]
      • <h2> Conclusion [#conclusion]
      • <h2> See Also [#see-also]
    29  0  /advanced/undetected-browser/
    • <h1> Preserve Your Identity with Crawl4AI [#preserve-your-identity-with-crawl4ai]
      • <h2> 1. Managed Browsers: Your Digital Identity Solution [#1-managed-browsers-your-digital-identity-solution]
        • <h3> Key Benefits [#key-benefits]
        • <h3> Creating a User Data Directory (Command-Line Approach via Playwright) [#creating-a-user-data-directory-command-line-approach-via-playwright]
        • <h3> Creating a Profile Using the Crawl4AI CLI (Easiest) [#creating-a-profile-using-the-crawl4ai-cli-easiest]
      • <h2> 3. Using Managed Browsers in Crawl4AI [#3-using-managed-browsers-in-crawl4ai]
        • <h3> Workflow [#workflow]
      • <h2> 4. Magic Mode: Simplified Automation [#4-magic-mode-simplified-automation]
      • <h2> 5. Comparing Managed Browsers vs. Magic Mode [#5-comparing-managed-browsers-vs-magic-mode]
      • <h2> 6. Using the BrowserProfiler Class [#6-using-the-browserprofiler-class]
        • <h3> Creating and Managing Profiles with BrowserProfiler [#creating-and-managing-profiles-with-browserprofiler]
        • <h3> Interactive Profile Management [#interactive-profile-management]
        • <h3> Legacy Methods [#legacy-methods]
        • <h3> Complete Example [#complete-example]
      • <h2> 7. Locale, Timezone, and Geolocation Control [#7-locale-timezone-and-geolocation-control]
        • <h3> Setting Locale and Timezone [#setting-locale-and-timezone]
        • <h3> Configuring Geolocation [#configuring-geolocation]
        • <h3> Combining with Managed Browsers [#combining-with-managed-browsers]
      • <h2> 8. Summary [#8-summary]
    19  0  /advanced/identity-based-crawling/
    • <h1> Code Examples [#code-examples]
      • <h2> Getting Started Examples [#getting-started-examples]
      • <h2> Proxies [#proxies]
      • <h2> Browser & Crawling Features [#browser-crawling-features]
      • <h2> Advanced Crawling & Deep Crawling [#advanced-crawling-deep-crawling]
      • <h2> Extraction Strategies [#extraction-strategies]
      • <h2> E-commerce & Specialized Crawling [#e-commerce-specialized-crawling]
      • <h2> Anti-Bot & Stealth Features [#anti-bot-stealth-features]
      • <h2> Customization & Security [#customization-security]
      • <h2> Docker & Deployment [#docker-deployment]
      • <h2> Application Examples [#application-examples]
      • <h2> Content Generation & Markdown [#content-generation-markdown]
      • <h2> Running the Examples [#running-the-examples]
      • <h2> Contributing New Examples [#contributing-new-examples]
    14  0  /core/examples/
    • <h1> 🚀 Crawl4AI Interactive Apps [#crawl4ai-interactive-apps]
      • <h2> 🛠️ Interactive Tools for Modern Web Scraping
      • <h2> 🎯 Available Apps [#available-apps]
        • <h3> 🎨 C4A-Script Interactive Editor
        • <h3> 🧠 LLM Context Builder
        • <h3> 🕸️ Web Scraping Playground
        • <h3> 🔍 Crawl4AI Assistant (Chrome Extension)
        • <h3> 🧪 Extraction Lab
        • <h3> 🤖 AI Prompt Designer
        • <h3> 📊 Crawl Monitor
      • <h2> 🚀 Why Use These Apps? [#why-use-these-apps]
        • <h3> 🎯 Accelerate Learning [#accelerate-learning]
        • <h3> 💡 Reduce Development Time [#reduce-development-time]
        • <h3> 🔍 Improve Quality [#improve-quality]
        • <h3> 🤝 Community Driven [#community-driven]
      • <h2> 📢 Stay Updated [#stay-updated]
    16  0  /apps/
    • <h1> Growth
      • <h2> PyPI Monthly Downloads
      • <h2> GitHub Star Growth
      • <h2> Cumulative PyPI Downloads
      • <h2> Daily Download Trend
      • <h2> GitHub Traffic (14 days)
    6  0  /stats/
    • <h1> PDF Processing Strategies [#pdf-processing-strategies]
      • <h2> PDFCrawlerStrategy [#pdfcrawlerstrategy]
        • <h3> Overview [#overview]
        • <h3> When to Use [#when-to-use]
        • <h3> Key Methods and Their Behavior [#key-methods-and-their-behavior]
        • <h3> Example Usage [#example-usage]
        • <h3> Pros and Cons [#pros-and-cons]
      • <h2> PDFContentScrapingStrategy [#pdfcontentscrapingstrategy]
        • <h3> Overview [#overview_1]
        • <h3> When to Use [#when-to-use_1]
        • <h3> Key Configuration Attributes [#key-configuration-attributes]
        • <h3> Key Methods and Their Behavior [#key-methods-and-their-behavior_1]
        • <h3> Example Usage [#example-usage_1]
        • <h3> Pros and Cons [#pros-and-cons_1]
    14  0  /advanced/pdf-parsing/
    • <h1> Crawl4AI Cache System and Migration Guide [#crawl4ai-cache-system-and-migration-guide]
      • <h2> Overview [#overview]
      • <h2> Old vs New Approach [#old-vs-new-approach]
        • <h3> Old Way (Deprecated) [#old-way-deprecated]
        • <h3> New Way (Recommended) [#new-way-recommended]
      • <h2> Migration Example [#migration-example]
        • <h3> Old Code (Deprecated) [#old-code-deprecated]
        • <h3> New Code (Recommended) [#new-code-recommended]
      • <h2> Common Migration Patterns [#common-migration-patterns]
    9  0  /core/cache-modes/
    • <h1> Getting Started with Crawl4AI [#getting-started-with-crawl4ai]
      • <h2> 1. Introduction [#1-introduction]
      • <h2> 2. Your First Crawl [#2-your-first-crawl]
      • <h2> 3. Basic Configuration (Light Introduction) [#3-basic-configuration-light-introduction]
      • <h2> 4. Generating Markdown Output [#4-generating-markdown-output]
        • <h3> Example: Using a Filter with DefaultMarkdownGenerator [#example-using-a-filter-with-defaultmarkdowngenerator]
      • <h2> 5. Simple Data Extraction (CSS-based) [#5-simple-data-extraction-css-based]
      • <h2> 6. Simple Data Extraction (LLM-based) [#6-simple-data-extraction-llm-based]
      • <h2> 7. Adaptive Crawling (New!) [#7-adaptive-crawling-new]
      • <h2> 8. Multi-URL Concurrency (Preview) [#8-multi-url-concurrency-preview]
      • <h2> 8. Dynamic Content Example [#8-dynamic-content-example]
      • <h2> 9. Next Steps [#9-next-steps]
    12  0  /core/quickstart/
    • <h1> The LLM Context Protocol: Why Your AI Assistant Needs Memory, Reasoning, and Examples [#the-llm-context-protocol-why-your-ai-assistant-needs-memory-reasoning-and-examples]
      • <h2> The Problem with Teaching Robots to Code [#the-problem-with-teaching-robots-to-code]
      • <h2> Enter the Three-Dimensional Context Protocol [#enter-the-three-dimensional-context-protocol]
        • <h3> The Three Pillars of Library Wisdom [#the-three-pillars-of-library-wisdom]
      • <h2> Why This Matters (Especially for Smaller LLMs) [#why-this-matters-especially-for-smaller-llms]
      • <h2> The Cultural DNA of Your Library [#the-cultural-dna-of-your-library]
      • <h2> Beyond Manual Documentation [#beyond-manual-documentation]
      • <h2> The Protocol, Not the Prescription [#the-protocol-not-the-prescription]
      • <h2> Try It Yourself [#try-it-yourself]
      • <h2> A Final Thought [#a-final-thought]
    10  0  /blog/articles/llm-context-revolution/
    • <h1> Adaptive Crawling: Building Dynamic Knowledge That Grows on Demand [#adaptive-crawling-building-dynamic-knowledge-that-grows-on-demand]
      • <h2> The Knowledge Capacitor [#the-knowledge-capacitor]
      • <h2> Why I Built This [#why-i-built-this]
      • <h2> The Information Theory Foundation [#the-information-theory-foundation]
      • <h2> The A* of Web Crawling [#the-a-of-web-crawling]
      • <h2> The Three Pillars of Intelligence [#the-three-pillars-of-intelligence]
        • <h3> 1. Coverage: The Breadth Sensor [#1-coverage-the-breadth-sensor]
        • <h3> 2. Consistency: The Coherence Detector [#2-consistency-the-coherence-detector]
        • <h3> 3. Saturation: The Efficiency Guardian [#3-saturation-the-efficiency-guardian]
      • <h2> Real Impact: Time, Money, and Sanity [#real-impact-time-money-and-sanity]
        • <h3> Building a Customer Support Knowledge Base [#building-a-customer-support-knowledge-base]
      • <h2> The Dynamic Growth Pattern [#the-dynamic-growth-pattern]
      • <h2> Why "Adaptive"? [#why-adaptive]
      • <h2> The Progressive Roadmap [#the-progressive-roadmap]
        • <h3> Phase 1 (Current): Statistical Foundation [#phase-1-current-statistical-foundation]
        • <h3> Phase 2 (Now Available): Embedding Enhancement [#phase-2-now-available-embedding-enhancement]
        • <h3> Phase 3 (Future): LLM Integration [#phase-3-future-llm-integration]
      • <h2> The Efficiency Revolution [#the-efficiency-revolution]
      • <h2> Missing the Forest for the Trees [#missing-the-forest-for-the-trees]
      • <h2> Your Knowledge, On Demand [#your-knowledge-on-demand]
      • <h2> The Competitive Edge [#the-competitive-edge]
      • <h2> The Embedding Evolution (Now Available!) [#the-embedding-evolution-now-available]
        • <h3> Real-World Comparison [#real-world-comparison]
        • <h3> Detecting Irrelevance [#detecting-irrelevance]
      • <h2> Try It Yourself [#try-it-yourself]
      • <h2> A Personal Note [#a-personal-note]
      • <h2> The Future is Adaptive [#the-future-is-adaptive]
    27 | 0 | /blog/articles/adaptive-crawling-revolution/
    • <h1> AdaptiveCrawler [#adaptivecrawler]
      • <h2> Constructor [#constructor]
        • <h3> Parameters [#parameters]
      • <h2> Primary Method [#primary-method]
        • <h3> digest() [#digest]
      • <h2> Properties [#properties]
        • <h3> confidence [#confidence]
        • <h3> coverage_stats [#coverage_stats]
        • <h3> is_sufficient [#is_sufficient]
        • <h3> state [#state]
      • <h2> Methods [#methods]
        • <h3> get_relevant_content() [#get_relevant_content]
        • <h3> print_stats() [#print_stats]
        • <h3> export_knowledge_base() [#export_knowledge_base]
        • <h3> import_knowledge_base() [#import_knowledge_base]
      • <h2> Configuration [#configuration]
        • <h3> Example with Custom Config [#example-with-custom-config]
      • <h2> Complete Example [#complete-example]
      • <h2> See Also [#see-also]
    19 | 0 | /api/adaptive-crawler/
      0 | 0 | /core/llmtxt/
      • <h1> digest() [#digest]
        • <h2> Method Signature [#method-signature]
        • <h2> Parameters [#parameters]
          • <h3> start_url [#start_url]
          • <h3> query [#query]
          • <h3> resume_from [#resume_from]
        • <h2> Return Value [#return-value]
        • <h2> How It Works [#how-it-works]
        • <h2> Examples [#examples]
          • <h3> Basic Usage [#basic-usage]
          • <h3> With Configuration [#with-configuration]
          • <h3> Resuming a Previous Crawl [#resuming-a-previous-crawl]
          • <h3> With Progress Monitoring [#with-progress-monitoring]
        • <h2> Query Best Practices [#query-best-practices]
        • <h2> Performance Considerations [#performance-considerations]
        • <h2> Error Handling [#error-handling]
        • <h2> Stopping Conditions [#stopping-conditions]
        • <h2> See Also [#see-also]
      18 | 0 | /api/digest/

      Skipped URLs Summary

      Found 14 row(s).
      Reason | Domain | Unique URLs
      Not allowed host | github.com | 14
      Not allowed host | discord.gg | 2
      Not allowed host | x.com | 2
      Not allowed host | www.nstproxy.com | 1
      Not allowed host | docs.litellm.ai | 1
      Not allowed host | pypi.org | 1
      Not allowed host | trendshift.io | 1
      Not allowed host | twitter.com | 1
      Not allowed host | www.capsolver.com | 1
      Not allowed host | pepy.tech | 1
      Not allowed host | forms.gle | 1
      Not allowed host | www.linkedin.com | 1
      Not allowed host | badge.fury.io | 1
      Not allowed host | developer.mozilla.org | 1

      Skipped URLs

      Found 29 row(s).
      Reason | Skipped URL | Source | Found at URL
      Not allowed host | https://badge.fury.io/py/crawl4ai | <a href> | /
      Not allowed host | https://developer.mozilla.org/en-US/docs/Web/API/Web_components/Using_shadow_DOM | <a href> | /core/content-selection/
      Not allowed host | https://discord.gg/crawl4ai | <a href> | /core/self-hosting/
      Not allowed host | https://discord.gg/jP8KfhDhyN | <a href> | /
      Not allowed host | https://docs.litellm.ai/docs/providers | <a href> | /core/cli/
      Not allowed host | https://forms.gle/E9MyPaNXACnAMaqG7 | <a href> | /
      Not allowed host | https://github.com/BerriAI/litellm | <a href> | /extraction/llm-strategies/
      Not allowed host | https://github.com/sponsors/unclecode | <a href> | /
      Not allowed host | https://github.com/unclecode | <a href> | /apps/crawl4ai-assistant/
      Not allowed host | https://github.com/unclecode/crawl4ai | <a href> | /
      Not allowed host | https://github.com/unclecode/crawl4ai/blob/main/LICENSE | <a href> | /
      Not allowed host | https://github.com/unclecode/crawl4ai/blob/main/docs/examples/adaptive_crawling/ | <a href> | /core/examples/
      Not allowed host | https://github.com/unclecode/crawl4ai/blob/main/docs/examples/c4a_script/ | <a href> | /core/c4a-script/
      Not allowed host | https://github.com/unclecode/crawl4ai/issues | <a href> | /core/self-hosting/
      Not allowed host | https://github.com/unclecode/crawl4ai/network/members | <a href> | /
      Not allowed host | https://github.com/unclecode/crawl4ai/stargazers | <a href> | /
      Not allowed host | https://github.com/unclecode/crawl4ai/tree/main/docs/examples/adaptive_crawling | <a href> | /core/adaptive-crawling/
      Not allowed host | https://github.com/unclecode/crawl4ai/tree/main/docs/examples/capsolver_captcha_solver/ | <a href> | /core/examples/
      Not allowed host | https://github.com/unclecode/crawl4ai/tree/main/docs/examples/proxy | <a href> | /core/examples/
      Not allowed host | https://github.com/unclecode/crawl4ai/tree/main/docs/examples/undetectability/ | <a href> | /core/examples/
      Not allowed host | https://pepy.tech/project/crawl4ai | <a href> | /
      Not allowed host | https://pypi.org/project/crawl4ai/ | <a href> | /
      Not allowed host | https://trendshift.io/repositories/11716 | <a href> | /
      Not allowed host | https://twitter.com/unclecode | <a href> | /blog/
      Not allowed host | https://www.capsolver.com/?utm_source=crawl4ai&utm_medium=github_pr…tm_campaign=crawl4ai_integration | <a href> | /core/examples/
      Not allowed host | https://www.linkedin.com/company/crawl4ai | <a href> | /
      Not allowed host | https://www.nstproxy.com/?utm_source=crawl4ai | <a href> | /core/examples/
      Not allowed host | https://x.com/crawl4ai | <a href> | /
      Not allowed host | https://x.com/unclecode | <a href> | /blog/articles/adaptive-crawling-revolution/

      External URLs

      29 external URL(s)
      Found 29 row(s).
      External URL | Pages | Found on URL (max 5)
      https://badge.fury.io/py/crawl4ai | 1 | /
      https://developer.mozilla.org/en-US/docs/Web/API/Web_components/Using_shadow_DOM | 1 | /core/content-selection/
      https://discord.gg/crawl4ai | 1 | /core/self-hosting/
      https://discord.gg/jP8KfhDhyN | 1 | /
      https://docs.litellm.ai/docs/providers | 1 | /core/cli/
      https://forms.gle/E9MyPaNXACnAMaqG7 | 1 | /
      https://github.com/BerriAI/litellm | 1 | /extraction/llm-strategies/
      https://github.com/sponsors/unclecode | 1 | /
      https://github.com/unclecode | 1 | /apps/crawl4ai-assistant/
      https://github.com/unclecode/crawl4ai | 1 | /
      https://github.com/unclecode/crawl4ai/blob/main/LICENSE | 1 | /
      https://github.com/unclecode/crawl4ai/blob/main/docs/examples/adaptive_crawling/ | 1 | /core/examples/
      https://github.com/unclecode/crawl4ai/blob/main/docs/examples/c4a_script/ | 1 | /core/c4a-script/
      https://github.com/unclecode/crawl4ai/issues | 1 | /core/self-hosting/
      https://github.com/unclecode/crawl4ai/network/members | 1 | /
      https://github.com/unclecode/crawl4ai/stargazers | 1 | /
      https://github.com/unclecode/crawl4ai/tree/main/docs/examples/adaptive_crawling | 1 | /core/adaptive-crawling/
      https://github.com/unclecode/crawl4ai/tree/main/docs/examples/capsolver_captcha_solver/ | 1 | /core/examples/
      https://github.com/unclecode/crawl4ai/tree/main/docs/examples/proxy | 1 | /core/examples/
      https://github.com/unclecode/crawl4ai/tree/main/docs/examples/undetectability/ | 1 | /core/examples/
      https://pepy.tech/project/crawl4ai | 1 | /
      https://pypi.org/project/crawl4ai/ | 1 | /
      https://trendshift.io/repositories/11716 | 1 | /
      https://twitter.com/unclecode | 1 | /blog/
      https://www.capsolver.com/?utm_source=crawl4ai&utm_medium=github_pr…tm_campaign=crawl4ai_integration | 1 | /core/examples/
      https://www.linkedin.com/company/crawl4ai | 1 | /
      https://www.nstproxy.com/?utm_source=crawl4ai | 1 | /core/examples/
      https://x.com/crawl4ai | 1 | /
      https://x.com/unclecode | 1 | /blog/articles/adaptive-crawling-revolution/

      TOP fastest URLs

      Found 20 row(s).
      Time | Status | Fast URL
      178 ms | 200 | /marketplace/admin/
      178 ms | 200 | /apps/c4a-script/
      178 ms | 200 | /apps/llmtxt/
      178 ms | 200 | /marketplace/
      179 ms | 200 | /core/llmtxt/
      179 ms | 200 | /core/ask-ai/
      179 ms | 200 | /advanced/crawl-dispatcher/
      179 ms | 200 | /advanced/lazy-loading/
      179 ms | 200 | /blog/
      179 ms | 200 | /core/examples/
      179 ms | 200 | /core/installation/
      179 ms | 200 | /advanced/file-downloading/
      179 ms | 200 | /core/cache-modes/
      179 ms | 200 | /CONTRIBUTING/
      179 ms | 200 | /apps/
      179 ms | 200 | /api/arun_many/
      179 ms | 200 | /api/arun/
      179 ms | 200 | /advanced/ssl-certificate/
      179 ms | 200 | /stats/
      179 ms | 200 | /blog/articles/adaptive-crawling-revolution/

      TOP slowest URLs

      Found 20 row(s).

      Content types

      Content type | URLs | Total size | Total time | Avg time | Status 20x | Status 30x | Status 40x
      HTML | 69 | 5 MB | 12 s | 184 ms | 63 | 0 | 6
      Redirect | 2 | 338 B | 357 ms | 178 ms | 0 | 2 | 0

      Content types (MIME types)

      Content type | URLs | Total size | Total time | Avg time | Status 20x | Status 30x | Status 40x
      text/html | 71 | 5 MB | 13 s | 184 ms | 63 | 2 | 6

      Source domains

      Domain | Totals | HTML | Redirect
      docs.crawl4ai.com | 71 / 5MB / 13s | 69 / 5MB / 12s | 2 / 338B / 357ms

      HTTP headers

      Header | Occurs | Unique | Values preview | Min value | Max value
      Content-Length | 2 | - | [ignored generic values] | 178 B | 178 B
      Content-Type | 71 | 1 | text/html
      Date | 71 | - | [ignored generic values] | 2026-03-24 | 2026-03-24
      Etag | 69 | - | [ignored generic values]
      Last-Modified | 63 | - | [ignored generic values] | 2026-02-24 | 2026-02-24
      Location | 2 | 2 | /blog/articles/llm-context-revolution/ (1) / /api/parameters/ (1)
      Server | 71 | 1 | nginx/1.24.0 (Ubuntu)

      HTTP header values

      Header | Occurs | Value
      Content-Type | 71 | text/html
      Location | 1 | /blog/articles/llm-context-revolution/
      Location | 1 | /api/parameters/
      Server | 71 | nginx/1.24.0 (Ubuntu)

      HTTP Caching by content type (only from crawlable domains)

      Content type | Cache type | URLs | AVG lifetime | MIN lifetime | MAX lifetime
      HTML | ETag + Last-Modified | 63 | - | - | -
      HTML | ETag | 6 | - | - | -
      Redirect | No cache headers | 2 | - | - | -

      HTTP Caching by domain

      Domain | Cache type | URLs | AVG lifetime | MIN lifetime | MAX lifetime
      docs.crawl4ai.com | ETag + Last-Modified | 63 | - | - | -
      docs.crawl4ai.com | ETag | 6 | - | - | -
      docs.crawl4ai.com | No cache headers | 2 | - | - | -

      HTTP Caching by domain and content type

      Domain | Content type | Cache type | URLs | AVG lifetime | MIN lifetime | MAX lifetime
      docs.crawl4ai.com | HTML | ETag + Last-Modified | 63 | - | - | -
      docs.crawl4ai.com | HTML | ETag | 6 | - | - | -
      docs.crawl4ai.com | Redirect | No cache headers | 2 | - | - | -

      DNS info

      DNS resolving tree
      docs.crawl4ai.com
        IPv4: 35.163.245.47
      DNS server: 127.0.0.53

      SSL/TLS info

      Info | Text
      Issuer | C = US, O = Let's Encrypt, CN = E8
      Subject | CN = crawl4ai.com
      Valid from | Mar 10 12:36:46 2026 GMT (VALID already 14.1 day(s))
      Valid to | Jun  8 12:36:45 2026 GMT (VALID still for 75.9 day(s))
      Supported protocols | TLSv1.2, TLSv1.3
      RAW certificate output
      Certificate:
          Data:
              Version: 3 (0x2)
              Serial Number:
                  06:f2:74:ad:9d:05:b0:68:46:4f:8f:8e:89:00:12:c0:d5:84
              Signature Algorithm: ecdsa-with-SHA384
              Issuer: C = US, O = Let's Encrypt, CN = E8
              Validity
                  Not Before: Mar 10 12:36:46 2026 GMT
                  Not After : Jun  8 12:36:45 2026 GMT
              Subject: CN = crawl4ai.com
              Subject Public Key Info:
                  Public Key Algorithm: id-ecPublicKey
                      Public-Key: (256 bit)
                      pub:
                          04:2a:97:9d:3a:5c:1f:b9:a3:ae:ff:53:f5:37:dc:
                          d7:8a:32:51:8a:0f:b1:19:47:b2:54:47:fd:0c:77:
                          9d:a0:f1:73:e5:73:f2:67:16:55:ce:7f:bb:6d:64:
                          8e:61:f3:7a:2c:c4:85:25:08:56:a5:82:ed:c0:c3:
                          96:9f:28:76:0b
                      ASN1 OID: prime256v1
                      NIST CURVE: P-256
              X509v3 extensions:
                  X509v3 Key Usage: critical
                      Digital Signature
                  X509v3 Extended Key Usage: 
                      TLS Web Server Authentication
                  X509v3 Basic Constraints: critical
                      CA:FALSE
                  X509v3 Subject Key Identifier: 
                      C2:72:71:3C:DD:F0:5B:A2:8B:21:56:F0:CA:8E:1B:74:B4:78:FB:7C
                  X509v3 Authority Key Identifier: 
                      8F:0D:13:A2:F6:2E:7E:D1:50:6C:33:18:38:5D:59:8E:23:72:91:CA
                  Authority Information Access: 
                      CA Issuers - URI:http://e8.i.lencr.org/
                  X509v3 Subject Alternative Name: 
                      DNS:*.crawl4ai.com, DNS:crawl4ai.com
                  X509v3 Certificate Policies: 
                      Policy: 2.23.140.1.2.1
                  X509v3 CRL Distribution Points: 
                      Full Name:
                        URI:http://e8.c.lencr.org/5.crl
                  CT Precertificate SCTs: 
                      Signed Certificate Timestamp:
                          Version   : v1 (0x0)
                          Log ID    : 64:11:C4:6C:A4:12:EC:A7:89:1C:A2:02:2E:00:BC:AB:
                                      4F:28:07:D4:1E:35:27:AB:EA:FE:D5:03:C9:7D:CD:F0
                          Timestamp : Mar 10 13:35:17.025 2026 GMT
                          Extensions: none
                          Signature : ecdsa-with-SHA256
                                      30:46:02:21:00:FC:AC:1A:0A:17:DB:44:E0:BD:24:2E:
                                      B1:B3:3E:71:DC:B8:D2:08:54:2A:1A:55:23:9A:44:3E:
                                      10:4F:2E:E4:30:02:21:00:97:CC:45:A7:B7:91:00:B1:
                                      61:A8:5F:EE:D6:B8:E9:F2:1D:2B:A2:2A:EB:03:B7:9C:
                                      62:6E:BF:EA:35:57:BE:77
                      Signed Certificate Timestamp:
                          Version   : v1 (0x0)
                          Log ID    : E3:23:8D:F2:8D:A2:88:E0:AA:E0:AC:F0:FA:90:C9:85:
                                      F0:B6:BF:F5:D2:A5:27:B0:01:FC:1C:44:58:C4:B6:E8
                          Timestamp : Mar 10 13:35:17.523 2026 GMT
                          Extensions: 00:00:05:00:35:22:1A:1F
                          Signature : ecdsa-with-SHA256
                                      30:44:02:20:6C:64:1B:8E:B0:AE:C1:92:21:D3:22:72:
                                      B3:90:37:D5:4F:72:FE:3B:B4:46:28:C6:D3:8F:AE:11:
                                      71:67:CD:B9:02:20:74:99:7C:D2:21:58:57:29:5B:5B:
                                      D4:9E:CB:82:3C:49:6E:BB:61:13:2A:56:70:9C:16:75:
                                      3B:33:D0:5D:DB:47
          Signature Algorithm: ecdsa-with-SHA384
          Signature Value:
              30:66:02:31:00:9b:60:6e:88:9b:f4:21:38:8b:54:a5:a3:52:
              11:50:14:53:c6:de:4d:fd:66:e0:34:94:1c:c9:b8:bf:1d:a6:
              4c:9f:6c:c5:7e:8d:c6:c9:f2:30:3c:b0:16:d4:7a:a7:63:02:
              31:00:f0:99:3e:7d:45:44:d0:92:e5:a2:3c:20:b3:c7:24:f1:
              f1:18:42:1f:29:27:a1:28:54:9b:44:7a:03:83:20:e5:7e:00:
              e5:da:14:ef:d9:2a:f7:86:1c:78:b9:92:b6:83
      RAW protocols output
      === ssl2 ===
      s_client: Unknown option: -ssl2
      s_client: Use -help for summary.

      === ssl3 ===
      s_client: Unknown option: -ssl3
      s_client: Use -help for summary.

      === tls1 ===
      40770DD795760000:error:0A0000BF:SSL routines:tls_setup_handshake:no protocols available:../ssl/statem/statem_lib.c:104:
      CONNECTED(00000003)
      ---
      no peer certificate available
      ---
      No client certificate CA names sent
      ---
      SSL handshake has read 0 bytes and written 7 bytes
      Verification: OK
      ---
      New, (NONE), Cipher is (NONE)
      Secure Renegotiation IS NOT supported
      Compression: NONE
      Expansion: NONE
      No ALPN negotiated
      Early data was not sent
      Verify return code: 0 (ok)
      ---

      === tls1_1 ===
      408781F13D7F0000:error:0A0000BF:SSL routines:tls_setup_handshake:no protocols available:../ssl/statem/statem_lib.c:104:
      CONNECTED(00000003)
      ---
      no peer certificate available
      ---
      No client certificate CA names sent
      ---
      SSL handshake has read 0 bytes and written 7 bytes
      Verification: OK
      ---
      New, (NONE), Cipher is (NONE)
      Secure Renegotiation IS NOT supported
      Compression: NONE
      Expansion: NONE
      No ALPN negotiated
      Early data was not sent
      Verify return code: 0 (ok)
      ---

      === tls1_2 ===
      depth=2 C = US, O = Internet Security Research Group, CN = ISRG Root X1
      verify return:1
      depth=1 C = US, O = Let's Encrypt, CN = E8
      verify return:1
      depth=0 CN = crawl4ai.com
      verify return:1
      CONNECTED(00000003)
      ---
      Certificate chain
       0 s:CN = crawl4ai.com
         i:C = US, O = Let's Encrypt, CN = E8
         a:PKEY: id-ecPublicKey, 256 (bit); sigalg: ecdsa-with-SHA384
         v:NotBefore: Mar 10 12:36:46 2026 GMT; NotAfter: Jun  8 12:36:45 2026 GMT
       1 s:C = US, O = Let's Encrypt, CN = E8
         i:C = US, O = Internet Security Research Group, CN = ISRG Root X1
         a:PKEY: id-ecPublicKey, 384 (bit); sigalg: RSA-SHA256
         v:NotBefore: Mar 13 00:00:00 2024 GMT; NotAfter: Mar 12 23:59:59 2027 GMT
      ---
      Server certificate
      -----BEGIN CERTIFICATE-----
      MIIDkzCCAxigAwIBAgISBvJ0rZ0FsGhGT4+OiQASwNWEMAoGCCqGSM49BAMDMDIx
      CzAJBgNVBAYTAlVTMRYwFAYDVQQKEw1MZXQncyBFbmNyeXB0MQswCQYDVQQDEwJF
      ODAeFw0yNjAzMTAxMjM2NDZaFw0yNjA2MDgxMjM2NDVaMBcxFTATBgNVBAMTDGNy
      YXdsNGFpLmNvbTBZMBMGByqGSM49AgEGCCqGSM49AwEHA0IABCqXnTpcH7mjrv9T
      9Tfc14oyUYoPsRlHslRH/Qx3naDxc+Vz8mcWVc5/u21kjmHzeizEhSUIVqWC7cDD
      lp8odgujggInMIICIzAOBgNVHQ8BAf8EBAMCB4AwEwYDVR0lBAwwCgYIKwYBBQUH
      AwEwDAYDVR0TAQH/BAIwADAdBgNVHQ4EFgQUwnJxPN3wW6KLIVbwyo4bdLR4+3ww
      HwYDVR0jBBgwFoAUjw0TovYuftFQbDMYOF1ZjiNykcowMgYIKwYBBQUHAQEEJjAk
      MCIGCCsGAQUFBzAChhZodHRwOi8vZTguaS5sZW5jci5vcmcvMCcGA1UdEQQgMB6C
      DiouY3Jhd2w0YWkuY29tggxjcmF3bDRhaS5jb20wEwYDVR0gBAwwCjAIBgZngQwB
      AgEwLAYDVR0fBCUwIzAhoB+gHYYbaHR0cDovL2U4LmMubGVuY3Iub3JnLzUuY3Js
      MIIBDAYKKwYBBAHWeQIEAgSB/QSB+gD4AHcAZBHEbKQS7KeJHKICLgC8q08oB9Qe
      NSer6v7VA8l9zfAAAAGc1/WCIQAABAMASDBGAiEA/KwaChfbROC9JC6xsz5x3LjS
      CFQqGlUjmkQ+EE8u5DACIQCXzEWnt5EAsWGoX+7WuOnyHSuiKusDt5xibr/qNVe+
      dwB9AOMjjfKNoojgquCs8PqQyYXwtr/10qUnsAH8HERYxLboAAABnNf1hBMACAAA
      BQA1IhofBAMARjBEAiBsZBuOsK7BkiHTInKzkDfVT3L+O7RGKMbTj64RcWfNuQIg
      dJl80iFYVylbW9Sey4I8SW67YRMqVnCcFnU7M9Bd20cwCgYIKoZIzj0EAwMDaQAw
      ZgIxAJtgboib9CE4i1Slo1IRUBRTxt5N/WbgNJQcybi/HaZMn2zFfo3GyfIwPLAW
      1HqnYwIxAPCZPn1FRNCS5aI8ILPHJPHxGEIfKSehKFSbRHoDgyDlfgDl2hTv2Sr3
      hhx4uZK2gw==
      -----END CERTIFICATE-----
      subject=CN = crawl4ai.com
      issuer=C = US, O = Let's Encrypt, CN = E8
      ---
      No client certificate CA names sent
      Peer signing digest: SHA256
      Peer signature type: ECDSA
      Server Temp Key: X25519, 253 bits
      ---
      SSL handshake has read 2333 bytes and written 307 bytes
      Verification: OK
      ---
      New, TLSv1.2, Cipher is ECDHE-ECDSA-AES256-GCM-SHA384
      Server public key is 256 bit
      Secure Renegotiation IS supported
      Compression: NONE
      Expansion: NONE
      No ALPN negotiated
      SSL-Session:
          Protocol  : TLSv1.2
          Cipher    : ECDHE-ECDSA-AES256-GCM-SHA384
          Session-ID: DFE1C4DB229AFA8521FB5C4855F36780058A4A3B70715491B6994C0C1C85CDEB
          Session-ID-ctx: 
          Master-Key: 4EA712BE001A51648BCE5A833718F315518029390411DA10C44E7637F51F0E5C293AA6E74713EDF56624410DCF1B7C46
          PSK identity: None
          PSK identity hint: None
          SRP username: None
          Start Time: 1774362816
          Timeout   : 7200 (sec)
          Verify return code: 0 (ok)
          Extended master secret: yes
      ---
      DONE

      === tls1_3 ===
      depth=2 C = US, O = Internet Security Research Group, CN = ISRG Root X1
      verify return:1
      depth=1 C = US, O = Let's Encrypt, CN = E8
      verify return:1
      depth=0 CN = crawl4ai.com
      verify return:1
      CONNECTED(00000003)
      ---
      Certificate chain
       0 s:CN = crawl4ai.com
         i:C = US, O = Let's Encrypt, CN = E8
         a:PKEY: id-ecPublicKey, 256 (bit); sigalg: ecdsa-with-SHA384
         v:NotBefore: Mar 10 12:36:46 2026 GMT; NotAfter: Jun  8 12:36:45 2026 GMT
       1 s:C = US, O = Let's Encrypt, CN = E8
         i:C = US, O = Internet Security Research Group, CN = ISRG Root X1
         a:PKEY: id-ecPublicKey, 384 (bit); sigalg: RSA-SHA256
         v:NotBefore: Mar 13 00:00:00 2024 GMT; NotAfter: Mar 12 23:59:59 2027 GMT
      ---
      Server certificate
      -----BEGIN CERTIFICATE-----
      MIIDkzCCAxigAwIBAgISBvJ0rZ0FsGhGT4+OiQASwNWEMAoGCCqGSM49BAMDMDIx
      CzAJBgNVBAYTAlVTMRYwFAYDVQQKEw1MZXQncyBFbmNyeXB0MQswCQYDVQQDEwJF
      ODAeFw0yNjAzMTAxMjM2NDZaFw0yNjA2MDgxMjM2NDVaMBcxFTATBgNVBAMTDGNy
      YXdsNGFpLmNvbTBZMBMGByqGSM49AgEGCCqGSM49AwEHA0IABCqXnTpcH7mjrv9T
      9Tfc14oyUYoPsRlHslRH/Qx3naDxc+Vz8mcWVc5/u21kjmHzeizEhSUIVqWC7cDD
      lp8odgujggInMIICIzAOBgNVHQ8BAf8EBAMCB4AwEwYDVR0lBAwwCgYIKwYBBQUH
      AwEwDAYDVR0TAQH/BAIwADAdBgNVHQ4EFgQUwnJxPN3wW6KLIVbwyo4bdLR4+3ww
      HwYDVR0jBBgwFoAUjw0TovYuftFQbDMYOF1ZjiNykcowMgYIKwYBBQUHAQEEJjAk
      MCIGCCsGAQUFBzAChhZodHRwOi8vZTguaS5sZW5jci5vcmcvMCcGA1UdEQQgMB6C
      DiouY3Jhd2w0YWkuY29tggxjcmF3bDRhaS5jb20wEwYDVR0gBAwwCjAIBgZngQwB
      AgEwLAYDVR0fBCUwIzAhoB+gHYYbaHR0cDovL2U4LmMubGVuY3Iub3JnLzUuY3Js
      MIIBDAYKKwYBBAHWeQIEAgSB/QSB+gD4AHcAZBHEbKQS7KeJHKICLgC8q08oB9Qe
      NSer6v7VA8l9zfAAAAGc1/WCIQAABAMASDBGAiEA/KwaChfbROC9JC6xsz5x3LjS
      CFQqGlUjmkQ+EE8u5DACIQCXzEWnt5EAsWGoX+7WuOnyHSuiKusDt5xibr/qNVe+
      dwB9AOMjjfKNoojgquCs8PqQyYXwtr/10qUnsAH8HERYxLboAAABnNf1hBMACAAA
      BQA1IhofBAMARjBEAiBsZBuOsK7BkiHTInKzkDfVT3L+O7RGKMbTj64RcWfNuQIg
      dJl80iFYVylbW9Sey4I8SW67YRMqVnCcFnU7M9Bd20cwCgYIKoZIzj0EAwMDaQAw
      ZgIxAJtgboib9CE4i1Slo1IRUBRTxt5N/WbgNJQcybi/HaZMn2zFfo3GyfIwPLAW
      1HqnYwIxAPCZPn1FRNCS5aI8ILPHJPHxGEIfKSehKFSbRHoDgyDlfgDl2hTv2Sr3
      hhx4uZK2gw==
      -----END CERTIFICATE-----
      subject=CN = crawl4ai.com
      issuer=C = US, O = Let's Encrypt, CN = E8
      ---
      No client certificate CA names sent
      Peer signing digest: SHA256
      Peer signature type: ECDSA
      Server Temp Key: X25519, 253 bits
      ---
      SSL handshake has read 2413 bytes and written 331 bytes
      Verification: OK
      ---
      New, TLSv1.3, Cipher is TLS_AES_256_GCM_SHA384
      Server public key is 256 bit
      Secure Renegotiation IS NOT supported
      Compression: NONE
      Expansion: NONE
      No ALPN negotiated
      Early data was not sent
      Verify return code: 0 (ok)
      ---
      DONE
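
      The "VALID already 14.1 day(s)" and "VALID still for 75.9 day(s)" figures in the SSL/TLS info table can be re-derived from the certificate's Not Before / Not After timestamps and the crawl time. The sketch below is illustrative only: the helper names are hypothetical, and it assumes the "Executed At" timestamp (2026-03-24 14:33:25) is in UTC, which the report does not state explicitly.

```python
from datetime import datetime, timezone

# Layout of OpenSSL validity timestamps, e.g. "Jun  8 12:36:45 2026 GMT".
OPENSSL_FMT = "%b %d %H:%M:%S %Y GMT"


def parse_openssl_time(text: str) -> datetime:
    """Parse an OpenSSL validity timestamp into an aware UTC datetime."""
    # OpenSSL pads single-digit days with an extra space; collapse it first.
    normalized = " ".join(text.split())
    return datetime.strptime(normalized, OPENSSL_FMT).replace(tzinfo=timezone.utc)


def days_between(start: datetime, end: datetime) -> float:
    """Return the signed distance from start to end in fractional days."""
    return (end - start).total_seconds() / 86400


not_before = parse_openssl_time("Mar 10 12:36:46 2026 GMT")
not_after = parse_openssl_time("Jun  8 12:36:45 2026 GMT")

# Assumption: the report's "Executed At" value is expressed in UTC.
crawled_at = datetime(2026, 3, 24, 14, 33, 25, tzinfo=timezone.utc)

age_days = round(days_between(not_before, crawled_at), 1)
remaining_days = round(days_between(crawled_at, not_after), 1)

print(age_days, remaining_days)  # prints: 14.1 75.9
```

      The two values add up to roughly 90 days, the full validity window of this certificate.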

      Crawler stats

      Basic stats
      Total execution time | 12 s
      Total URLs | 71
      Total size | 5 MB
      Requests - total time | 13 s
      Requests - avg time | 185 ms
      Requests - min time | 179 ms
      Requests - max time | 360 ms
      Requests by status | 200: 63, 301: 2, 404: 6

      Analysis stats

      Found 21 row(s).
      Class::method | Exec time | Exec count
      SslTlsAnalyzer::getTLSandSSLCertificateInfo | 3.1 s | 1
      AccessibilityAnalyzer::checkMissingAriaLabels | 289 ms | 63
      AccessibilityAnalyzer::checkMissingLabels | 265 ms | 63
      BestPracticeAnalyzer::checkHeadingStructure | 233 ms | 69
      AccessibilityAnalyzer::checkMissingRoles | 225 ms | 63
      AccessibilityAnalyzer::checkMissingLang | 199 ms | 63
      BestPracticeAnalyzer::checkMaxDOMDepth | 194 ms | 69
      BestPracticeAnalyzer::checkNonClickablePhoneNumbers | 73 ms | 69
      BestPracticeAnalyzer::checkMissingQuotesOnAttributes | 15 ms | 69
      SeoAndOpenGraphAnalyzer::analyzeHeadings | 8 ms | 1
      SecurityAnalyzer::checkHtmlSecurity | 8 ms | 69
      AccessibilityAnalyzer::checkImageAltAttributes | 5 ms | 63
      BestPracticeAnalyzer::checkInlineSvg | 5 ms | 69
      SecurityAnalyzer::checkHeaders | 1 ms | 69
      SeoAndOpenGraphAnalyzer::analyzeSeo | 0 ms | 1
      BestPracticeAnalyzer::checkTitleUniqueness | 0 ms | 1
      BestPracticeAnalyzer::checkMetaDescriptionUniqueness | 0 ms | 1
      SeoAndOpenGraphAnalyzer::analyzeOpenGraph | 0 ms | 1
      BestPracticeAnalyzer::checkBrotliSupport | 0 ms | 1
      BestPracticeAnalyzer::checkWebpSupport | 0 ms | 1
      BestPracticeAnalyzer::checkAvifSupport | 0 ms | 1

      Content processor stats

      Found 12 row(s).
      Class::method | Exec time | Exec count
      HtmlProcessor::findUrls | 89 ms | 71
      NextJsProcessor::applyContentChangesBeforeUrlParsing | 36 ms | 69
      JavaScriptProcessor::findUrls | 34 ms | 69
      CssProcessor::findUrls | 3 ms | 69
      AstroProcessor::findUrls | 0 ms | 69
      AstroProcessor::applyContentChangesBeforeUrlParsing | 0 ms | 69
      NextJsProcessor::findUrls | 0 ms | 69
      JavaScriptProcessor::applyContentChangesBeforeUrlParsing | 0 ms | 69
      HtmlProcessor::applyContentChangesBeforeUrlParsing | 0 ms | 71
      SvelteProcessor::findUrls | 0 ms | 69
      SvelteProcessor::applyContentChangesBeforeUrlParsing | 0 ms | 69
      CssProcessor::applyContentChangesBeforeUrlParsing | 0 ms | 69

      Crawler info

      Version 2.1.0.20260317
      Executed At 2026-03-24 14:33:25
      Command siteone-crawler --url=https://docs.crawl4ai.com --markdown-export-dir=/tmp/siteone-crawl4ai --markdown-exclude-selector=header,footer,nav,.sidebar,.menu,.breadcrumb,script,style --timeout=30 --workers=5 --disable-javascript --disable-styles --disable-fonts --disable-images --disable-files --no-color --hide-progress-bar --output=text
      Hostname ubuntu-8gb-hel1-1
      User-Agent Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/26.0.0.0 Safari/537.36 siteone-crawler/2.1.0.20260317