About SiteVitals Bot
SiteVitals Bot is our automated web crawler, used to collect SEO and asset-related metrics for websites registered in our system. The bot identifies itself with the following User-Agent string:
Mozilla/5.0 (compatible; SiteVitals Bot/1.0; +https://www.sitevitals.co.uk/bot-info)
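If you want to recognize the crawler in your own application code or logs, matching on the "SiteVitals Bot" token is sufficient. A minimal sketch in PHP 8+ (the matching approach is a suggestion, not a requirement):

```php
<?php
// Illustrative server-side check for requests from SiteVitals Bot.
// Any substring match on the "SiteVitals Bot" token works.
$userAgent = $_SERVER['HTTP_USER_AGENT'] ?? '';

if (str_contains($userAgent, 'SiteVitals Bot')) {
    // e.g. tag the request in your analytics or logging
}
```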
Crawl Scope
Depending on the services enabled for your site, SiteVitals Bot may crawl the following content types:
- HTML pages, meta tags, headings (H1s, etc.)
- Open Graph and Twitter Card metadata
- Links (internal and external) and broken link detection
- Assets such as images, CSS, JavaScript, and PDFs
- Robots.txt files and meta robots directives
- Structured data (Schema.org JSON-LD or Microdata; see the extraction sketch below)
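As one concrete example from that list, structured data extraction amounts to reading JSON-LD script blocks out of the page. The sketch below is illustrative rather than our actual implementation, and assumes $html already holds a fetched page body:

```php
<?php
// Sketch: extracting Schema.org JSON-LD blocks from fetched HTML.
// $html is assumed to already contain the response body.
$dom = new DOMDocument();
@$dom->loadHTML($html); // @ silences warnings from real-world markup

$xpath = new DOMXPath($dom);
$nodes = $xpath->query('//script[@type="application/ld+json"]');

$structuredData = [];
foreach ($nodes as $node) {
    $decoded = json_decode($node->textContent, true);
    if ($decoded !== null) {
        $structuredData[] = $decoded; // keep only valid JSON blocks
    }
}
```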
Crawl Frequency and Delay
The frequency of crawls depends on your plan and the type of check:
- Critical checks like SEO analysis run once daily.
- Asset and broken link checks may issue several requests in quick succession, spaced by a crawl delay (see the sketch below).
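Even in-succession checks are paced rather than fired all at once. A rough sketch of that pattern (checkAsset() and the one-second fallback are hypothetical placeholders, not published internals):

```php
<?php
// Sketch: spacing successive asset requests by a crawl delay.
// checkAsset() is a hypothetical stand-in for the real request logic;
// $crawlDelay would come from a robots.txt Crawl-delay directive.
$crawlDelay = $crawlDelay ?? 1; // assumed conservative default, in seconds

foreach ($assetUrls as $url) {
    checkAsset($url);
    sleep($crawlDelay); // wait before the next request
}
```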
Respecting Robots.txt
SiteVitals Bot fully respects your robots.txt directives and related exclusion signals, parsed using the Spatie RobotsTxt library. This includes:
- Disallowed paths (Disallow directives)
- Crawl-delay directives
- Meta robots tags on individual pages (e.g., noindex)
If your site blocks certain pages via robots.txt, the bot will not crawl those pages, and your SiteVitals scans will respect these exclusions.
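For a sense of how such a check works, here is a minimal sketch using the Spatie package (the bot's exact wiring is not published, and example.com is a placeholder):

```php
<?php

use Spatie\Robots\RobotsTxt;

// Read and parse the site's robots.txt, then test a URL against it.
$robots = RobotsTxt::readFrom('https://example.com/robots.txt');

if ($robots->allows('https://example.com/some/page', 'SiteVitals Bot')) {
    // the page may be fetched and analyzed
}
```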
Data Collected
The bot collects metrics relevant to the type of scan being performed. This may include:
- Page titles and descriptions
- Headings and canonical links
- Asset availability and broken link status
- Structured data (Schema.org)
- HTTP/HTTPS redirect status
- Open Graph and Twitter card metadata
Sensitive user data (form inputs, passwords, etc.) is not collected by SiteVitals Bot.
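As an illustration of one item on that list, a redirect-status check can be as simple as a HEAD-style request. A sketch with PHP's cURL extension (not our exact implementation; example.com is a placeholder):

```php
<?php
// Sketch: checking a URL's HTTP status and any redirect target.
$ch = curl_init('https://example.com/old-page');
curl_setopt_array($ch, [
    CURLOPT_RETURNTRANSFER => true,
    CURLOPT_NOBODY         => true, // HEAD-style request: headers only
]);
curl_exec($ch);

$status   = curl_getinfo($ch, CURLINFO_RESPONSE_CODE); // e.g. 301
$location = curl_getinfo($ch, CURLINFO_REDIRECT_URL);  // redirect target, if any
curl_close($ch);
```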
Blocking or Contact
To block SiteVitals Bot from crawling your site, add a rule for it to your robots.txt file. For data removal requests or any other questions, please contact us at [email protected].
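For example, the following robots.txt rules ask the bot to skip your entire site (this assumes the conventional token match on "SiteVitals Bot" in the User-agent line):

```
User-agent: SiteVitals Bot
Disallow: /
```

To exclude only part of your site, replace Disallow: / with the specific paths you want kept out of scans.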
SiteVitals Bot is operated responsibly, respects crawl delays, and aims to minimize impact on your website’s performance.