Crawler Documentation

OctogenBot

Last updated: May 20, 2026

OctogenBot is the public crawler identity used by Octogen Systems, Inc. Octogen's customers are brands, retailers, and agentic-shopping applications. The crawler reads ecommerce product pages, and the platform restructures those products into a common enriched schema so that LLMs can more easily understand the crawled catalog.

User Agent

Requests signed with Octogen's Web Bot Auth identity use this user-agent string:

Mozilla/5.0 AppleWebKit/537.36 (KHTML, like Gecko); compatible; OctogenBot/1.0; +https://octogen.ai/bots

Unsigned crawler traffic may use ordinary browser user agents. The OctogenBot identity is reserved for configured Web Bot Auth traffic.

Bot Details

Bot name: OctogenBot
Operator: Octogen Systems, Inc.
Purpose: Product discovery, catalog interpretation, and AI shopping readiness analysis
Verification: Cloudflare Web Bot Auth using HTTP Message Signatures
Key directory: https://bots.octogen.ai/.well-known/http-message-signatures-directory
Contact: crawler@octogen.ai

Verification

Octogen uses Cloudflare Web Bot Auth for eligible crawls. Signed requests include HTTP Message Signature headers and a Signature-Agent header that points to Octogen's public key directory.

https://bots.octogen.ai/.well-known/http-message-signatures-directory

Site operators should prefer Web Bot Auth verification over static IP allowlists, since crawl egress can change as infrastructure changes.
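As a rough illustration of the header shape described above, the following Python sketch performs a cheap pre-check on an incoming request. It is not the verification itself: it only confirms that the user agent carries the OctogenBot token, that the signature headers are present, and that Signature-Agent names the host of the key directory listed in this document. The function name, the header dictionary shape, and the assumption that the Signature-Agent value contains an https URL (possibly quoted) are all illustrative. Cryptographic validation of the HTTP Message Signature against the published keys is still required and is not shown.

```python
import re
from urllib.parse import urlparse

# Host taken from the key directory URL published in this document.
OCTOGEN_KEY_DIRECTORY_HOST = "bots.octogen.ai"

# Headers a signed request is expected to carry (names compared in lowercase).
REQUIRED_HEADERS = ("signature", "signature-input", "signature-agent")


def looks_like_signed_octogenbot(headers: dict[str, str]) -> bool:
    """Hypothetical pre-check; it does NOT validate the signature itself."""
    # HTTP header names are case-insensitive, so normalize them first.
    normalized = {name.lower(): value for name, value in headers.items()}

    # The OctogenBot identity is reserved for Web Bot Auth traffic.
    if "OctogenBot/" not in normalized.get("user-agent", ""):
        return False

    # All three signature-related headers must be present.
    if any(name not in normalized for name in REQUIRED_HEADERS):
        return False

    # Assumption: the Signature-Agent value contains an https URL,
    # possibly wrapped in quotes. Compare its host exactly.
    urls = re.findall(r'https://[^\s";,]+', normalized["signature-agent"])
    return any(
        urlparse(url).hostname == OCTOGEN_KEY_DIRECTORY_HOST for url in urls
    )
```

A request that passes this check should still be handed to a proper Web Bot Auth verifier. A request that fails it can be treated as ordinary, unverified traffic.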

Crawler Purpose

OctogenBot reads publicly available ecommerce pages so Octogen can understand how AI shopping agents interpret product catalogs, identify missing product data, and help merchants make their catalogs easier for agents to understand. The crawler reads:

  • Public product page URLs and page metadata
  • Product names, descriptions, images, prices, availability, variants, and attributes
  • Public structured data such as Schema.org Product markup
  • Public sitemap and robots.txt signals used to scope crawl behavior

Boundaries

  • OctogenBot does not try to access authenticated, paywalled, or private content.
  • OctogenBot does not collect personal shopper account data from retailer sites.
  • OctogenBot does not use its signed bot identity unless the crawl is configured for Web Bot Auth.
  • OctogenBot is intended for commercial product-data workflows, not search-engine indexing.

Robots.txt

To block OctogenBot, add this rule to your robots.txt file:

User-agent: OctogenBot
Disallow: /

To limit crawling to public product pages, use path-specific rules:

User-agent: OctogenBot
Disallow: /account/
Disallow: /checkout/
Crawl-delay: 10
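Before deploying rules like these, it can help to check how a standard parser reads them. The sketch below uses Python's built-in urllib.robotparser to evaluate both rule sets from this section. The example.com URLs and the helper name parse_rules are placeholders, and the sketch shows how Python's parser interprets the rules, which is not necessarily identical to how OctogenBot or any other crawler applies them.

```python
from urllib.robotparser import RobotFileParser

# Rule set that blocks OctogenBot entirely.
BLOCK_ALL = """\
User-agent: OctogenBot
Disallow: /
"""

# Rule set that blocks account and checkout paths and requests a delay.
PATH_SPECIFIC = """\
User-agent: OctogenBot
Disallow: /account/
Disallow: /checkout/
Crawl-delay: 10
"""


def parse_rules(robots_txt: str) -> RobotFileParser:
    """Parse robots.txt text held in memory (no network fetch)."""
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    return parser


if __name__ == "__main__":
    agent = "OctogenBot"
    scoped = parse_rules(PATH_SPECIFIC)
    # Hypothetical merchant URLs, used only to exercise the rules.
    for url in (
        "https://example.com/products/blue-shirt",
        "https://example.com/account/orders",
        "https://example.com/checkout/cart",
    ):
        print(url, scoped.can_fetch(agent, url))
    print("crawl delay:", scoped.crawl_delay(agent))
```

Because the rules name OctogenBot specifically, other user agents are unaffected by them unless the file also contains a matching or wildcard group.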

Contact

For crawl questions, allowlist requests, robots.txt issues, or urgent operational concerns, email crawler@octogen.ai.