Cobaltmetrics Bot

CobaltmetricsBot is the bot that we send out to discover and index new and updated data on various websites.

How Cobaltmetrics crawls the web

There is a short list of websites that Cobaltmetrics indexes completely, because we extract all citations in their pages. This list currently includes all Wikimedia projects, all StackExchange websites, and Hypothesis. More information about how, and how often, we crawl these primary data sources is coming soon.

Apart from the primary sources, data collected by CobaltmetricsBot is limited to metadata from the <head> element of HTML pages. For these websites, we are only interested in URLs and other identifiers, typically in groups of URLs that locate the same document (see URI transmutation for more details). We do not process or store any information from the <body> element of HTML pages, and we do not access any resources other than HTML pages.
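To illustrate what "limited to the <head> element" can look like in practice, here is a minimal sketch in Python using only the standard library. It is not CobaltmetricsBot's actual code: the class name HeadMetadataParser, the choice of <link> and <meta> tags, and the example URLs and DOI are all assumptions made for this illustration.

```python
from html.parser import HTMLParser


class HeadMetadataParser(HTMLParser):
    """Hypothetical helper: collects <link> and <meta> data from <head> only."""

    def __init__(self):
        super().__init__()
        self.in_head = False
        self.done = False   # set once </head> or <body> is seen
        self.links = []     # (rel, href) pairs
        self.meta = []      # (name or property, content) pairs

    def handle_starttag(self, tag, attrs):
        if self.done:
            return  # everything after the head is ignored
        if tag == "head":
            self.in_head = True
            return
        if tag == "body":
            self.done = True
            return
        if not self.in_head:
            return
        attributes = dict(attrs)
        if tag == "link" and attributes.get("href"):
            self.links.append((attributes.get("rel", ""), attributes["href"]))
        elif tag == "meta" and attributes.get("content"):
            key = attributes.get("name") or attributes.get("property")
            if key:
                self.meta.append((key, attributes["content"]))

    def handle_endtag(self, tag):
        if tag == "head":
            self.in_head = False
            self.done = True


# Made-up page used only to exercise the sketch.
SAMPLE_HTML = """<html>
<head>
  <title>Example paper</title>
  <link rel="canonical" href="https://example.org/paper/1">
  <meta name="citation_doi" content="10.1234/example">
</head>
<body>
  <a href="https://example.org/ignored">This link is never recorded.</a>
  <link rel="alternate" href="https://example.org/also-ignored">
</body>
</html>"""

parser = HeadMetadataParser()
parser.feed(SAMPLE_HTML)
print(parser.links)
print(parser.meta)
```

In this sketch, identifiers found in the head (a canonical URL and a DOI) are kept, while anything inside <body> is skipped.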

While CobaltmetricsBot is both harmless and beneficial, you might want to prevent it from accessing some resources on your website. The best way to do so is to add or update the robots.txt configuration of your website (see for more information). The full user-agent string is CobaltmetricsBot/1.0 (+ Where several user-agents are recognized in the robots.txt file, CobaltmetricsBot will follow the most specific one.
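Before deploying a robots.txt change, it can help to check how a rule set treats the CobaltmetricsBot user-agent. The sketch below uses Python's standard urllib.robotparser for that. The robots.txt content, the example.org URLs, and the is_allowed helper are hypothetical, and Python's parser only approximates how any particular crawler, CobaltmetricsBot included, interprets the file.

```python
from urllib.robotparser import RobotFileParser

# Hypothetical robots.txt: a catch-all group blocks every crawler from the
# whole site, while a dedicated group blocks CobaltmetricsBot from /private/ only.
ROBOTS_TXT = """\
User-agent: *
Disallow: /

User-agent: CobaltmetricsBot
Disallow: /private/
"""


def is_allowed(robots_txt: str, url: str, agent: str = "CobaltmetricsBot") -> bool:
    """Return True if `agent` may fetch `url` under the given robots.txt text."""
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    return parser.can_fetch(agent, url)


for url in ("https://example.org/public/page.html",
            "https://example.org/private/page.html"):
    print(url, is_allowed(ROBOTS_TXT, url))
```

Here the group that names CobaltmetricsBot takes precedence over the catch-all group, so only paths under /private/ are off limits to it, while a crawler with no dedicated group falls back to the catch-all rules.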


If you have any questions about CobaltmetricsBot, please contact us and we will respond as soon as possible.

If you think that CobaltmetricsBot does not obey your robots.txt configuration, please contact us with the affected URLs and the logs showing CobaltmetricsBot accessing pages that it was not supposed to, and we will work quickly to fix the issue.