Babbar.tech is operating a crawler service named Barkrowler which fuels and update our graph representation of the world wide web. This database and all the metrics we compute with are used to provide a set of online marketing and referencing tools for the SEO community.
Barkrowler crawls urls found on public pages and thus may be visiting each page which have been publicly cited somewhere.
Yes, we keep trying to crawl those pages just to be sure that a missing pages doesn't reflect a temporary state or a faulty web server.
Google introduce No follow links to let a site indicate that some pages must not be take into account when computing web metrics. But it doesn't prevent a bot to crawl these pages.
Yes. We respect robots.txt file (using crawler-commons tool set) and disallow directives. If you have the feeling that we do not respect your directive, please contact-us.
We have a politeness policy of 5 sec between two queries on the same host, and 2.5 sec between two queries on the same IP of the same domain. You can extend the crawl delay using the robots.txt file:
User-agent: barkrowler Crawl-Delay: [delayInSec]
Note that the crawl delay only applied for a given host. If a same web server is hosting websites with different domains, the rules above will apply. If your server is hosting a large number of website with a large number of separate domains, it'a unlikely but possible that several crawlers query the same server at a given time.
The robots.txt file allows you to disallow Barkrowler to crawl a part or the whole of your website using disallow directive. For example, to prevent the wordpress admin section to be access by Barkrowler:
User-agent: barkrowler Disallow: /wp-admin/
Crawled content are not stored in our database, we mainly kept links and meta-information about web pages. No nominative data are stored in our database neither.
Even if Barkrowler crawls web pages with a reasonable delay (2,5 or 5 sec between queries), It is sometimes mistaken for a DDOS or a Brute force attack. If we found an url containing session parameters, it could also be considered as a login attempt. For these reasons, Barkrowler may be temporary blacklisted. In this case, you may try to whitelist Barkrowler directly in your plugin, or contact us if you can't.