List of bots & crawlers

One of our clients has been blocking Google bots at server level for some time. From the moment this was introduced we have noticed a big drop in organic traffic in PiwikPRO. Our understanding is that PiwikPRO filters Googlebot traffic by default, so we should not expect to see such a drop. Is there a list available of the bots & crawlers that PiwikPRO filters out by default? This would help us troubleshoot the issue.

Hello @LoneWolves

I will ask if we have such a list of bots and crawlers, but I don’t think it’s going to be useful in this case.

Sounds like you blocked Googlebot from crawling the website, and maybe that had a direct impact on your organic traffic?

Thank you @jfidala! Do you not find it strange, then, that we see a drop in organic traffic in PiwikPRO, whereas PiwikPRO claims to already filter out Googlebot? If that really is the case, I would expect to see no such drop in organic traffic in PiwikPRO. Hence our request for such a list, so we can evaluate what exactly is being filtered and what is not. This would be extremely useful to us.

@LoneWolves That traffic could be normal organic traffic - not bot traffic. It depends on your overall setup, but blocking Googlebot could in some scenarios cause SEO issues and therefore lower or remove organic traffic.

I cannot advise on marketing and SEO, but I would suggest checking with your team whether pages are being indexed.

Bots wouldn’t be visible in organic traffic, and I don’t think there is any other way blocking bots could lower organic traffic in reports.

It seems we are saying the same thing: that blocking bots should not lower organic traffic in PWP reports. Yet it did. So I’m trying to understand how this can happen in PWP. If we can compare the list of what PWP is blocking with the list of what is blocked on the server, this will tell us what is going on. So can we please get such a list?

@LoneWolves
My theory is:

  • Blocking bots happened on server → Googlebot has no access to the website → Some other settings + lack of access by Googlebot might have deindexed pages → Organic traffic is lost because the website isn’t visible in Google → Organic traffic dropped in reports in Piwik PRO

  • If Blocking bots happened on server → Doesn’t change traffic in Piwik PRO, it should be already blocked, therefore no change in data.

  • If by any chance some new bot is accessing the website, it wouldn’t show up as organic Google traffic - more likely it would be assigned as direct / none.

I’ve been informed that we cannot share the official list of blocked bots for various technical and privacy reasons.

“If Blocking bots happened on server → Doesn’t change traffic in Piwik PRO, it should be already blocked, therefore no change in data.”

The whole thing is that the traffic DID change in PiwikPRO. About 1/3 of SEO traffic was lost. You can say it shouldn’t have, which was precisely my assumption, but it did change nonetheless. I have attached a PWP screenshot where you can clearly see organic traffic went down from September; we can even point to a specific date when the server restrictions were put in place. We are hoping you would help us understand what is going on here.

Yes, but that’s your organic traffic. It shouldn’t change bot traffic - as it’s not tracked.

We are hoping you would help us understand what is going on here.

The issue is unrelated to Piwik PRO. Most likely it’s an SEO issue, not a tracking issue.

You should use SEO tools to check whether indexing is working and verify your settings. As I mentioned before, I cannot advise you on SEO - but most likely you blocked Googlebot and lost traffic from google.com.

@LoneWolves there’s a very simple explanation for this. If you block Googlebot, Google will start to deindex your whole site and your organic traffic will tank to 0 in a matter of weeks :slight_smile:

PWP is showing you the traffic that comes from google, not googlebot traffic.

I appreciate you thinking along! But this is not quite what is happening here. Googlebot wasn’t blocked completely. Only the number of requests it could make was capped.
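For anyone wanting to check how much of Googlebot's crawling such a cap actually turns away, here is a minimal Python sketch that tallies HTTP status codes for requests whose user agent mentions Googlebot. It assumes the server writes access logs in the common "combined" format and that capped requests are answered with a status like 429; neither is confirmed in this thread. The sample log lines, IP addresses, and dates are invented for illustration.

```python
import re
from collections import Counter

# Assumes the "combined" access log format:
# ... "METHOD path PROTOCOL" status size "referer" "user-agent"
LINE_RE = re.compile(
    r'"\S+ \S+ \S+" (?P<status>\d{3}) \S+ "[^"]*" "(?P<agent>[^"]*)"'
)

def googlebot_status_counts(lines):
    """Count response status codes for requests claiming to be Googlebot."""
    counts = Counter()
    for line in lines:
        match = LINE_RE.search(line)
        if match and "Googlebot" in match.group("agent"):
            counts[match.group("status")] += 1
    return counts

# Invented sample lines, not taken from any real server.
GOOGLEBOT_UA = "Mozilla/5.0 (compatible; Googlebot/2.1; +http://www.google.com/bot.html)"
SAMPLE_LINES = [
    f'192.0.2.10 - - [01/Sep/2023:10:00:00 +0000] "GET /search?color=red HTTP/1.1" 200 5120 "-" "{GOOGLEBOT_UA}"',
    f'192.0.2.10 - - [01/Sep/2023:10:00:01 +0000] "GET /search?color=blue HTTP/1.1" 429 0 "-" "{GOOGLEBOT_UA}"',
    f'192.0.2.10 - - [01/Sep/2023:10:00:02 +0000] "GET /search?size=m HTTP/1.1" 429 0 "-" "{GOOGLEBOT_UA}"',
    '198.51.100.7 - - [01/Sep/2023:10:00:03 +0000] "GET / HTTP/1.1" 200 2048 "-" "Mozilla/5.0 (X11; Linux x86_64)"',
]

counts = googlebot_status_counts(SAMPLE_LINES)
for status, n in sorted(counts.items()):
    print(status, n)
```

A large share of rejected requests would be worth raising with whoever manages SEO. Note that the user-agent string can be spoofed, so this only counts requests that claim to be Googlebot.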

Difficult to say without seeing the case, but I’m wondering what the main reason behind capping Google was. I’ve been in the industry for over 20 years and have never seen a case where somebody came up with this idea, so I’m very curious :slight_smile:

@dave The website in question featured faceted search. What seems to have happened is that Googlebot was making a multitude of requests, each time with a different combination of parameters from the various search filters.
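To give a feel for how quickly filter combinations multiply, here is a small Python sketch that enumerates every URL a crawler could reach on a faceted search page. The facet names, values, and base URL are all hypothetical and not taken from the site in question.

```python
from itertools import product
from urllib.parse import urlencode

# Hypothetical facets; None means "this filter is not applied".
FACETS = {
    "color": [None, "red", "blue", "green"],
    "size": [None, "s", "m", "l", "xl"],
    "brand": [None, "acme", "globex", "initech"],
    "sort": [None, "price_asc", "price_desc"],
}

def faceted_urls(base, facets):
    """Return one URL per combination of facet values."""
    names = list(facets)
    urls = []
    for combo in product(*(facets[name] for name in names)):
        # Keep only the filters that are actually set in this combination.
        params = [(name, value) for name, value in zip(names, combo) if value is not None]
        query = urlencode(params)
        urls.append(base + ("?" + query if query else ""))
    return urls

urls = faceted_urls("https://example.com/search", FACETS)
print(len(urls))  # 4 * 5 * 4 * 3 = 240 crawlable URLs for a single listing page
```

Four small filters already yield 240 distinct URLs for what is essentially one page, and the count grows multiplicatively with every additional filter or value.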