The data collection limit in Piwik PRO is about to be exceeded because a bot is visiting the site and generating excluded events. Over 250,000 of these events were collected in February. I believe excluded events from crawlers should not be counted, considering that one of Piwik PRO’s advertised features is “ignoring spam traffic.” So I don’t understand why these events are included in the limit.
Also, why are prefetch requests counted toward the limit? I think those should not be counted either if they are excluded events.
Hi @analyticskeener,
Could you send me your Piwik PRO account name/URL in a private message?
Sure! Thank you for the quick reply!
Hello,
This still appears to be an issue, as I have a client website experiencing it:
PP’s bot detection is letting some bots/crawlers through.
This client’s site is for a local mover in Columbus, Ohio, and bot traffic accounts for the majority of events on his site.
So far this month, the site has 697 sessions from Germany (pretty much impossible given his very narrow geographic focus) and Council Bluffs, Iowa, which is a known Google crawler location.
Since Piwik PRO does not filter out such obvious bot/crawler traffic before processing, these are counted as events. Without them, his site’s total events would be under 25K per month, but with these bots included, the total was closer to 50K.
I looked at the process for excluding additional bots (https://help.piwik.pro/support/questions/how-can-i-ignore-traffic-from-bots-and-crawlers/), but the functionality for identifying the user agent string (something called “raw request”) seems to be missing; I only get an error message: “Personal data in the raw request has been replaced with tag.” Even if I had that info, I’m not sure it would stop Google, because it’s presenting itself as follows:
Devices & platform
Chrome 117.0, Windows 10, Desktop, 800x600
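
Since the user agent looks like an ordinary browser, the only signals I can see are location and screen resolution. Here is a rough sketch of how I would estimate the bot share from an exported list of sessions. The `Session` fields, the thresholds, and the sample rows are all my own assumptions (made-up data, not a Piwik PRO export format or API), so treat it as an illustration only:

```python
from dataclasses import dataclass


@dataclass
class Session:
    # Hypothetical fields; a real export would need to be mapped onto these.
    country: str
    city: str
    resolution: str
    events: int


# Assumptions based on this one client: a US-only service area,
# Council Bluffs as a known crawler location, 800x600 as an unlikely screen size.
EXPECTED_COUNTRIES = {"United States"}
SUSPECT_CITIES = {"Council Bluffs"}
SUSPECT_RESOLUTIONS = {"800x600"}


def looks_like_bot(s: Session) -> bool:
    """Flag a session if any of the heuristic signals matches."""
    return (
        s.country not in EXPECTED_COUNTRIES
        or s.city in SUSPECT_CITIES
        or s.resolution in SUSPECT_RESOLUTIONS
    )


def split_events(sessions):
    """Return (human_events, bot_events) totals for the given sessions."""
    bot = sum(s.events for s in sessions if looks_like_bot(s))
    human = sum(s.events for s in sessions if not looks_like_bot(s))
    return human, bot


# Made-up sample rows, only to show the shape of the calculation.
sample = [
    Session("United States", "Columbus", "1920x1080", 12),
    Session("Germany", "Frankfurt", "1366x768", 40),
    Session("United States", "Council Bluffs", "800x600", 25),
]
human_events, bot_events = split_events(sample)
print(f"human: {human_events}, suspected bot: {bot_events}")
```

This only measures the problem after the fact; the events would still have been collected and counted toward the limit.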