Hi,
in some accounts, I see a significant percentage of visitors using ISPs like “Google Cloud”, “Microsoft Corporation” or “Amazon.com”. For the most part, this can safely be assumed to be non-human traffic, and especially in “consent-free” setups, collecting hits from bots is quite likely. Additionally, the Browser dimension shows entries like “Headless Chrome”, and even “PhantomJS” turns up among many other suspicious entries.
While blocking tracking for suspicious User Agents directly in the browser is possible and can even be configured in the site settings, detecting bots by processing the IP information would require considerable effort and resources. It would be nice if there were a way to filter this data before such hits end up in reports. Could an exclusion list for ISPs, analogous to the existing one for IP addresses, be a solution?
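To illustrate what I have in mind, here is a minimal sketch of such a filter. All names in it (`EXCLUDED_ISPS`, `should_track`, the `isp` field) are hypothetical and just mirror how the existing IP exclusion behaves; the ISP value would presumably come from the usual geolocation lookup:

```python
# Hypothetical sketch of an ISP exclusion list, analogous to the existing
# IP exclusion. None of these names are actual tracker internals.

EXCLUDED_ISPS = {"Google Cloud", "Microsoft Corporation", "Amazon.com"}

def should_track(hit: dict) -> bool:
    """Drop a hit before it reaches the reports if its ISP is excluded."""
    return hit.get("isp", "") not in EXCLUDED_ISPS

# A hit resolved to "Google Cloud" would be discarded as likely bot traffic:
hit = {"ip": "34.64.0.1", "isp": "Google Cloud"}
print(should_track(hit))  # False -> hit is filtered out before reporting
```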
In the meantime I might have to use the option to exclude user agents quite extensively. A question regarding this feature: are agents like “Chrome-Lighthouse” redundant entries here? I assume this is one of the agents already blocked by the option “Don’t collect data from known crawlers”. Is there a public list of crawlers or user agents that are covered by this option?
Additionally, I wonder why I see “Headless Chrome” in the Browser dimension in reports, while the User Agent string contains “HeadlessChrome” without a space. If I want to exclude those hits, which version would be the correct one to enter in the settings? I entered both, just to be sure.
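My guess is that the exclusion setting does a plain substring match against the raw User Agent string, while “Headless Chrome” is just the normalized label shown in reports. A quick sketch of that assumption (the UA string below is an example, and the substring-match behaviour is my guess, not confirmed):

```python
# Assumes the exclusion setting does a plain substring match on the raw
# User Agent string -- my assumption, not confirmed behaviour.

RAW_UA = ("Mozilla/5.0 (X11; Linux x86_64) AppleWebKit/537.36 "
          "(KHTML, like Gecko) HeadlessChrome/119.0.0.0 Safari/537.36")

print("HeadlessChrome" in RAW_UA)   # True  -> matches the raw UA string
print("Headless Chrome" in RAW_UA)  # False -> the space only exists in reports
```

Under that assumption, only “HeadlessChrome” (without the space) would ever match, which is why entering both seemed like the safer choice.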
By the way: do hits from filtered bots/crawlers still count when calculating hit budgets?
best,
Markus