Discrepancy between Custom Report and CSV Export

Hey, I wanted to export some custom reports so other stakeholders can work with them in Google Sheets. When I pivot the data and aggregate metrics there, the numbers differ from what the UI shows.

What’s the reason for this? Is this something that can be mitigated?

Thanks

@kuba @Piotrek @piotr hey guys, I need to tackle this asap but need confirmation. Could you help me out? Thanks

There’s an approximation algorithm in place for some metrics. Check this article out. Let us know if the difference is higher than expected.

Thanks @piotr for helping me out with this

Here is one example for a specific country so far this month. The final total should be 2217. Does the approximation algorithm apply both inside and outside the UI?

ui_example

The main issue with the export right now is the following:

3204 visitors for that period and those dimensions

When I export it, the difference is way more than the 2% from the approximation algorithm. This is just one example, but in most of the tests I’ve been doing there is around 8-10% more data in Sheets.

@piotr do you guys have any extra input for this? Thanks

@Alejandro_Aboy Hi, could you share a link to your Piwik PRO instance with me (in a PM or here)? I have a hypothesis that I need to confirm.

Hey @Alejandro_Aboy so:

  1. The numbers in the CSV file are the same as in the custom report rows. It’s the calculated total in the Piwik PRO UI that is different.
  2. The custom report you are referring to looks at each day and counts visitors for that specific day.
    For example, you have 3 visitors on Monday, 2 visitors on Tuesday and 3 visitors on Wednesday (but those 3 are the same people that visited on Monday). When you count each day separately you will have 3+2+3=8 visitors (and that’s how you count it when you add all the numbers in the CSV export). But when you look at a date range from Monday to Wednesday within Piwik PRO, it will show you 5 visitors, because those from Wednesday and Monday are the same and we won’t count them twice → What are visitors and how are they counted? | Piwik PRO help center
  3. This explains why there is a very small difference when we look at a 1-day range (it’s within the 2% of the approximation algorithm) but a large difference when we look at longer date ranges. With each additional day the discrepancy will get bigger.
  4. Please note that the total number of sessions is the same in the Piwik PRO UI and when you add together all the rows with the session count.
    The difference in visitors is the effect of how we count visitors within a specified date range.

To make it as short as possible - in the Piwik PRO UI you will see the number of unique visitors for that range. When you add all the rows in the CSV file you will see the number of all visitors, so those who come back will be counted twice or more.
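Here is the Monday to Wednesday example as a small Python sketch, in case it helps to see the arithmetic. The visitor IDs are made up for illustration; only the counts (3, 2, 3) come from the example above.

```python
# Hypothetical visitor IDs per day, mirroring the example:
# 3 visitors on Monday, 2 on Tuesday, and the same 3 again on Wednesday.
daily_visitors = {
    "Monday": {"a", "b", "c"},
    "Tuesday": {"d", "e"},
    "Wednesday": {"a", "b", "c"},  # same people as on Monday
}

# What you get when you add up the per-day rows of a CSV export.
sum_of_daily_uniques = sum(len(ids) for ids in daily_visitors.values())

# What a date-range total shows: unique visitors across the whole range.
unique_in_range = len(set().union(*daily_visitors.values()))

print(sum_of_daily_uniques)  # 8
print(unique_in_range)       # 5
```

The gap between the two numbers grows with every extra day in which returning visitors show up again.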

Thanks a lot Michal! So the best alternative approach would be calling the API for a “VisitorID - date - country - channel - visitors - sessions” table and computing the unique visitors from it, right?

Visitor ID for the returning clients would be the same as during the first visit so it could help you count unique visitors :slight_smile:
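As a rough sketch of that idea, assuming you already have the rows exported with a visitor ID column: the column names and sample rows below are hypothetical and not the actual Piwik PRO API schema. Visitors are deduplicated by ID over the whole range, while sessions are simply summed.

```python
import csv
import io
from collections import defaultdict

# Hypothetical export; column names and values are assumptions for illustration.
RAW = """visitor_id,date,country,channel,sessions
v1,2023-05-01,ES,organic,1
v2,2023-05-01,ES,direct,1
v1,2023-05-03,ES,organic,2
v3,2023-05-03,FR,direct,1
"""

def unique_visitors_by_country(csv_text):
    """Return {country: (unique_visitors, total_sessions)} for the whole export."""
    visitors = defaultdict(set)   # country -> set of distinct visitor IDs
    sessions = defaultdict(int)   # country -> summed session count
    for row in csv.DictReader(io.StringIO(csv_text)):
        visitors[row["country"]].add(row["visitor_id"])
        sessions[row["country"]] += int(row["sessions"])
    return {country: (len(visitors[country]), sessions[country]) for country in visitors}

result = unique_visitors_by_country(RAW)
print(result)  # {'ES': (2, 4), 'FR': (1, 1)}
```

Note that v1 appears on two days but is counted once for ES, which is the behaviour you want when matching a date-range total.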
