Piwik attribution: event sent from BQ database always Direct/Direct

Hi everyone,

We are sending an event from our BQ (BigQuery) database for our main goal: a trial becoming a paying customer, i.e. subscribing.

Mission: to get the first-touch source/medium (and MRR) of our paying accounts. We have an activation goal (trials or signups) and want to understand its acquisition channels. Then, once a signup has converted into a paying customer, we’d like to see which acquisition channels work best for that precious moment, which is ultimately the one that matters: one source/medium might bring more signups than another, while the other converts better into paying customers.

Insightful metrics: what was the first-touch source/medium of a funnel ending with a trial and/or a subscription (hooray)? We don’t want the last-touch attribution model here; we want first touch, or even time decay or linear.

Note: the signup happens on a landing page on a different domain from the commercial site. People come from a given source/medium to the commercial site and, by clicking on various CTAs there, land on the signup form page on the other domain. A trial may then become a paying customer (subscription) inside the app, or not; all of that is logged in BQ.

Issue: since day 1, however, the app subscription event has always shown Direct/Direct, no matter what.

The subscription to our app usually happens outside of any session so is Piwik simply considering that the source/medium for that session is Direct/Direct then in the reports?

Has anyone running a SaaS on Piwik been in a similar situation and solved it, or can anyone hint at how to do it successfully? This is really making us consider whether to stay on Piwik or switch, since these metrics are key to successful marketing decisions :frowning:

Thanks as always

The subscription to our app usually happens outside of any session so is Piwik simply considering that the source/medium for that session is Direct/Direct then in the reports?

Correct. To the web analytics tool, the event sent from BQ comes from a new visitor, as it doesn’t have any cookie or visitor ID attached (there’s a workaround described below). It’s quite natural that the source/medium is considered direct, since the request has neither a referrer nor UTM tags.

To get the data you need in the attribution report, you would need to find a way to retrieve a visitor ID from past sessions (e.g. via the raw data API) and pass it in the _id parameter (docs) of the request sent from BigQuery.
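As a rough sketch of that idea, the request built by the backend could carry the stored visitor ID in `_id`. The host name, site ID, `ppms.php` endpoint path and event names below are assumptions for illustration only, and the other parameter names (`idsite`, `rec`, `uid`, `e_c`, `e_a`) should be checked against the HTTP API docs before use.

```python
from urllib.parse import urlencode

# Hypothetical values: replace with your own instance and site ID.
TRACKER_ENDPOINT = "https://example.piwik.pro/ppms.php"  # assumed endpoint path
SITE_ID = "00000000-0000-0000-0000-000000000000"  # placeholder site ID


def build_event_request(visitor_id: str, user_id: str, category: str, action: str) -> str:
    """Return a tracking URL that reuses a visitor ID from a past session."""
    # Assumption: visitor IDs are 16-character hexadecimal strings.
    is_hex = all(c in "0123456789abcdef" for c in visitor_id.lower())
    if len(visitor_id) != 16 or not is_hex:
        raise ValueError("visitor ID is expected to be 16 hexadecimal characters")
    params = {
        "idsite": SITE_ID,
        "rec": "1",
        "_id": visitor_id,  # visitor ID retrieved from a past session
        "uid": user_id,     # application user ID
        "e_c": category,    # event category
        "e_a": action,      # event action
    }
    return f"{TRACKER_ENDPOINT}?{urlencode(params)}"
```

The function only builds the URL, so it can be logged and inspected before anything is actually sent.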

The foundation of the attribution report is having user paths that consist of many sessions with the same visitor ID. (Side note: in a world of solutions like ITP, this is becoming more challenging.)

Hi @Jarek ,

My name is Yves, and I am Elena’s colleague.

What you describe is exactly what we did. So here is a typical case and maybe you can highlight what might be wrong or if there is a limitation in how Piwik behaves.

Here is the full technical implementation.

We have a commercial domain: https://beebole.com and an app domain https://beebole-apps.com

The Piwik JS is installed on the commercial domain and on the Signup/Signin pages of the app domain.

All events on the commercial website are tracked using the Piwik JS, then when someone decides to sign up for a trial, the Visitor ID is retrieved using this method:

And then sent with a URL decoration:

To make sure the visitor ID is correctly passed through the two domains.

On the Signup page, once the signup form is completed, the Visitor ID is retrieved from the Piwik cookie and sent to our back end to associate it with the newly created User ID.

We then store the signup event with both IDs, and that event is sent as a backend event to Piwik via the API.

It usually works well, with the limitation of 3rd party cookies, …

Here is an example:

The source and medium is there, the back-end event appears correctly, …

Now, the trial usually lasts 30 days.

If an account becomes a customer, it triggers another event, which is then also stored in an event BQ table along with the User ID and the first instance of the Visitor ID we have for that user (typically the Visitor ID that was there when the person signed up).

Here is an example of that table where you can see the two events, with the same user ID and visitor ID:

The become_customer event is then also sent via the API to Piwik.

But if I look for the example above, here is what I see for the signup and for the subscribed events:

The Signup attribution report will correctly identify the source/medium.

The subscribed event just shows as direct/direct, despite the visitor ID being sent with it and being the same as for the signup event.

So, is there something to change?


Ping @Jarek, so you get a notification

Both events, signup and subscription, are sent the same way but in different time windows, which is crucial.

The signup event is sent while the visitor session started on the web is still open, so the event sent from the backend is attached to that session and gets the session’s source/medium.

The subscription event is sent many days later and creates a new session. It gets a direct source/medium because the event itself doesn’t include any information that could be used to classify the traffic otherwise.
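The timing difference can be pictured with a deliberately simplified model. The 30-minute inactivity timeout below is an assumption (check the session settings of your own instance), and the function is purely illustrative, not how the tracker is actually implemented.

```python
from datetime import datetime, timedelta

# Assumed inactivity timeout; the real value depends on your configuration.
SESSION_TIMEOUT = timedelta(minutes=30)


def lands_in_existing_session(last_activity: datetime, event_time: datetime) -> bool:
    """Return True if a backend event arrives while the web session is still open."""
    gap = event_time - last_activity
    return timedelta(0) <= gap <= SESSION_TIMEOUT


last_page_view = datetime(2023, 1, 1, 12, 0)
signup_event = last_page_view + timedelta(minutes=5)
subscription_event = last_page_view + timedelta(days=30)

# The signup joins the open session and inherits its source/medium;
# the subscription starts a new session with nothing to classify it by.
print(lands_in_existing_session(last_page_view, signup_event))
print(lands_in_existing_session(last_page_view, subscription_event))
```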

The following alternative options come to my mind for how to tackle that:

  1. Besides the visitor ID and User ID, also store the source and medium of the “signup session” in the DB. When firing the subscription event, pass them as UTM tags: send the url parameter with the request, and make sure that URL includes utm_source and utm_medium. As a result, the session with the subscription event should have the same source/medium as the session with the signup event.
    Please note that I haven’t tested this, so treat it as a direction rather than a ready-to-go solution.

  2. Set up a goal conversion for the subscription event and use the attribution report. It will show you the user paths leading to a conversion, so even though the last session is a direct entry, you will be able to see the source/medium of the past sessions.
    Remember to set the lookback window higher than 30 days to include the sessions you need.
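Option 1 could be sketched roughly as follows. Like the suggestion itself, this is untested against a real instance; the page URL and the helper names are made up for illustration, and only `url`, `_id`, `utm_source` and `utm_medium` come from the discussion above.

```python
from urllib.parse import parse_qsl, urlencode, urlsplit, urlunsplit


def decorate_with_utm(page_url: str, source: str, medium: str) -> str:
    """Append utm_source and utm_medium to a URL, keeping other query parameters."""
    parts = urlsplit(page_url)
    # Drop any existing utm_source/utm_medium so the stored values win.
    query = [
        (key, value)
        for key, value in parse_qsl(parts.query, keep_blank_values=True)
        if key not in ("utm_source", "utm_medium")
    ]
    query += [("utm_source", source), ("utm_medium", medium)]
    return urlunsplit(parts._replace(query=urlencode(query)))


def subscription_event_params(visitor_id: str, source: str, medium: str) -> dict:
    """Build the extra request parameters for the subscription event (hypothetical helper)."""
    # Hypothetical page URL; any URL on the tracked app domain would do.
    page_url = "https://beebole-apps.com/subscribed"
    return {
        "_id": visitor_id,
        "url": decorate_with_utm(page_url, source, medium),
    }
```

For example, `decorate_with_utm("https://beebole-apps.com/subscribed", "google", "cpc")` yields a URL ending in `?utm_source=google&utm_medium=cpc`, which is what the new session would be classified by.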

Additional remark that can be relevant to your setup and especially option 2: once the User ID is tracked it will cause a reset of the Visitor ID:

  1. Visitor enters the page, visitor_id=abc
  2. Visitor signs up, User ID is assigned, user_id=123 and visitor_id=def (previously set visitor_id is overwritten).
    When using getVisitorID() function, you won’t see the change, it can be seen in the Tracker Debugger.

This is an issue if the visitor had many sessions before signing up, when the User ID wasn’t yet provided. Those sessions will have a different Visitor ID from the session in which the User ID was provided.

Hi @Jarek,

Well, Option 2 is what I also tried in the past.

The goal was created months ago, but when I run an attribution report on the Subscribed goal, the source/medium always remains direct/direct.

Unless I am using it the wrong way…

the source/medium always remains direct/direct

Which step in the user path are you referring to? If it’s the last one, then it will always be direct. The expectation is that the first-touch session has a source/medium other than direct.
Did you set the lookback window to greater than 30 days?

Just chiming in, @Jarek: if I’m not mistaken, I had actually checked this when following your steps, and it was still direct.


I have the feeling that your additional remark about visitor ID being changed when identified as user_id is the key.

I will now force the Visitor ID in the “subscribed” event to the first session value in the hope it will bring some “glue” to these and allow Piwik to link them to the initial source/medium.

The API call was made using only the user_id and indeed, when I look at the raw data, the visitor ID seems to change. As I understand it, the user_id alone is not enough for Piwik to make the link.


Correct, the attribution report calculates user paths based on the Visitor ID.
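The fix discussed above, forcing the subscribed event to carry the first visitor ID seen for a user, might look something like this on the backend. The row layout (`user_id`, `visitor_id`, `ts`) is a hypothetical stand-in for the BQ event table, which isn’t shown in the thread.

```python
from datetime import datetime


def first_visitor_ids(events: list[dict]) -> dict[str, str]:
    """Map each user_id to the visitor_id of that user's earliest event."""
    first: dict[str, str] = {}
    for event in sorted(events, key=lambda e: e["ts"]):
        # setdefault keeps the first (earliest) visitor ID and ignores later ones.
        first.setdefault(event["user_id"], event["visitor_id"])
    return first


# Hypothetical rows: the visitor ID changed once the User ID was tracked.
rows = [
    {"user_id": "123", "visitor_id": "def", "ts": datetime(2023, 2, 1), "event": "become_customer"},
    {"user_id": "123", "visitor_id": "abc", "ts": datetime(2023, 1, 1), "event": "signup"},
]

visitor_for_user = first_visitor_ids(rows)
# The subscribed event for user 123 would then be sent with visitor ID "abc".
```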