Quick Start Guide for Querying Google Analytics 4 Data API (Beta) with Python

Richard Dennis, April 14th, 2021

Last year our generous corporate overlord Google graced us with a completely new Analytics platform: GA4.

This comes with a lot of new and missing features making it very different to the now unsupported and outdated Universal Analytics, which one day will take its place in the dense Google graveyard along with Google+ (good riddance), Google Play Music (goodnight sweet prince), and Hangouts (meh).

In March 2021 they threw us a bone, the Google Analytics Data API, which now allows developers and the like to programmatically pull reporting data directly from GA4 using Java, JS, Python, & .NET. I’ll be showing you how to dig this out dead quick just using Python. Obviously Google has documentation on this themselves, but if you’ve never used these docs before, or want to save yourself the trouble, I’ll be laying it out plain and simple here.

Big caveat: since this API is right at the starting line as I write this, things are subject to change as it develops. Most likely that means the way the Python modules are imported, but it could be anything.

Project Setup

If you haven’t used Python before, or are just starting out, we'll be using python3, pip, and virtualenv. If those words mean very little, I suggest starting with ‘Step 1.’ of my Universal Analytics API post. I also have a video tutorial.

In terms of how I work, I use VS Code for writing the script, with PowerShell to activate my environment and run scripts. If you use a dedicated Python IDE like Spyder, or a distribution like Anaconda, these will work too, and I’m going to just assume you know how to set the project and environment up there.

I created a project folder called ‘ga4_api_project’ in my development folder, and alongside this a ‘venv’ folder for my environment. Take a look.

[Screenshot: ga4_project_folder]

Now open up your development folder in your command line, navigate to ‘venv’, and create a new virtual environment. I called mine ‘ga4_env’. It’s best not to reuse an existing environment that has Universal Analytics API modules in it, as it might break:

Dev> cd .\venv\

venv> virtualenv ga4_env

Then I entered my new environment using the ‘activate’ script in its ‘Scripts’ folder.

venv> .\ga4_env\Scripts\activate

You should have a ‘(ga4_env)’ next to your directory now. This means we can install the new Google Analytics Data API module so our script can use its functions. Use the following command to do it.

venv(ga4_env) > python -m pip install google-analytics-data

We can navigate to the project directory now that the environment is all ready to go. Leave this command line open, as we will need it later.

venv(ga4_env) > cd ..

Dev(ga4_env) > cd .\ga4_api_project\
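If you want to double-check that the environment really is active before going further, here is a minimal sketch you could run with the same `python` command. The function name `in_virtual_environment` is my own for illustration, and the check assumes the usual behaviour of virtualenv and venv: older virtualenv releases set `sys.real_prefix`, while newer ones make `sys.prefix` differ from `sys.base_prefix`.

```python
import sys


def in_virtual_environment():
    """Return True if this interpreter appears to run inside a virtual environment."""
    # Older virtualenv releases set sys.real_prefix; venv and newer virtualenv
    # releases make sys.prefix differ from sys.base_prefix instead.
    if hasattr(sys, "real_prefix"):
        return True
    return sys.prefix != getattr(sys, "base_prefix", sys.prefix)


if __name__ == "__main__":
    # sys.executable should point inside the ga4_env folder when it is active.
    print("Interpreter:", sys.executable)
    print("Inside a virtual environment:", in_virtual_environment())
```

If the interpreter path does not sit inside your ‘ga4_env’ folder, the environment is probably not activated in that command line.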

API Access & Cloud Credentials

To give our script the ability to use our GA4 data, we have to enable the API in a Google Cloud project, then give that project access to our GA4 property, the same way you would a standard user.

New Google Cloud Project

Using APIs & Services > Library in the left navigation bar, search for and enable the ‘Google Analytics Data API’.

[Screenshot: analytics api enable]

Now, under APIs & Services > Credentials, create a new service account and download the JSON file when prompted. This acts as a key for our script.

[Screenshot: JSON Key Creator Screen]

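Put this new JSON file in your recently created project folder, and rename it ‘client_secrets.json’. This contains a ‘client_email’ field. You can also find this address in the Service Accounts section of Google Cloud; it’ll end with ‘gserviceaccount.com’, for example ‘@appspot.gserviceaccount.com’ for the App Engine default account or ‘.iam.gserviceaccount.com’ for a service account you created yourself.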
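If you would rather not open the key file by hand, a few lines of Python can pull the address out for you. This is only a sketch: the helper name `service_account_email` is made up for this post, and it assumes the file is a standard service account key with a `client_email` field.

```python
import json


def service_account_email(key_path="client_secrets.json"):
    """Return the client_email stored in a service account key file."""
    with open(key_path, encoding="utf-8") as key_file:
        key = json.load(key_file)
    # A KeyError here suggests the file is not a service account key.
    return key["client_email"]
```

Running `print(service_account_email())` from the project folder should show the address you need for the next step.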

In your Google Analytics 4 account, go to Admin > Property User Management, and add the service account’s ‘client_email’ address as a User with at least read level permissions. You’ll also want to take a copy of your ‘Property ID’ from Admin > Property Settings.
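One easy slip here is copying the wrong identifier. The GA4 Property ID is a plain number, whereas a measurement ID starts with ‘G-’ and a Universal Analytics tracking ID starts with ‘UA-’. The sketch below guards against that and builds the ‘properties/...’ string the script in this post passes to the API. The function name `property_resource_name` is hypothetical, not part of Google’s library.

```python
def property_resource_name(property_id):
    """Build the 'properties/<id>' string, rejecting non-numeric IDs."""
    cleaned = str(property_id).strip()
    # GA4 Property IDs are digits only; 'G-...' and 'UA-...' values are not.
    if not (cleaned.isascii() and cleaned.isdigit()):
        raise ValueError(
            f"Expected a numeric GA4 Property ID, got {property_id!r}"
        )
    return f"properties/{cleaned}"
```

For example, `property_resource_name("123456789")` gives `"properties/123456789"`, while passing a measurement ID raises a `ValueError` before any request is sent.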

Script Authentication

It’s programming time. Although half of programming is fiddling within platforms, this is the actual fun part: building and running the script. We’ll start by adding a new Python file to our project called ‘main.py’. If you’re using VS Code, you can do this with a command.

ga4_api_project(ga4_env) > code main.py

Now that Visual Studio Code is open, we’ll start adding code. Because this API is in development, this authentication part will likely change. To get the most up to date module paths for importing the functions, go to Google’s own quickstart guide here. As of writing this, the top of your file will need the following imports.

from google.analytics.data_v1beta import BetaAnalyticsDataClient
from google.analytics.data_v1beta.types import DateRange
from google.analytics.data_v1beta.types import Dimension
from google.analytics.data_v1beta.types import Metric
from google.analytics.data_v1beta.types import RunReportRequest
import os

We imported os because we need to set an environment variable for our client, which shows it where our credentials JSON file is. Set this variable, and begin the function using the following code.

os.environ['GOOGLE_APPLICATION_CREDENTIALS'] = 'client_secrets.json'

def sample_run_report(property_id="YOUR-GA4-PROPERTY-ID"):
  """Runs a simple report on a Google Analytics 4 property."""
  client = BetaAnalyticsDataClient()

Replace the property_id string with your own. Now we have the above in place, we can begin using the functions in the new API and request our data. We’ll be doing this with a basic RunReportRequest, which needs a list of the dimensions and metrics we want to pull, as well as a date range to get them for.

  request = RunReportRequest(
    property=f"properties/{property_id}",
    dimensions=[Dimension(name="city")],
    metrics=[Metric(name="activeUsers")],
    date_ranges=[DateRange(start_date="2020-03-31", end_date="today")],
    )

  response = client.run_report(request)    
  print("Report result:")
  for row in response.rows:
    print(row.dimension_values[0].value, row.metric_values[0].value)

if __name__ == "__main__":
    sample_run_report()

We now have something that requests the number of active users by city, from 31 March 2020 to today. You’re free to change these dimensions, metrics, and date ranges as you see fit; you can find the schema for these here. The final if statement is the standard Python way of calling a function when the file is run as a script. The code above it only defines sample_run_report, so without this call the report won’t run.
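On date ranges: the script uses a fixed ‘YYYY-MM-DD’ start date, and the API also accepts relative values such as ‘yesterday’ or ‘7daysAgo’. If you need something more specific, like the previous full calendar month, you can work the dates out with the standard library and pass the strings in. A rough sketch, with a helper name (`previous_month_range`) I’ve invented for illustration:

```python
import datetime


def previous_month_range(today=None):
    """Return (start_date, end_date) strings covering the previous calendar month."""
    today = today or datetime.date.today()
    # The day before the first of this month is the last day of last month.
    last_day = today.replace(day=1) - datetime.timedelta(days=1)
    first_day = last_day.replace(day=1)
    return first_day.isoformat(), last_day.isoformat()
```

You could then write `start, end = previous_month_range()` and use `DateRange(start_date=start, end_date=end)` in the request.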

The Finale

Here’s the whole script for you to copy and paste in one go.

from google.analytics.data_v1beta import BetaAnalyticsDataClient
from google.analytics.data_v1beta.types import DateRange
from google.analytics.data_v1beta.types import Dimension
from google.analytics.data_v1beta.types import Metric
from google.analytics.data_v1beta.types import RunReportRequest
import os

os.environ['GOOGLE_APPLICATION_CREDENTIALS'] = 'client_secrets.json'

def sample_run_report(property_id="YOUR-GA4-PROPERTY-ID"):
  """Runs a simple report on a Google Analytics 4 property."""
  client = BetaAnalyticsDataClient()

  request = RunReportRequest(
    property=f"properties/{property_id}",
    dimensions=[Dimension(name="city")],
    metrics=[Metric(name="activeUsers")],
    date_ranges=[DateRange(start_date="2020-03-31", end_date="today")],
    )

  response = client.run_report(request)

  print("Report result:")
  for row in response.rows:
    print(row.dimension_values[0].value, row.metric_values[0].value)

if __name__ == "__main__":
  sample_run_report()

Once you save your main.py file, you can now run this in the environment you have open like so.

ga4_api_project(ga4_env) > python main.py

This should print your dimension and metric values as a list. If you get some kind of error, I’d recommend going back through these steps and making sure you’ve done everything to the letter. The most common issue I’ve encountered is with authentication. Make sure you’re using the most up to date version of the API module, which you can do by adding ‘--upgrade’ to the end of the pip install command above. I would also make sure your JSON key file is in the right place, and that the service account it is linked to has the right GA4 access.
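To see which version of the module you actually have in the environment, a short check like the one below can help. It is a sketch that assumes Python 3.8 or newer (where `importlib.metadata` is in the standard library), and `installed_version` is just a name I picked.

```python
from importlib import metadata  # in the standard library from Python 3.8


def installed_version(package_name="google-analytics-data"):
    """Return the installed version of a pip package, or None if it is missing."""
    try:
        return metadata.version(package_name)
    except metadata.PackageNotFoundError:
        return None


if __name__ == "__main__":
    version = installed_version()
    print("google-analytics-data:", version or "not installed in this environment")
```

If this reports that the package is not installed, the script is most likely being run outside the ‘ga4_env’ environment.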

Once you’re getting data, you can start building on this script. Maybe you need it as a CSV, in which case I would recommend using the pandas package to turn this report into a dataframe, which will be much easier to transform and save. Alternatively, if you are really struggling, you could opt for a solution like Supermetrics, which now has a GA4 connector for Google Sheets.
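If all you need is the CSV and you don’t want to install pandas yet, the standard-library csv module is enough. Note this swaps in csv for the pandas route suggested above. It is a sketch, not the official way to do it: `write_report_csv` is a made-up helper, and it assumes the response exposes `dimension_headers` and `metric_headers` (each with a `name`) alongside the `rows` used in the script.

```python
import csv


def write_report_csv(response, csv_path):
    """Write a run_report response to a CSV file, one line per report row."""
    # Column names come from the headers returned with the report.
    header = [h.name for h in response.dimension_headers]
    header += [h.name for h in response.metric_headers]

    with open(csv_path, "w", newline="", encoding="utf-8") as csv_file:
        writer = csv.writer(csv_file)
        writer.writerow(header)
        for row in response.rows:
            # Unlike the print loop, this keeps every dimension and metric.
            values = [d.value for d in row.dimension_values]
            values += [m.value for m in row.metric_values]
            writer.writerow(values)
    return len(response.rows)
```

Inside sample_run_report you could call `write_report_csv(response, "report.csv")` right after `client.run_report(request)`. The values arrive as strings, so convert the metric columns to numbers if you plan to do any maths on them.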

Thank you so much for getting this far, and congrats if you got something out of this. Leave a comment below if you have any questions. You can also shoot me a message on my LinkedIn, or email richard@bedrock42.com. If you want to get updates on our latest posts, subscribe to our newsletter!