Access Google Analytics API in Python

Get your Google Analytics data and build your own charts in Python with Authlib.

Google provides an official library for fetching data from Google Analytics, and you can always use it by following the official guide. But if you want to use Authlib, or if you want to understand what is going on behind those libraries, read on.

Accessing the Google Analytics API with Authlib is basically the same as using requests, the "HTTP for Humans" library.

What is the Google Analytics API

Basically, it is an OAuth-authenticated POST request to the v4 reporting API:

POST /v4/reports:batchGet HTTP/1.1
Host: analyticsreporting.googleapis.com
Content-Type: application/json
Authorization: Bearer string-of-token

{
  "reportRequests": [
    {
      "viewId": "XXXX",
      "dateRanges": [{"startDate": "2014-11-01", "endDate": "2014-11-30"}],
      "metrics": [{"expression": "ga:users"}]
    }
  ]
}

The main difficulty is getting the OAuth token. The official Python client can handle that for you, but getting the access token with Authlib makes it easier to understand what is happening.

How to Get Token

Authlib has just released version 0.7, which provides AssertionSession, a client implementation of RFC 7523. Google's so-called service account is actually the "JWT for Authorization Grants" flow from that RFC: you get a bearer token by sending grant_type=urn:ietf:params:oauth:grant-type:jwt-bearer together with a JWT assertion.

POST /token.oauth2 HTTP/1.1
Host: authz.example.net
Content-Type: application/x-www-form-urlencoded

grant_type=urn%3Aietf%3Aparams%3Aoauth%3Agrant-type%3Ajwt-bearer
&assertion=eyJhbGciOiJFUzI1NiIsImtpZCI6IjE2In0.eyJpc3Mi[...omitted for brevity...]

This is exactly what AssertionSession does: it fetches a valid OAuth token automatically and prepares a requests session for you to use. Yes, it is just a requests session.
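To make the grant more concrete, here is a minimal sketch of what goes into the assertion payload and the token request body, using only the standard library. It deliberately stops before signing, since RS256 signing needs a crypto library and is part of what AssertionSession takes care of. The helper names build_claims and build_token_request_body are hypothetical, and the one-hour lifetime is an assumption; the claim names (iss, aud, iat, exp, sub) are the standard JWT ones.

```python
import time
from urllib.parse import urlencode

JWT_BEARER_GRANT = 'urn:ietf:params:oauth:grant-type:jwt-bearer'


def build_claims(issuer, token_endpoint, scope, subject=None, lifetime=3600, now=None):
    """Return the JWT payload for a service-account style assertion.

    Hypothetical helper for illustration; `lifetime` (seconds) is an assumption.
    """
    issued_at = int(time.time()) if now is None else int(now)
    claims = {
        'iss': issuer,            # who is asking for the token
        'aud': token_endpoint,    # the server the assertion is meant for
        'iat': issued_at,         # when the assertion was created
        'exp': issued_at + lifetime,
        'scope': scope,           # Google puts the scope inside the payload
    }
    if subject:
        claims['sub'] = subject
    return claims


def build_token_request_body(signed_assertion):
    """Form-encode the body of the token request shown above."""
    return urlencode({
        'grant_type': JWT_BEARER_GRANT,
        'assertion': signed_assertion,
    })
```

The signed JWT (header, payload and signature joined by dots) is what would be passed to build_token_request_body; with AssertionSession you never need to do any of this by hand.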

Build an Example

Let's take a look at what we are going to build: a Google Analytics chart in the Typlog dashboard.

[Image: Typlog Analytics]

First, we need a Google service account, which can be created on the Service accounts page. Then we can use the JSON file it provides to create an AssertionSession.

import json
from authlib.integrations.requests_client import AssertionSession


def create_assertion_session(conf_file, scope, subject=None):
    with open(conf_file, 'r') as f:
        conf = json.load(f)

    token_endpoint = conf['token_uri']
    issuer = conf['client_email']
    key = conf['private_key']
    key_id = conf.get('private_key_id')

    header = {'alg': 'RS256'}
    if key_id:
        header['kid'] = key_id

    # Google puts scope in payload
    claims = {'scope': scope}
    return AssertionSession(
        token_endpoint=token_endpoint,
        issuer=issuer,
        claims=claims,
        subject=subject,
        key=key,
        header=header,
    )


session = create_assertion_session('your-google-conf.json', 'https://www.googleapis.com/auth/analytics.readonly')

This session is a requests session, so it has all the usual requests methods such as get, post, and put. The next step is to build the POST payload and send it:

report = {
    'viewId': 'XXX',
    'dateRanges': [
        {'startDate': '2018-04-01', 'endDate': '2018-05-01'},
    ],
    'metrics': [
        {'expression': 'ga:pageviews'},
        {'expression': 'ga:sessions'},
        {'expression': 'ga:users'},
    ],
    'dimensions': [
        {'name': 'ga:date'}
    ],
}

BATCH_GET_URL = 'https://analyticsreporting.googleapis.com/v4/reports:batchGet'
resp = session.post(BATCH_GET_URL, json={'reportRequests': [report]})
print(resp.json())

You can return the JSON response to the browser; the last step is to build a chart in JavaScript from it. I'm using chart.js, but other libraries work too.
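If you would rather reshape the data on the server before sending it to the browser, something like the sketch below could work. It assumes the v4 response layout as I understand it from the API reference (columnHeader.metricHeader.metricHeaderEntries for metric names, data.rows with dimensions and metrics[].values), a single date range, and a single ga:date dimension. The function name report_to_chart_data is hypothetical; the labels/datasets shape is the one chart.js expects for its data option.

```python
def report_to_chart_data(report):
    """Turn one report from a batchGet response into labels plus datasets.

    Hypothetical helper; assumes one dimension and one date range.
    """
    header = report['columnHeader']
    metric_names = [
        entry['name']
        for entry in header['metricHeader']['metricHeaderEntries']
    ]
    # A report with no matching data may come back without any rows.
    rows = report.get('data', {}).get('rows', [])

    # With a single ga:date dimension, each row's first dimension is the label.
    labels = [row['dimensions'][0] for row in rows]

    datasets = []
    for index, name in enumerate(metric_names):
        # metrics[0] belongs to the first (here: only) date range;
        # values arrive as strings, so convert them to numbers.
        values = [float(row['metrics'][0]['values'][index]) for row in rows]
        datasets.append({'label': name, 'data': values})
    return {'labels': labels, 'datasets': datasets}
```

With the request above, the call would look like report_to_chart_data(resp.json()['reports'][0]), and the result can be serialized to JSON and handed to the chart as its data.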

Build reportRequests

The report in the section above is the payload sent to the Google Analytics API. But how do you create such a payload? The official documentation explains it in full; here are some examples of the requests used in Typlog.

Analytics data for a single post

# post_page_path, site_id, view_id, start and end are supplied by the application.
post_filter = {
    'dimensionName': 'ga:pagePath',
    'operator': 'EXACT',
    'expressions': post_page_path
}
site_filter = {
    'dimensionName': 'ga:dimension1',
    'operator': 'EXACT',
    'expressions': site_id
}

# Settings shared by both reports: view, date range and filters.
base_report = {
    'viewId': view_id,
    'dateRanges': [
        {'startDate': start, 'endDate': end},
    ],
    'dimensionFilterClauses': {
        'filters': [site_filter, post_filter],
        'operator': 'AND'
    }
}

visit_report = {
    'metrics': [
        {'expression': 'ga:pageviews'},
        {'expression': 'ga:sessions'},
    ],
    'dimensions': [
        {'name': 'ga:date'}
    ]
}
referrer_report = {
    'metrics': [
        {'expression': 'ga:pageviews'},
        {'expression': 'ga:sessions'},
    ],
    'dimensions': [
        {'name': 'ga:fullReferrer'}
    ]
}

# Merge the shared settings into each report, then send both in one request.
visit_report.update(base_report)
referrer_report.update(base_report)
reports = [visit_report, referrer_report]
resp = session.post(BATCH_GET_URL, json={'reportRequests': reports})

This example sends two reports: visit_report is used to draw a chart of visits, and referrer_report is used to build a table of referrers.
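If you end up building many such reports, a small helper can keep the shared settings in one place. The sketch below is one possible way to do it, not how Typlog does it; make_report and build_post_reports are hypothetical names. It copies the base dictionary instead of calling update on each report, so the shared settings are never modified.

```python
def make_report(base, metrics, dimensions):
    """Build one reportRequests entry from shared settings plus its own columns."""
    report = dict(base)  # shallow copy, so the shared base is left untouched
    report['metrics'] = [{'expression': name} for name in metrics]
    report['dimensions'] = [{'name': name} for name in dimensions]
    return report


def build_post_reports(base):
    """Return the visit report followed by the referrer report."""
    shared_metrics = ['ga:pageviews', 'ga:sessions']
    return [
        make_report(base, shared_metrics, ['ga:date']),
        make_report(base, shared_metrics, ['ga:fullReferrer']),
    ]
```

The returned list can be posted as the reportRequests value in the same way as before.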


There is a ready-to-use GoogleServiceAccount implementation in loginpass.
