Introduction
As off-premise storage solutions grow, the need to write back to cloud systems becomes more and more common. In this example, we are going to write back to Google BigQuery from Spotfire using a Python data function. We need to set up the authentication part on the Google Cloud Platform for Spotfire to be able to write back. The same process applies when writing back to any other Google Cloud Platform system, such as Google Drive.
BigQuery is a fully managed enterprise data warehouse that helps you manage and analyze your data with built-in features like machine learning, geospatial analysis, and business intelligence. BigQuery is a REST-based web service that allows you to run complex analytical SQL-based queries over large datasets.
Set up Google Cloud Platform Authentication
The authentication setup is very similar to the one described in Write back to Google Sheets from TIBCO Spotfire using Python Data Function. To set up Google Cloud Platform authentication, we need a project and a service account that can make calls to the Google BigQuery API. Calling the BigQuery API instead of the Sheets API is the key difference from the Google Sheets write-back.
Create a new project
Create a new project if you have not already done so. A Google Cloud Platform project consists of a set of users, APIs, billing, authentication, resources, and more.
Go to console.cloud.google.com, sign in with your Google account, and create a new project from the project dropdown at the top of the page.
Enable the Google BigQuery API
API stands for Application Programming Interface. Google Cloud Platform exposes many APIs, so for this case we need to enable the Google BigQuery API.
From the main menu on the left, choose APIs & Services, then select Enabled APIs & services and search for the BigQuery API.
Set up a service account
A service account is a special type of account intended to represent a non-human user that needs to authenticate and be authorized to access Google APIs. Go to IAM & Admin to set up a service account. If you have not already done so, create a service account that will work with this API through the + Create credentials shortcut. You can always go to IAM & Admin > Service Accounts from the main menu and create one. Be careful about the role given to this account: the Owner role allows access to most Google Cloud resources.
Create authentication keys
Google Cloud APIs use the OAuth 2.0 protocol for authenticating both user accounts and service accounts. The OAuth 2.0 authentication process identifies both the principal and the application. Many Google Cloud APIs also support anonymous access to public data using API keys; in this exercise, however, we will use a service account key, which applications use to make authorized API calls.
Form IAM & Admin - service Accounts, click on the service account you created and go to the keys tab, or go directly to Manage Keys from the Service account menu. Create a new JSON key to download a json file containing the credentials. We are going to need the contents of this file later. Keep this service account passwords ivate, personal, and in a safe place!
A service account JSON key file looks like this:
{ "type": "service_account", "project_id": "spotfire-cloud-writeback", "private_key_id": "jfielwssasdf99asdf02345d03453459dlldflsd", "private_key": "-----BEGIN PRIVATE KEY-----\nMIIEvAIBADANBgkqhkiG9w0BAQEFAASCBKYwggSiAgEAAoIBAQC7sd0q8HFh+I2T\nfpEN0pTMVRyTGWteZ/cSZR0uhMwIEUarlNWown9Y3QjLgxOYCkVGMO3726ewm3gm\nWxu4RZdDP984p6WQAsgPKbed+X3ahmrbmtxRBAEh5JZJgTeOl8wP/GG3/jsyEAvL\nr1WTO3Sk/ysRraOYq0KkMvI0VKTbOayPHuJ4QEe4AKlwTqZdYYtPU7Eq3+h+XNgY\nhB3L51mVe5f15/l/yfeS2YaAMiQqHypTDC6fvXPSkCBrv2A\nSssy0VNATwKBgEg5KZysUDKPinYMS950HBdqAaCB2WPd64Dh97hu05O0EUuqI0W4\nmwYhTJOaDqcmYiaqWc6d65nkvs/fjgquf9EmI+GTusT4uB/pM3XMKNQR6edNfNHE\nDT07cq3JUhNyMK+j2oXr+BLscqBhMK7vxUWNztbETDEDteorK3Y9dLAtAoGAauHh\n45CQNspuDixTiL4s3eI29yWOw03MNVFRA9sqktRh/Rwn4boCpb3DHkbg9gTzVVno\n8gv6KPqc35KrHsYxN4sV3oQuwvYhjTsdQp3wq3+qSGdMdqIC5Lu1rWpEzq94sm+W\n8VAzS6v1jXhwgjY2Btrs9Q9AG27s9klx6IAyuZUCgYB7GYPp08b+86wDoPggTNRi\nweE5eylpc+YBLFPZ0DCErjMyOctFFrHk8l+8T8OQzgpXO3udw3NYRH4qeYd3+ZN3\nSnuvaXjNzdht2olixeqqcadHX5CRngkmpKIT/EexkI+uLhvVgbAv/te+dklAbaSI\n5DxQKriHclJOAlui0XyNEg==\n-----END PRIVATE KEY-----\n", "client_email": "example@spotfire-cloud-writeback.iam.gserviceaccount.com", "client_id": "112233445566778899001", "auth_uri": "https://accounts.google.com/o/oauth2/auth"", "token_uri": "https://oauth2.googleapis.com/token"", "auth_provider_x509_cert_url": "https://www.googleapis.com/oauth2/v1/certs"", "client_x509_cert_url": "https://www.googleapis.com/robot/v1/metadata/x509/cacaca%40spotfire-cloud-writeback.iam.gserviceaccount.com"" }
Set up billing
Google BigQuery has a free tier, but it requires billing to be set up. The first 10 GB of storage per month are free.
Create a Dataset and Table
Datasets are top-level containers that are used to organize and control access to your tables and views. A table or view must belong to a dataset, so you need to create at least one dataset before loading data into BigQuery.
Create a Dataset
From the GCP main menu on the left, select BigQuery and create a new dataset in the project you created.
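The dataset can also be created from Python with the BigQuery client library. A minimal sketch, reusing the placeholder key file and the dataset name used in this article; it assumes the service account has a role that permits creating datasets:

from google.cloud import bigquery
from google.oauth2 import service_account

# Placeholder key file path and dataset ID
credentials = service_account.Credentials.from_service_account_file("service_account.json")
client = bigquery.Client(credentials=credentials, project=credentials.project_id)

dataset = bigquery.Dataset("spotfire-cloud-writeback.spotfire_dataset")
dataset.location = "US"  # choose the region that suits you
dataset = client.create_dataset(dataset, exists_ok=True)  # no error if it already exists
print("Created dataset {}".format(dataset.dataset_id))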
Create a dataset table
Create a table in the dataset by clicking on the dataset's contextual menu. Select the project and give the table a name. Leave the table type as native and add 3 columns to the table by editing its schema. The example data function below expects columns named date, comment, and value.
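Alternatively, the table and its schema can be created from Python. This sketch uses the column names expected by the data function script below (date, comment, value); the column types are one reasonable choice, not the only one:

from google.cloud import bigquery
from google.oauth2 import service_account

# Placeholder key file path and fully qualified table ID
credentials = service_account.Credentials.from_service_account_file("service_account.json")
client = bigquery.Client(credentials=credentials, project=credentials.project_id)

schema = [
    bigquery.SchemaField("date", "TIMESTAMP"),
    bigquery.SchemaField("comment", "STRING"),
    bigquery.SchemaField("value", "FLOAT"),
]
table = bigquery.Table("spotfire-cloud-writeback.spotfire_dataset.spotfire_table", schema=schema)
table = client.create_table(table, exists_ok=True)  # no error if it already exists
print("Created table {}".format(table.table_id))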
Share the dataset with the service account
All that is left on the Google side is to share the dataset with the service account email. The service account email can be found in the service account JSON key file and looks like this: example@spotfire-cloud-writeback.iam.gserviceaccount.com
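Sharing is easiest in the BigQuery console, but for reference, here is a sketch of the same grant done from Python. It must run as a principal that already controls the dataset (it relies on Application Default Credentials); the WRITER role is a minimal choice for inserting rows, and all names are the placeholders used in this article:

from google.cloud import bigquery

# Run as a user who owns the dataset; uses Application Default Credentials
client = bigquery.Client(project="spotfire-cloud-writeback")
dataset = client.get_dataset("spotfire-cloud-writeback.spotfire_dataset")

entries = list(dataset.access_entries)
entries.append(bigquery.AccessEntry(
    role="WRITER",  # enough to insert rows into the dataset's tables
    entity_type="userByEmail",
    entity_id="example@spotfire-cloud-writeback.iam.gserviceaccount.com",
))
dataset.access_entries = entries
client.update_dataset(dataset, ["access_entries"])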
Configure Spotfire to Write Back to Google BigQuery
Now that we have authentication and a place to store our data in Google BigQuery, the final step is to create a Python data function to send data back to our BigQuery table.
Install Python packages
The required packages for the data function to work are google-auth and google-cloud-bigquery. google-auth provides the service account credentials that let Spotfire authenticate against Google Cloud, and google-cloud-bigquery is the Python client library for working with Google BigQuery.
These packages can be installed using Python Tools from the TIBCO Spotfire client's Tools menu, or with pip (pip install google-auth google-cloud-bigquery) if you manage the Python interpreter directly.
Register a new data function
Data functions are the Spotfire way of letting advanced analysts, statisticians, or mathematicians enhance Spotfire by creating scripts that can perform pretty much any type of calculation and return the results to a Spotfire analysis.
In TIBCO Spotfire, on the menu bar, select Tools > Register data functions, select Python script, and paste the code below into the script tab. Run the data function and check that it does not throw any errors. Check for errors with the notifications icon: if there is some trouble with the loaded data, with a script or data function, or with a calculation in the analysis, a warning indication is displayed on the notifications icon on the menu bar.
Edit Data Function
The rest of the script performs authentication and inserts a row of data. Many other operations can be done with different methods, but to keep things simple, the code has the minimal number of lines required to write back to Google BigQuery.
Do not continue to edit the code; instead, close the Register Data Function dialog. Click No to embed the data function in the analysis. You can always save it to the library later.
From Data > Data function properties select the script and edit by pasting the rest of the code. Just replace the keys variable with the content of the JSON key file
from google.cloud import bigquery
from google.oauth2 import service_account
import datetime

#authentication stuff: replace keys with the contents of your JSON key file
keys = {
  "type": "service_account",
  "project_id": "spotfire-cloud-writeback",
  "private_key_id": "463a3246b97bd9f364d9efb7d334df3a9fbf195d",
  "private_key": "-----BEGIN PRIVATE KEY-----\neA..........\n",
  "client_email": "pythongsheets@spotfire-cloud-writeback.iam.gserviceaccount.com",
  "client_id": "112223334444555556789",
  "auth_uri": "https://accounts.google.com/o/oauth2/auth",
  "token_uri": "https://oauth2.googleapis.com/token",
  "auth_provider_x509_cert_url": "https://www.googleapis.com/oauth2/v1/certs",
  "client_x509_cert_url": "https://www.googleapis.com/robot/v1/metadata/x509/pythongsheets%40spotfire-cloud-writeback.iam.gserviceaccount.com"
}
scope = ["https://www.googleapis.com/auth/cloud-platform"]
credentials = service_account.Credentials.from_service_account_info(keys, scopes=scope)

#connection
client = bigquery.Client(credentials=credentials, project=credentials.project_id)
table_id = "spotfire-cloud-writeback.spotfire_dataset.spotfire_table"  #"your-project.your_dataset.your_table"

#prepare rows to insert (comment and val are script input parameters)
tstamp = str(datetime.datetime.now())
#comment = "Hello GBQ!"
#val = 5
rows_to_insert = [
    {u"date": tstamp, u"comment": comment, u"value": val},
]

#insert rows
errors = client.insert_rows_json(
    table_id, rows_to_insert  # optionally: row_ids=[None] * len(rows_to_insert)
)

#warn for errors
if errors == []:
    print("New rows have been added ☺")
else:
    print("☻ Encountered errors while inserting rows: {}".format(errors))
Configure Data Function parameters
The script does not return any values at this point, but input is required: the comment and val variables can be mapped to document properties.
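When testing the script outside Spotfire, the two inputs can simply be assigned sample values, matching the commented-out lines in the script (the values here are hypothetical):

# In Spotfire, comment and val arrive as data function input parameters mapped
# to document properties. For a standalone test, assign sample values instead:
comment = "Hello GBQ!"
val = 5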
Read back from Google BigQuery
If you do not have a shared connection that includes the BigQuery data that you want to analyze, you can create a new connection and make your own data selection from the BigQuery data you have access to. Use the + flyout, click Connect to, and select Google BigQuery. Press F1 for detailed instructions.
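If you want to verify the write-back from Python rather than through the connector, a small query sketch (the key file path and table name are the placeholders used throughout this article):

from google.cloud import bigquery
from google.oauth2 import service_account

# Placeholder key file path and table name
credentials = service_account.Credentials.from_service_account_file("service_account.json")
client = bigquery.Client(credentials=credentials, project=credentials.project_id)

query = """
    SELECT date, comment, value
    FROM `spotfire-cloud-writeback.spotfire_dataset.spotfire_table`
    ORDER BY date DESC
    LIMIT 10
"""
for row in client.query(query).result():  # waits for the query job to finish
    print(row["date"], row["comment"], row["value"])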
See also
- Writing to files and databases from Spotfire
- Write back to Google Sheets from TIBCO Spotfire using Python Data Function