Write back to Google Big Query from Spotfire using Python Data Function


    Introduction

    As off-premise storage solutions grow, the need to write back to cloud systems is becoming more and more common. In this example, we are going to write back to Google Big Query from Spotfire using a Python Data Function. We need to set up authentication on the Google Cloud Platform for Spotfire to be able to write back. The same process applies when writing back to any other Google Cloud Platform system, such as Google Drive.

    BigQuery is a fully managed enterprise data warehouse that helps you manage and analyze your data with built-in features like machine learning, geospatial analysis, and business intelligence. BigQuery is a REST-based web service that allows you to run complex analytical SQL-based queries over large sets of data.

     


     

    Setup Google Cloud Platform Authentication

    The authentication setup is very similar to the one in Write back to Google Sheets from TIBCO Spotfire using Python Data Function. To set up Google Cloud Platform authentication, we need a project and a service account that can make calls to the Google Big Query API. Calling the Google Big Query API is the key difference from writing back to Google Sheets.

    Create a new project

    Create a new project if you have not already done so. A Google Cloud Platform project consists of a set of users, APIs, billing, authentication, resources, and more.

    Go to console.cloud.google.com, sign in with your Google account, and create a new project by clicking the top dropdown.


    Enable the Google Big Query API

    An API is an Application Programming Interface. Google Cloud Platform has many of these, so for this case we need to enable the Google Big Query API.

    From the main menu on the left, choose APIs & Services, then select Enabled APIs & services and search for the Google Big Query API.


    Setup a service account

    A service account is a special type of account intended to represent a non-human user that needs to authenticate and be authorized to access Google APIs. Go to IAM & Admin to set up a service account. If you have not already done so, create a service account that will work with this API through the + create credentials shortcut. You can always go to IAM & Admin > Service Accounts from the main menu and create one. Be careful about the role given to this account: the Owner role allows access to most Google Cloud resources.


    Create authentication keys

    Google Cloud APIs use the OAuth 2.0 protocol for authenticating both user accounts and service accounts. The OAuth 2.0 authentication process determines both the principal and the application. Most Google Cloud APIs also support anonymous access to public data, but in this exercise we will be using a service account key, which the application uses to make authorized API calls.

    From IAM & Admin > Service Accounts, click on the service account you created and go to the Keys tab, or go directly to Manage Keys from the service account menu. Create a new JSON key to download a JSON file containing the credentials. We are going to need the contents of this file later. Keep this service account key private, personal, and in a safe place!

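Before wiring the key into Spotfire, it can be useful to sanity-check that the downloaded file has everything the BigQuery client will need. This is a minimal sketch, not part of the article's workflow; the sample dict mirrors the key file shown above and all values are fake.

```python
# Hypothetical helper: report which required service-account key fields are missing.
REQUIRED_FIELDS = {"type", "project_id", "private_key", "client_email", "token_uri"}

def missing_fields(key: dict) -> list:
    """Return a sorted list of required fields absent from the key dict."""
    return sorted(REQUIRED_FIELDS - key.keys())

sample_key = {
    "type": "service_account",
    "project_id": "spotfire-cloud-writeback",
    "private_key": "-----BEGIN PRIVATE KEY-----\n...\n-----END PRIVATE KEY-----\n",
    "client_email": "example@spotfire-cloud-writeback.iam.gserviceaccount.com",
    "token_uri": "https://oauth2.googleapis.com/token",
}

print(missing_fields(sample_key))  # -> []
```

An empty list means the key covers the fields used later by service_account.Credentials.from_service_account_info.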

    A service account JSON key file looks like this:

    {
      "type": "service_account",
      "project_id": "spotfire-cloud-writeback",
      "private_key_id": "jfielwssasdf99asdf02345d03453459dlldflsd",
      "private_key": "-----BEGIN PRIVATE KEY-----\nMIIEvAIBADANBgkqhkiG9w0BAQEFAASCBKYwggSiAgEAAoIBAQC7sd0q8HFh+I2T\nfpEN0pTMVRyTGWteZ/cSZR0uhMwIEUarlNWown9Y3QjLgxOYCkVGMO3726ewm3gm\nWxu4RZdDP984p6WQAsgPKbed+X3ahmrbmtxRBAEh5JZJgTeOl8wP/GG3/jsyEAvL\nr1WTO3Sk/ysRraOYq0KkMvI0VKTbOayPHuJ4QEe4AKlwTqZdYYtPU7Eq3+h+XNgY\nhB3L51mVe5f15/l/yfeS2YaAMiQqHypTDC6fvXPSkCBrv2A\nSssy0VNATwKBgEg5KZysUDKPinYMS950HBdqAaCB2WPd64Dh97hu05O0EUuqI0W4\nmwYhTJOaDqcmYiaqWc6d65nkvs/fjgquf9EmI+GTusT4uB/pM3XMKNQR6edNfNHE\nDT07cq3JUhNyMK+j2oXr+BLscqBhMK7vxUWNztbETDEDteorK3Y9dLAtAoGAauHh\n45CQNspuDixTiL4s3eI29yWOw03MNVFRA9sqktRh/Rwn4boCpb3DHkbg9gTzVVno\n8gv6KPqc35KrHsYxN4sV3oQuwvYhjTsdQp3wq3+qSGdMdqIC5Lu1rWpEzq94sm+W\n8VAzS6v1jXhwgjY2Btrs9Q9AG27s9klx6IAyuZUCgYB7GYPp08b+86wDoPggTNRi\nweE5eylpc+YBLFPZ0DCErjMyOctFFrHk8l+8T8OQzgpXO3udw3NYRH4qeYd3+ZN3\nSnuvaXjNzdht2olixeqqcadHX5CRngkmpKIT/EexkI+uLhvVgbAv/te+dklAbaSI\n5DxQKriHclJOAlui0XyNEg==\n-----END PRIVATE KEY-----\n",
      "client_email": "example@spotfire-cloud-writeback.iam.gserviceaccount.com",
      "client_id": "112233445566778899001",
      "auth_uri": "https://accounts.google.com/o/oauth2/auth",
      "token_uri": "https://oauth2.googleapis.com/token",
      "auth_provider_x509_cert_url": "https://www.googleapis.com/oauth2/v1/certs",
      "client_x509_cert_url": "https://www.googleapis.com/robot/v1/metadata/x509/cacaca%40spotfire-cloud-writeback.iam.gserviceaccount.com"
    }
     

    Setup billing

    Google Big Query has a free tier, but it still requires billing to be set up. The first 10 GB of data per month are free.

    Create a Dataset and Table 

    Datasets are top-level containers that are used to organize and control access to your tables and views. A table or view must belong to a dataset, so you need to create at least one dataset before loading data into BigQuery.

    Create a Dataset

    From the GCP main menu on the left, select Google Big Query and create a new dataset in the project you created.


    Create a dataset table

    Create a table for the dataset by clicking on the dataset contextual menu. Select the project and give the table a name. Leave the table type as native and add three columns to the table by editing its schema.

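As a sketch of what the schema edit amounts to, the three columns can be written down in BigQuery's JSON schema format. The column names and types here are assumptions chosen to match the write-back script later in the article (date, comment, value); adjust them to suit your own data.

```python
# Hypothetical schema for the dataset table, in BigQuery's JSON schema format.
schema = [
    {"name": "date", "type": "STRING", "mode": "NULLABLE"},
    {"name": "comment", "type": "STRING", "mode": "NULLABLE"},
    {"name": "value", "type": "FLOAT", "mode": "NULLABLE"},
]

print([column["name"] for column in schema])  # -> ['date', 'comment', 'value']
```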

    Share dataset with Service account

    All that is left on the Google side is to share the dataset with the service account email. The service account email can be found in the Service Account JSON key file and looks like this: example@spotfire-cloud-writeback.iam.gserviceaccount.com

     


    Configure Spotfire to Write back to Google Big Query

    Now that we have the authentication and a place to store our data in Google Big Query, the final step is to create a Python Data Function to send data back to our Google Big Query table.

    Install python packages

    The required packages for the data function to work are oauth2client and google-cloud-bigquery. oauth2client is a client library for OAuth 2.0 that delegates user authentication to the service hosting a user account, allowing Spotfire to use the Google Big Query API. The google-cloud-bigquery Python package provides a simple interface for working with Google Big Query.

    These packages can be installed using Python Tools in the TIBCO Spotfire Tools menu.

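After installing, you can confirm from any Python console that the two imports the data function relies on resolve. This is an optional sanity check, not part of the article's steps; the helper name is hypothetical.

```python
import importlib.util

def available(module: str) -> bool:
    """Return True if the named module can be found by the interpreter."""
    try:
        return importlib.util.find_spec(module) is not None
    except ModuleNotFoundError:
        return False

# The data function imports google.cloud.bigquery and google.oauth2.service_account.
for mod in ("google.cloud.bigquery", "google.oauth2.service_account"):
    print(mod, "available" if available(mod) else "MISSING")
```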

    Register a new data function

    Data functions are the Spotfire way of letting advanced analysts, statisticians, or mathematicians enhance Spotfire by creating scripts that can perform almost any type of calculation and return the results to a Spotfire analysis.

    In TIBCO Spotfire, on the menu bar, select Tools > Register data functions, select Python script, and paste the code below into the script tab. Run the data function and check that it does not throw any errors. Check for errors with the notifications icon: if there is some trouble with the loaded data, with a script or data function, or with a calculation in the analysis, a warning indication is displayed on the notifications icon on the menu bar.


    Edit Data Function

    The rest of the script performs authentication and inserts a row of data. Many other operations can be done with different methods, but to keep things simple, the code has the minimal number of lines required to write back to Google Big Query.

    Do not continue to edit the code but rather close the Register Data Function dialog. Click No to embed the data function in the analysis. You can always save it to the library later.


    From Data > Data function properties, select the script and edit it by pasting the rest of the code. Just replace the keys variable with the content of the JSON key file.


    from google.cloud import bigquery
    from google.oauth2 import service_account

    # Authentication: replace with the contents of your service account JSON key file
    keys = {
        "type": "service_account",
        "project_id": "spotfire-cloud-writeback",
        "private_key_id": "463a3246b97bd9f364d9efb7d334df3a9fbf195d",
        "private_key": "-----BEGIN PRIVATE KEY-----\neA..........\n",
        "client_email": "pythongsheets@spotfire-cloud-writeback.iam.gserviceaccount.com",
        "client_id": "112223334444555556789",
        "auth_uri": "https://accounts.google.com/o/oauth2/auth",
        "token_uri": "https://oauth2.googleapis.com/token",
        "auth_provider_x509_cert_url": "https://www.googleapis.com/oauth2/v1/certs",
        "client_x509_cert_url": "https://www.googleapis.com/robot/v1/metadata/x509/pythongsheets%40spotfire-cloud-writeback.iam.gserviceaccount.com"
    }
    scope = ["https://www.googleapis.com/auth/cloud-platform"]

    credentials = service_account.Credentials.from_service_account_info(keys)

    # Connection
    client = bigquery.Client(credentials=credentials, project=credentials.project_id)
    table_id = "spotfire-cloud-writeback.spotfire_dataset.spotfire_table"  # "your-project.your_dataset.your_table"

    # Prepare rows to insert (comment and val are data function input parameters)
    import datetime
    tstamp = str(datetime.datetime.now())
    #comment = "Hello GBQ!"
    #val = 5

    rows_to_insert = [
        {u"date": tstamp, u"comment": comment, u"value": val},
    ]

    # Insert rows
    errors = client.insert_rows_json(
    #    table_id, rows_to_insert, row_ids=[None] * len(rows_to_insert)
        table_id, rows_to_insert
    )

    # Warn on errors
    if errors == []:
        print("New rows have been added")
    else:
        print("Encountered errors while inserting rows: {}".format(errors))
     

     

    Configure Data Function parameters

    The script does not return any values at this point, but input is required. The comment and val variables can be mapped to document properties.

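As a sketch of what the mapping does: when the data function runs, Spotfire injects the mapped document property values into the comment and val variables, and the script turns them into one timestamped JSON row. The helper below is illustrative only and is not part of the script.

```python
import datetime

def build_rows(comment, val):
    """Build the single JSON row the data function inserts,
    timestamping it the same way the script does."""
    tstamp = str(datetime.datetime.now())
    return [{"date": tstamp, "comment": comment, "value": val}]

rows = build_rows("Hello GBQ!", 5)
print(rows[0]["comment"], rows[0]["value"])  # -> Hello GBQ! 5
```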

    Read back from Google Big Query

    If you do not have a shared connection that includes the BigQuery data that you want to analyze, you can create a new connection. Then you can make your own data selection from the BigQuery data you have access to. Use the + flyout, click Connect to, and select Google Big Query. Press F1 for detailed instructions.

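If you prefer to verify the write-back from Python instead of the built-in connector, the same authenticated client from the data function can run a query. The helper below only composes the SQL; the table_id is the example one from the script, and the commented lines show how it would be executed with the client.

```python
def read_back_sql(table_id: str, limit: int = 10) -> str:
    """Compose SQL that fetches the most recent rows written back."""
    return (f"SELECT date, comment, value FROM `{table_id}` "
            f"ORDER BY date DESC LIMIT {limit}")

sql = read_back_sql("spotfire-cloud-writeback.spotfire_dataset.spotfire_table")

# With the authenticated client from the write-back script:
# for row in client.query(sql).result():
#     print(row)
```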


