  • Image Recognition in Spotfire® using Python and AWS


    Image recognition is a powerful tool used in countless industries as the machine learning explosion continues. It has applications in health and medicine to help scan medical imagery, in manufacturing to assess quality and detect errors, in maintenance to detect failing equipment, and in intelligence and security to identify people or objects. It can even be used on satellite imagery to detect environmental issues or ship movements. With this in mind, I wanted to experiment with whether Spotfire® could be used to run image recognition models while providing a visual and interactive way to build, assess, or simply review model results. In this blog, I will tell the story of this work.

    This article is kept for reference - functionality in Spotfire has evolved considerably since this article was written. Please refer to the Spotfire Data Function Library article for the latest resources for using Spotfire and Python.

    image-recognition-aws2-thumbnail_0.png.dd9231a61f9d5e0e4fbc6c392156d580.png

    As a data scientist and analyst for many years, I have largely dealt with structured, rectangular data in ever-increasing sizes. Historically, processing nontraditional data such as images and text was a considerable task requiring a great deal of computational power and lengthy coding or development times. However, as analytical and modeling tools have greatly improved, coupled with the advent of cloud-based computing and services, the opportunities for data science are much greater. We can now run complex pre-trained models backed by massive computational power, with minimal coding or infrastructure required.

    With the main cloud providers all offering image recognition services, I wanted to use their existing functionality in Spotfire® if possible. This would show not only that Spotfire can be used to run image recognition models but also that it can connect to cloud services.

    irm1.png.03f36a93859eced21a0b503e9bc778e7.png

    Summary of cloud providers' image services

    To build this image recognition Spotfire tool I would need to be able to do the following:

    1. Read images into Spotfire and extract metadata, e.g. image name, dimensions, etc.
    2. Submit images to a cloud service and handle the returned results
    3. Visualize the results in Spotfire

    To achieve this I would use Spotfire's extensive access to APIs and languages such as IronPython/C#, Python, and JavaScript.

    Read images into Spotfire and extract metadata

    To read the images into Spotfire I used a single IronPython script. IronPython gives you access not only to Spotfire's own API but also to the C# libraries available in .NET. This opens up huge potential for Spotfire users, who can perform tasks such as interacting with file systems and controlling Spotfire. Using C# libraries such as Image and various IO functions, I can read all images in a directory specified by the user and extract metadata, including location information. This can then be displayed in Spotfire:

    irm2.png.444e02faa235c0d6c2073d785dd306c7.png

    Example of reading images in a directory into Spotfire
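
    The directory scan can be sketched in plain Python as well. This is not the IronPython/.NET script described above, only a minimal standard-library illustration of the same idea: list the image files in a folder and collect one metadata row per file. The function names and row fields are assumptions made for this example, and dimensions are only read for PNG files (from the fixed-position IHDR header) to keep the sketch dependency-free.

```python
import struct
from datetime import datetime
from pathlib import Path

PNG_SIGNATURE = b"\x89PNG\r\n\x1a\n"


def png_dimensions(path):
    """Return (width, height) read from a PNG header, or None if not a PNG."""
    with open(path, "rb") as handle:
        header = handle.read(24)
    # A PNG starts with an 8-byte signature followed by the IHDR chunk,
    # whose first two fields are the big-endian width and height.
    if len(header) < 24 or header[:8] != PNG_SIGNATURE or header[12:16] != b"IHDR":
        return None
    width, height = struct.unpack(">II", header[16:24])
    return width, height


def list_images(directory, extensions=(".png", ".jpg", ".jpeg")):
    """Build one metadata row per image file found directly in `directory`."""
    rows = []
    for path in sorted(Path(directory).iterdir()):
        if not path.is_file() or path.suffix.lower() not in extensions:
            continue
        stat = path.stat()
        size = png_dimensions(path)  # stays None for non-PNG files in this sketch
        rows.append({
            "name": path.name,
            "path": str(path),
            "bytes": stat.st_size,
            "modified": datetime.fromtimestamp(stat.st_mtime),
            "width": size[0] if size else None,
            "height": size[1] if size else None,
        })
    return rows
```

    A list of rows like this maps naturally onto a Spotfire data table, with one row per image and one column per metadata field.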

    As a later extension, I added an option to specify an S3 bucket from which images are read and downloaded to Spotfire, showing that we can also interact with AWS storage services. This was done using Spotfire's Python data function. You can see the code for this in this article. More on this below.

    irm3.png.b21dfeefe09f0c5f727bd062fe0fc050.png
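
    For readers who want a feel for the S3 step, here is a minimal sketch using Boto3's S3 client. It is not the code from the linked article; the function names, the bucket and prefix arguments, and the extension filter are assumptions made for illustration. The client is passed in as a parameter so the download logic can be tried with a stand-in object, and creating a real client requires Boto3 plus configured AWS credentials.

```python
from pathlib import Path


def make_s3_client():
    """Create a real S3 client (needs boto3 installed and AWS credentials configured)."""
    import boto3  # imported lazily so download_images can be tested without AWS
    return boto3.client("s3")


def download_images(client, bucket, prefix, target_dir,
                    extensions=(".png", ".jpg", ".jpeg")):
    """Download every image under `prefix` in `bucket`; return the local paths."""
    target = Path(target_dir)
    target.mkdir(parents=True, exist_ok=True)
    downloaded = []
    # list_objects_v2 returns results in pages, so use a paginator to
    # walk through all of them.
    paginator = client.get_paginator("list_objects_v2")
    for page in paginator.paginate(Bucket=bucket, Prefix=prefix):
        for item in page.get("Contents", []):
            key = item["Key"]
            if not key.lower().endswith(extensions):
                continue
            local_path = target / Path(key).name
            client.download_file(bucket, key, str(local_path))
            downloaded.append(str(local_path))
    return downloaded
```

    With credentials in place, a call such as `download_images(make_s3_client(), "my-bucket", "images/", "downloads")` would fetch the images, where the bucket and folder names are placeholders.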

    Submit images to a cloud service and handle the returned results

    I chose to use Amazon's (AWS) Rekognition service as I had previously tried it through the web interface and knew it could be called through Python or using the AWS CLI executable (note that you must first configure the AWS CLI tool before these options will work). It is possible to call executables through Spotfire's IronPython functions, but a much cleaner way is to use the Python data function. This allows you to use any Python libraries, as well as pass data between Spotfire and Python.

    Amazon provides a Python library called Boto3, which covers the vast majority of their services. Calling it from Spotfire opens up AWS's huge functionality to Spotfire. I followed this guide from AWS to get the correct Python code and implemented it in Spotfire. This meant I could submit one or many images live from Spotfire (as chosen by the user) to Amazon's Rekognition model. The Python to loop over rows of data and pass the images to AWS is relatively simple:

    irm4.png.aeddab547d29178696d9a1ce19454de3.png

    Example of Python code that loops over images and calls the AWS service
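
    Since the code above is only available as a screenshot, here is a rough text sketch of what such a loop can look like. It is not a transcription of the original; the function names and default thresholds are assumptions. It relies on Boto3's `detect_labels` call, which accepts the raw image bytes along with optional `MaxLabels` and `MinConfidence` arguments. The client is passed in so the loop can be exercised without AWS access.

```python
def make_rekognition_client(region_name=None):
    """Create a real Rekognition client (needs boto3 and configured AWS credentials)."""
    import boto3  # imported lazily so the loop below can be tested without AWS
    return boto3.client("rekognition", region_name=region_name)


def detect_labels_for_images(client, image_paths, max_labels=10, min_confidence=70.0):
    """Send each image to DetectLabels and return {image path: raw response}."""
    responses = {}
    for path in image_paths:
        # Images are sent as raw bytes; the service also accepts an S3 reference.
        with open(path, "rb") as handle:
            image_bytes = handle.read()
        responses[path] = client.detect_labels(
            Image={"Bytes": image_bytes},
            MaxLabels=max_labels,
            MinConfidence=min_confidence,
        )
    return responses
```

    In a Spotfire Python data function, `image_paths` would typically come from a column of the input table, for example the paths of the rows the user has marked.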

    The output from the Rekognition service is a JSON object containing all the identified labels (i.e. objects), the confidence of each detection, and the coordinates of any detections, the latter being provided only in certain circumstances. The Spotfire Python data function was then written to flatten this JSON data into two tables:

    1. A table of labels and confidence in each image  
    2. A table of coordinates of any bounding boxes for labels found in each image
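
    The flattening into these two tables can be sketched as follows. This is an illustration rather than the original data function: the output column names are assumptions, and plain lists of dictionaries stand in for Spotfire tables. The response fields used (`Labels`, `Name`, `Confidence`, `Instances`, `BoundingBox`) follow the shape of a DetectLabels response, where `Instances` is empty for labels that have no bounding boxes.

```python
def flatten_responses(responses):
    """Split {image: DetectLabels response} into a labels table and a boxes table."""
    labels, boxes = [], []
    for image, response in responses.items():
        for label in response.get("Labels", []):
            # Table 1: one row per label found in an image.
            labels.append({
                "image": image,
                "label": label["Name"],
                "confidence": label["Confidence"],
            })
            # Table 2: one row per bounding box; many labels have none.
            for instance in label.get("Instances", []):
                box = instance["BoundingBox"]
                boxes.append({
                    "image": image,
                    "label": label["Name"],
                    "confidence": instance.get("Confidence"),
                    "left": box["Left"],
                    "top": box["Top"],
                    "width": box["Width"],
                    "height": box["Height"],
                })
    return labels, boxes
```

    Both lists can be turned into data frames (for instance with `pandas.DataFrame(labels)`) before being returned to Spotfire as output tables.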

    We can now display the results in Spotfire from AWS.

    Visualize the results in Spotfire

    In Spotfire, I used map charts and bar charts to display the data returned. Spotfire's map charts can be used to display images rather than actual geographical layers, and you can plot data on top of these using coordinates relative to the image dimensions. This means that, using the bounding box data returned from AWS, we can not only view the image but also display exactly where labels were found. Here is my output in Spotfire:

    irm5.thumb.png.f5944f679534a59d460bcbba93c204ee.png

    Example of using a map chart to display 9 bounding boxes of persons identified in the image
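
    To plot a box on top of an image, the box coordinates need to be expressed in the same units as the image. Rekognition returns each bounding box as ratios of the overall image width and height, with `Top` measured from the top edge. The sketch below converts such a box to pixel corners. It assumes the chart's y-axis increases upwards from the bottom of the image, so the vertical values are flipped; that orientation, like the function name, is an assumption for this example and may need adjusting for a particular map chart setup.

```python
def box_to_pixels(box, image_width, image_height):
    """Convert a ratio-based bounding box to pixel corners with the y-axis pointing up."""
    x_min = box["Left"] * image_width
    x_max = (box["Left"] + box["Width"]) * image_width
    # "Top" is measured downwards from the top edge, so flip it for a
    # coordinate system whose origin is at the bottom-left of the image.
    y_max = (1.0 - box["Top"]) * image_height
    y_min = (1.0 - box["Top"] - box["Height"]) * image_height
    return {"x_min": x_min, "y_min": y_min, "x_max": x_max, "y_max": y_max}


# A box covering the middle half of the width, starting halfway down the image.
corners = box_to_pixels(
    {"Left": 0.25, "Top": 0.5, "Width": 0.5, "Height": 0.25},
    image_width=800,
    image_height=600,
)
print(corners)
```

    The four corner values can then be used to draw the rectangle as a layer over the background image.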

    To make the image recognition interactive in this Spotfire application, I used markings, i.e. the ability to select data points by clicking on them. This lets the user control which images are displayed in the map chart, as shown above, and which are sent to the AWS image recognition service. This interaction is possible because Spotfire lets you trigger a Python data function based on markings. You can then mark (click on) any row or bar in the charts on the right that has bounding box data, and the corresponding boxes will be highlighted as blue rectangles on the map chart. I also used an IronPython script to control the configuration of the map chart so that the correct background image is used and the image dimensions are set.

    irm6.png.efcc638d186569e629208d970893e29f.png

    Summary of the interactive process

    I now have a complete image recognition application running through Spotfire using Amazon's Rekognition service. However, since I have access to Python, there is nothing stopping me from implementing my own model, or using Python-based modeling libraries such as TensorFlow or Keras, should I wish to do so.

    My final step was to expand this to a passion of mine which is nature and the environment. I wanted to test whether I could analyze images from Explore.ORG's live bear cam at the Katmai river in Alaska. My idea was to produce a timeline of when bears appeared on camera and in what number. Here is the application I produced after capturing images at 10-minute intervals over an evening:

    irm7.thumb.png.098f08dfef61c4ea0b0ad966c25fc04f.png

    Timeline of detected animals from the video feed

    Here I have used the date-created property of each image to produce the timeline shown in the bar charts (in 24-hour time, with hours and minutes). Each bar represents an animal type identified and how many were found. We can see it identified many bears and birds, with a distinct pattern for when more bears were on camera. We also see some unusual identifications of penguins, a cow, and a dog, showing that you must always test and understand a model's capabilities before using it in a real production environment! However, given that this is a generic image recognition service working on low-quality images, the results are still impressive.
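
    The counting behind a timeline like this can be sketched in a few lines. The input format here, pairs of an image's creation time and a detected label, is an assumption for illustration and not the structure used in the original application. Each detection is counted under its capture time (hours and minutes) and its label.

```python
from collections import Counter
from datetime import datetime


def timeline_counts(detections):
    """Count detections per (HH:MM capture time, label) from (timestamp, label) pairs."""
    counts = Counter()
    for timestamp, label in detections:
        counts[(timestamp.strftime("%H:%M"), label)] += 1
    return counts


# Made-up detections from two captures taken ten minutes apart.
example = [
    (datetime(2019, 8, 1, 18, 0), "Bear"),
    (datetime(2019, 8, 1, 18, 0), "Bear"),
    (datetime(2019, 8, 1, 18, 0), "Bird"),
    (datetime(2019, 8, 1, 18, 10), "Bear"),
]
for (time_of_day, label), count in sorted(timeline_counts(example).items()):
    print(time_of_day, label, count)
```

    Each (time, label) pair corresponds to one bar in the chart, with the count as its height.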

    I hope you have enjoyed this blog on how I was able to perform image recognition in Spotfire, making it an interactive and visual process while also showing that we can call out to cloud services such as AWS quickly and easily from Spotfire.

    You can watch a live and full explanation of how these examples work on 

    Colin Gray - Spotfire Data Science - Aug 2019.

