  • Spotfire Primer - Blog 1 - Practical Applications of Spotfire Visualizations


    This is the first in a series of blog posts covering various aspects of the book TIBCO Spotfire: A Comprehensive Primer, by Andrew Berridge.

    In this blog post, I'm going to introduce Chapter 5, "Practical Applications of Spotfire Visualizations". It's a guide to the visualization types in Spotfire, with explanations and examples of when to use each one and some common pitfalls to avoid. I particularly enjoyed writing this chapter because it gave me the opportunity to work with different visualizations in Spotfire and source examples from all sorts of datasets. It was great fun to deliberately produce some misleading/invalid bar charts to illustrate how NOT to use them. It was also very enjoyable to produce examples of the bar chart and scatter plot visualizations. It reminded me once again just how powerful and flexible these chart types are.

    The introduction to the chapter is:

     
    Spotfire is universally adaptable. There's usually a way to display any kind of data in a meaningful visualization in Spotfire. However, how do you know which type of visualization to use? How can you configure a visualization to best represent the data that you're working with? How can you get the fastest results and insights from the data? What should you not do with each visualization type?
     
    In this chapter, we will cover the following topics:
     
    • Some real-world examples of some common Spotfire visualization types
    • What to use each visualization for
    • The pros and cons of the visualization types
    • Some configuration hints and tips
    • Common pitfalls and things to avoid
     

    The chapter then goes on to discuss the different Spotfire visualizations, explaining the pros and cons of each in a concise, easy-to-follow manner. For example, here's the introduction to bar charts:

     
    Bar charts are one of the most useful and versatile visualization types in Spotfire. Let's go over them here:
     
    • Good for visualizing:
       Any type of data that is split into categories. Examples of categories include the following:
      • Product category
      • Sales region
      • Car make and model
    • Don't use for:
       Generally, visualizing continuous data on the x-axis is not recommended (as you will see in the following example), unless you are interested in the general shape or trend of the data.
    • Pros:
       Really easy to construct, configure, and interpret.
    • Cons:
       If you have lots and lots of categories, there simply isn't enough space on the categorical axis to show all the labels, so you will need to use techniques such as zoom sliders and hierarchical axis selectors. See Chapter 8, The World is Your Visualization, for more information on constructing hierarchies from axis selectors.
    • Summary: 
      Bar charts are the go-to visualizations in Spotfire. Think bar chart first!
     
    We typically use bar charts to represent numerical data that's split into categories. Bar charts are very useful for showing numerical values in an easy-to-consume way. Bar charts can also show several dimensions of data at the same time. You can choose to Color by values and trellis by columns, in order to produce an at-a-glance view of data that is both accurate and straightforward to interpret.
     
    There are some things to watch out for, though!
     

    So, what are these things to watch out for? Perhaps my favourite is being careful with summing averages! In my opinion, the Avg aggregation is generally far more applicable than the Sum aggregation. In fact, Spotfire X introduced a preference for setting the default aggregation method. You can find it under Tools | Options | Visualization:

    [Screenshot: the default aggregation preference under Tools | Options | Visualization]

    However, if you do set Avg as the aggregation for the values axis of a bar chart, then using stacked bars leads to an invalid visualization. Here's how the book explains it:

     
    ...we must be careful with summing averages.
     
    This is always something to be aware of. If you are using Avg as an aggregation function, then it probably doesn't make sense to use a stacked bar chart. Consider the following example:
     

    [Figure: stacked bar chart of Avg(price) from the Airbnb data, colored by Room Type]

     
    I've labeled the bars to explain why this visualization may be confusing. The preceding visualization shows the pricing of Airbnb data in a stacked bar chart. It is colored by Room Type. Notice how the bar segments show the Avg (price). That's what we want! However, if you look at the total bar height and the value axis (y-axis), you'll see that the averages of the segments in each bar are summed to give the total bar height. This doesn't make much sense. Summing the averages is invalid mathematically and misleading at best. It's far better to use a side-by-side, or 100%, bar chart:
     

    [Figure: the same Airbnb data redrawn without summing the averages]

     
    That's better! The chart is no longer misleading. I have left the labels on the bars, but it would probably be better to remove them for clarity. While we're on the subject, watch out when using 100% stacked bars; if you use this mode, you can only ever compare the bottom split of the categories directly, as they have a common baseline.
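
To make the arithmetic behind the warning concrete, here is a minimal Python sketch that runs outside Spotfire. The room types and prices are made-up illustrative numbers, not the book's Airbnb data, and the variable names are hypothetical. It compares what a stacked bar's total height would show (the sum of the per-segment averages) with the actual average price across all the listings in that bar.

```python
from statistics import mean

# Hypothetical nightly prices for listings that fall into one bar,
# grouped by room type (illustrative numbers, not the book's data).
prices_by_room_type = {
    "Entire home/apt": [150, 170, 160],
    "Private room": [60, 80],
    "Shared room": [30],
}

# What each stacked segment shows: Avg(price) per room type.
segment_averages = {
    room: mean(prices) for room, prices in prices_by_room_type.items()
}

# What the total height of the stacked bar shows: the sum of those averages.
stacked_bar_height = sum(segment_averages.values())

# The actual average price across every listing in the bar.
all_prices = [p for prices in prices_by_room_type.values() for p in prices]
overall_average = mean(all_prices)

print("Segment averages:", segment_averages)
print("Stacked bar height:", stacked_bar_height)
print("Overall average:", round(overall_average, 2))
```

With these numbers the stacked bar reaches 260, which is higher than any single listing's price, while the overall average is roughly 108. The total bar height therefore corresponds to no meaningful quantity, even though every individual segment is correct.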
     
    Before we leave this example, notice how the bars are nice and wide. This is because I have set the x-axis to be categorical. By default, Spotfire sees the number of bedrooms as a continuous variable, since it's an integer. 
     

    The chapter continues by exploring the bar chart visualization in more detail, showing how to work with continuous and categorical x-axes and how an integer represented as a categorical variable may lead to non-contiguous axis scales (with a suggestion as to how to avoid this).
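
As a rough illustration of the non-contiguous axis issue, the following Python sketch (again outside Spotfire, with hypothetical bedroom counts) contrasts the slots a categorical axis would have with the range a continuous axis would span. It only demonstrates the problem and is not the suggestion given in the book.

```python
from collections import Counter

# Hypothetical bedroom counts for a handful of listings (illustrative only).
bedrooms = [1, 1, 2, 2, 2, 3, 5, 8]

counts = Counter(bedrooms)

# A categorical axis gets one slot per distinct value that is present,
# so 3, 5 and 8 end up right next to each other.
categorical_axis = sorted(counts)

# A continuous axis spans the full numeric range, gaps included.
continuous_axis = list(range(min(bedrooms), max(bedrooms) + 1))

# Values that the categorical axis silently skips.
missing = [b for b in continuous_axis if b not in counts]

print(categorical_axis)  # [1, 2, 3, 5, 8]
print(missing)           # [4, 6, 7]
```

On the categorical axis the bars for 3, 5 and 8 bedrooms sit side by side, so a reader can easily miss that 4, 6 and 7 are absent from the data.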

    Highlighting the bar-chart examples is just a taster for the rest of the chapter! It goes on to discuss scatter plots, cross tables, box plots, line charts and more. The example of the box plot visualization is particularly interesting to me, as it uses statistics to identify outliers and spot trends in global infant mortality data.

    TIBCO Spotfire: A Comprehensive Primer - Second Edition, by Andrew Berridge, is a great starting place for further exploration of the topics covered here. Another related and very useful page on the community is the training page, which is a launchpad for all sorts of great stuff!

    This is the first part of a series of blog articles on the book, so watch this space for more excerpts and related concepts. I'll be picking various topics and discussing them, providing introductions in some cases and expanding on the book's content in others.

    Pick up a copy of the book from Amazon.

