Jump to content
  • Spotfire Primer - Blog 3 - OVER Expressions


    This is the third of a series of Blog posts showcasing concepts discussed in the book - TIBCO Spotfire, A Comprehensive Primer, by Andrew Berridge.

    In this post, I'll be looking at OVER expressions, perhaps one of the most powerful, but least understood features of Spotfire. The book covers OVER expressions in Chapter 7 - Source Data is Never Enough. It starts off:

     

     

    Slicing and dicing data using hierarchy nodes
     
    As we mentioned previously, Spotfire expressions do not work in the same way as Excel formulas. Expressions must work over the whole of the data that's currently in scope (be it the whole data table or a grouping within a visualization).
     
    Spotfire provides a set of OVER functions to allow you to aggregate data over the nodes of a hierarchy in a data table. At its simplest level, the OVER function is a means to define at what granularity you wish to aggregate values.
     

    To expand a little on some key statements - in most spreadsheet tools (such as Excel), you can reference individual cells within a worksheet. Spotfire is a little different in that data tables and the data shown in visualizations are a bit like database tables. You can only perform calculations and other operations over the whole dataset. 

    Aggregations are calculations that are applied to the dataset and the groupings of the data within them. For example, you might show total sales for each day in a bar chart. You'd apply the Sum aggregation on the values (y) axis and Spotfire will apply that Sum to all the sales data for each of the days shown on the categorical (x) axis.

    To take this a little further, when Spotfire calculates the heights for the bars, it's applying the OVER function in the background - the expression would be something like

     Sum([sales]) OVER ([Date])
     

     

    This would produce a bar for every day of the month. If we wanted to show sales by month, Spotfire might use the expression

     Sum([sales]) OVER (NavigatePeriod([Hierarchy.DateHierarchy],"Month",0,0)) 
     

     

    I'll explain a bit more about the NavigatePeriod method, but first let me discuss the concepts of nodes and hierarchies!

    The Spotfire help for some of the OVER functions talks about navigating different levels in the hierarchy. It also talks about different nodes within the hierarchies. These terms can be difficult to understand without some context!

    First of all - all data in Spotfire is hierarchical. Some example hierarchies could be date/time (Year, Month, Day, Hour, Minute, Second, etc.), or regional, e.g. Continent, Country, State, County, etc.. Each level of the hierarchy is more detailed than the last. When you have common values at a level in the hierarchy, these represent the nodes in the Spotfire data. So - 2007 would represent a single node at the highest level in the hierarchy. January 2017 would represent a node at the next level down. It's perhaps easiest to visualize this in tabular form - here's a cross table that shows the point (extracted from the book):

    pasted_image_0_11.png.278855ec8ebd090ccd7601185f17b142.png

    Each cell that you see in the cross table represents a node at some level in the hierarchy.

    Notice that in the last Spotfire expression I used above, I have used the Hierarchy keyword. What's that all about? It's a little-understood keyword in Spotfire custom expressions? Here's the expression again:

     Sum([sales]) OVER (NavigatePeriod([Hierarchy.DateHierarchy],"Month",0,0)) 
     

     

    In order to get this to work, I needed to create a custom date hierarchy, with Year, Month, Day, etc.. I just added a new hierarchy called DateHierarchy by choosing 'Add Hierarchy' from the Data menu in Spotfire. Hierarchy.DateHierarchy is how we refer to that custom hierarchy in the Spotfire expression language.

    The NavigatePeriod OVER method (function) sets the levels at the hierarchy in which to navigate. The first argument to the method is the name of the hierarchy. The second argument is the level at which to select from the hierarchy, and the last two arguments specify the level to navigate sideways in the hierarchy and the number of steps to navigate down. Sideways is the number in the hierarchy, e.g. the previous month or next month and down is the level in the hierarchy. So if, for example, I had chosen -1, 0 respectively, then the sum would be calculated for the previous month. If I had chosen 0, 1, then the sum would be calculated at the day level. 

    I?ve only given a brief taster here. The book goes on to explain that you can use another visualization to test that your OVER expression is correct. Here?s an example of where the book discusses this for the PreviousPeriod method, when looking at flight delay data:

    pasted_image_0_1_2.png.dcf4b2499bc74bc6dfdca36b8d5c14ee.png

    Another couple of tips from the book:

     
    Be careful when using OVER aggregations in calculated columns, you will most likely get duplicated values as results (which you will need to aggregate with Min, Max, First, and so on) when displaying them. However, they are extremely useful when building up complex calculation strategies.
     
    The aggregations in calculated columns will also NOT respond to filtering. Again, I make no apologies for repeating this fact. It's just so important! 
     
    If you're struggling with constructing a calculated column that uses OVER expressions, it can be helpful to build the expression (or parts of it) as a custom expression on a visualization axis and then translate it into a calculated column expression. For example, you'd replace [Axis.X] with the name of the column that contains the hierarchy you are working with.
     

    The other thing you can do is to use one of the bult-in advanced aggregation methods on the value axis selector of a chart (e.g. Cumulative Sum) and modify it from there. Please also see the previous blog post  TIBCO Spotfire Primer - Blog 2 - Hybrid in-database/in-memory aggregations.

    I really recommend that you dive into this Chapter 7 if you're interested in performing OVER calculations in Spotfire!


    User Feedback

    Recommended Comments

    There are no comments to display.


×
×
  • Create New...