In this post, I'll be looking at OVER expressions, perhaps one of the most powerful, but least understood features of Spotfire. The book covers OVER expressions in Chapter 7 - Source Data is Never Enough. It starts off:
To expand a little on some key statements - in most spreadsheet tools (such as Excel), you can reference individual cells within a worksheet. Spotfire is a little different in that data tables and the data shown in visualizations are a bit like database tables. You can only perform calculations and other operations over the whole dataset.
Aggregations are calculations that are applied to the dataset and the groupings of the data within them. For example, you might show total sales for each day in a bar chart. You'd apply the Sum aggregation on the values (y) axis and Spotfire will apply that Sum to all the sales data for each of the days shown on the categorical (x) axis.
To take this a little further, when Spotfire calculates the heights for the bars, it's applying the OVER function in the background - the expression would be something like
Sum([sales]) OVER ([Date])
This would produce a bar for every day of the month. If we wanted to show sales by month, Spotfire might use the expression
Sum([sales]) OVER (NavigatePeriod([Hierarchy.DateHierarchy],"Month",0,0))
I'll explain a bit more about the NavigatePeriod method, but first let me discuss the concepts of nodes and hierarchies!
The Spotfire help for some of the OVER functions talks about navigating different levels in the hierarchy. It also talks about different nodes within the hierarchies. These terms can be difficult to understand without some context!
First of all - all data in Spotfire is hierarchical. Some example hierarchies could be date/time (Year, Month, Day, Hour, Minute, Second, etc.), or regional, e.g. Continent, Country, State, County, etc.. Each level of the hierarchy is more detailed than the last. When you have common values at a level in the hierarchy, these represent the nodes in the Spotfire data. So - 2007 would represent a single node at the highest level in the hierarchy. January 2017 would represent a node at the next level down. It's perhaps easiest to visualize this in tabular form - here's a cross table that shows the point (extracted from the book):
Each cell that you see in the cross table represents a node at some level in the hierarchy.
Notice that in the last Spotfire expression I used above, I have used the Hierarchy keyword. What's that all about? It's a little-understood keyword in Spotfire custom expressions? Here's the expression again:
Sum([sales]) OVER (NavigatePeriod([Hierarchy.DateHierarchy],"Month",0,0))
In order to get this to work, I needed to create a custom date hierarchy, with Year, Month, Day, etc.. I just added a new hierarchy called DateHierarchy by choosing 'Add Hierarchy' from the Data menu in Spotfire. Hierarchy.DateHierarchy is how we refer to that custom hierarchy in the Spotfire expression language.
The NavigatePeriod OVER method (function) sets the levels at the hierarchy in which to navigate. The first argument to the method is the name of the hierarchy. The second argument is the level at which to select from the hierarchy, and the last two arguments specify the level to navigate sideways in the hierarchy and the number of steps to navigate down. Sideways is the number in the hierarchy, e.g. the previous month or next month and down is the level in the hierarchy. So if, for example, I had chosen -1, 0 respectively, then the sum would be calculated for the previous month. If I had chosen 0, 1, then the sum would be calculated at the day level.
I?ve only given a brief taster here. The book goes on to explain that you can use another visualization to test that your OVER expression is correct. Here?s an example of where the book discusses this for the PreviousPeriod method, when looking at flight delay data:
Another couple of tips from the book:
The other thing you can do is to use one of the bult-in advanced aggregation methods on the value axis selector of a chart (e.g. Cumulative Sum) and modify it from there. Please also see the previous blog post TIBCO Spotfire Primer - Blog 2 - Hybrid in-database/in-memory aggregations.
I really recommend that you dive into this Chapter 7 if you're interested in performing OVER calculations in Spotfire!
Recommended Comments
There are no comments to display.