Posts Tagged ‘datamodel’

The Trouble With Totals – those pesky accumulations

February 5th, 2016

ST_TroubleWithTribbles

 

 

The trouble

I don’t know about you, but I find it VERY easy to run into trouble (a.k.a. BUGS) when implementing accumulations in charts in Qlik. I’ve probably made every mistake possible at one time or another. Especially when working with averages, it can be very easy to make an “unnoticed mistake” … once again, the Bug-word.

I’ve gone back and forth with various techniques over the years. Henric has a nice, comprehensive post on the topic at his Qlik Design Blog. I’ve recently been striving to use an AsOf table in my apps, due to its easy of implementation. But I’ve found some serious shortcomings in that approach that I’ll address here.

Speaking of ease of implementation … Rob Wunderlich has a subroutine for the AsOf table in his awesome Qlikview Components library. And if you want to see the guts of building one, Barry Harmsen walks through an example in his Data Modeling class at the MastersSummit for Qlik.

The setup

To motivate this discussion with a typical business reporting scenario, let’s suppose we have daily sales amounts and we need to report both monthly sales, as well as a six-month rolling average. I bet 99.9% of you reading this are nodding your head, having been asked this question before.

One key requirement is that the user needs to be able to focus on a particular year (i.e. select a year in a list box) and see proper rolling-six month results … this is where it gets tricky.

Some approaches to accumulating

Built-in accumulation

The built-in accumulation feature (found on the Expressions properties tab) sure is easy to implement. It only requires clicking a checkbox and then choosing how many “buckets” to accumulate. These buckets are dimensional slices in the chart’s “mini-cube” of data. (Yeah, I know, I said “cube” in a QlikView post – BAD! But that’s how I try to visualize the chart data.)

builtin_accum_props

Note that since the accumulation only sums the six rows, they need to be divided by six to get an average. The variable R is assigned a value of 6 in the load script.

LET R = 6;

But as might be expected, with the super-simple setup comes very little in the way of value. The accumulation is limited to the data with the chart, specifically the values on each row. As soon as a selection is made, you’ll see that the accumulation is re-calculated, based on the selection scope. Boo. Hiss.

builtin_accum_2

In the example here, the chart on the left is detached (in truth, I’m using alternate states). But the chart on the right is responding to the Year list box. When 2011 is selected, the accumulation breaks because it can no longer “reach into” the 2010 data to calculate the rolling six-month average. I call this the “ramp-up” effect.

Inter-record chart functions

Let’s try a completely different approach — the chart inter-record functions. Specifically, we’ll use the above() function to reference prior rows in the chart. And then use the rangeAvg() function to average over these rows. You can probably already guess where this is heading …

The rolling six-month average expression is:

rangeAvg(above(sum(Sales), 0, 6))

Although we had to do a little more work on this approach, the result are no better. The chart still suffers from selections.

rangeavg_above

It’s at this point where I usually give up with the front-end only approach. However, I was recently shown something by one of our Masters Summit attendees where he successfully wrote an expression (a very, very complex expression!) that handled the calculation. It made my head hurt just to look at it. :)

So let’s look at some things we can do in the data model …

AsOf table

The AsOf table is a dimension table that I think of as a “half-Cartesian” table in that it creates a row in the table for each month, combined with every month that came before it. The mathematically-inclined might think of it as a triangular matrix.

So in our toy model of a Calendar dimension and Sales fact table, the AsOf table will associate to the Calendar on the [Year-Month] field. And each Year-Month in Calendar will generate a multiple rows in the AsOf table going from that month, back to the first month in the database.

asof_model_1

The field [AsOf Months Ago] is handy for limiting how many rows from the AsOf table to include. So for our rolling six-month example, the expression for the chart is:

sum( {< [AsOf Months Ago] = {"<6"} >}  Sales ) / R

We use less-than here (as opposed to less-than-or-equal-to) because the month itself is 0 (zero) months ago. That is, our rolling six-month calculation considers the current month, plus each of the five previous months. (Note the R variable is still needed in the divisor to get from the sum to the average.)

Great! Right? Well … for this to work, then our chart dimensions AND our selection dimensions (list boxes) must use the AsOf table fields. Huh?!

It is each AsOf Month-Year that links to the previous years. So [AsOf Year-Month] becomes the chart dimension. And any selections must be made in [AsOf Year-Month], not [Year-Month].

A handy field to include

I like to have a Year field in the AsOf table too. This is not included in the QVC implementation but is certainly easy to add in the load script:

JOIN
([AsOfTable_Year-Month])
LOAD
[AsOf Year-Month],
year([AsOf Year-Month]) AS [AsOf Year]
RESIDENT
[AsOfTable_Year-Month];

With this additional field, I can now make selections by creating  a list box on [AsOf Year]. Finally then, I can select a subset of the chart and still get the same, accurate rolling six-month average. Yay.

asof_result_1

But I’m still troubled

The AsOf solution looks great at this point … until it is needed in a real application, not just a simple toy model. In a real application, there will no doubt be at least one significant date dimension, by which most filtering is done. This is typically manifest in Year, Month, Quarter, etc. list boxes in the dashboard.

For example, in our model, let’s assume the Year field is in a list box at the top of every sheet in the app. That simply will not work with the AsOf table. And vice-versa — the [AsOf Year] field will not work with the non-rolling expressions.

Here you can see that selecting Year = 2011 breaks the rolling six-month average:

asof_result_2

 

So now what … ??

Well, I’m going to pull a “who shot J.R.” on you and save that for my next post! :)

(and if you don’t know who shot J.R., it’s worth a Google)