Archive for November, 2010

Color Mixology 101 & Heat Maps

November 30th, 2010

I used a great QV feature for the first time this week – the Color Mix Wizard. Insofar as it is indeed a Wizard, there’s not a lot you need to know to use it, but I’ll walk through the steps I took…

The Grid Chart

The objective of this exercise was to come up with a visualization for looking at the number of survey respondents based on their answers to two different questions. My first attempt was to use a scatter plot, but that failed miserably since my dimensions (the x- and y-axes) were not continuous numbers, but rather ordinal values. Even when I encoded those ordinals as integers, the results were not what I was looking for.

SO…enter the grid chart.

Working through this example helped me clearly articulate the use case for the grid chart – basically, it is a version of a scatter chart that works for categorical values, be they nominal or ordinal. Nominal values are labels like “US, UK, FR” or “Finance, Sales, Operations.” Ordinal values are also categories, but can be arranged in a ranked order like “Under 25, 25-30, 30-35”. (For more info, check out the Wikipedia article “Level of measurement.”)

Here is my grid chart with two dimensions and bare-bones formatting:

Try as I might, I could find no simple way to make each horizontal row the same color…sigh…

Grid Meets Heat

So that grid chart tells me relatively how many people responded to each combination of question 1 and question 2. And it also provides some correlation information – for example, there are a large number of respondents falling into x-category: “18-25,” y-category: “4-6 hours.” This information is useful in and of itself, but I also wanted to compare it to their responses to a third question – a simple YES/NO question.

I could have opted for the pie chart style (from the Style tab) … but I’m already breaking Stephen Few rules by encoding magnitude as the radii of the bubbles :) … best not make it worse with pies. Moreover, I personally don’t care for the aesthetic of the pie grid. Regardless, had I chosen this route, I could have simply used the third question as a third dimension and the pie chart would have proportioned two “slices” of pie – one for YES and one for NO. And included a legend too.

Instead, I decided to use a varying color saturation on the grid bubbles. Since this requires a single number (vs. two categorical values – YES/NO), I needed a way to encode the survey responses as a single number. Hence I chose to take the ratio of YES responses (i.e. # of YES responses divided by # of ALL responses), which results in a floating point number between 0 and 1.
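To make the ratio concrete, here is a minimal Python sketch of the calculation per grid cell. It is only an illustration outside QlikView: the function name, the tuple layout, and the sample rows are all hypothetical, not taken from the actual survey data.

```python
from collections import defaultdict


def yes_ratio_by_cell(responses):
    """Return {(q1, q2): share of YES answers to Q3} for each grid cell."""
    totals = defaultdict(int)
    yes = defaultdict(int)
    for q1, q2, q3 in responses:
        totals[(q1, q2)] += 1
        if q3 == "YES":
            yes[(q1, q2)] += 1
    # YES count divided by ALL responses in the cell, so always 0..1
    return {cell: yes[cell] / totals[cell] for cell in totals}


# Hypothetical survey rows: (answer to Q1, answer to Q2, answer to Q3)
sample = [
    ("18-25", "4-6 hours", "YES"),
    ("18-25", "4-6 hours", "NO"),
    ("18-25", "4-6 hours", "YES"),
    ("18-25", "4-6 hours", "YES"),
    ("26-35", "1-3 hours", "NO"),
]
ratios = yes_ratio_by_cell(sample)
```

Each grid bubble then carries one number between 0 and 1, which is what the color expression needs.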

Now with a single number I can use it as a parameter to control color saturation. For example – I could simply use it as the alpha value (multiplying by 255 or similar), but I wanted finer control … especially at the “light” (faint) end of the scale. This is where the Color Wizard comes in!!

The Wizard (cue Black Sabbath)

First of all, it’s not obvious where to find this wizard…

  1. Expand the properties on the expression
  2. Click on Background Color to open the Expression Editor
  3. Go to the File Menu and you’ll see the Color Mix Wizard… menu item

Take some time to play with the wizard – it may take a few passes to get the effect you want. A couple of key parameters to explore:

Enhanced Colors – depending on the distribution of your data you may or may not want to select this option. Its effect is analogous to a stereo enhancer for audio in that it takes the “middle” and pushes it out to both “sides.” In our case that means taking the middle values and pushing them more toward the upper and lower ends of the colors you’ve chosen.

Value Saturation (upper/lower) – again, your use case will dictate how to set this. If your data has long tails, you may want to set these limits so they group the outliers into a common color.

The Color Wizard will generate a piece of code that should look something like this:

=ColorMix1 ((rangemin(.8,rangemax([Legend Value] ,.1))-.1)/(.8-.1), ARGB(255, 242, 242, 242), ARGB(255, 128, 0, 0))

You can identify the Value Saturation range values (0.1 and 0.8) and the RGB values for the colors, so if you need to tweak the expression, you can do it directly on the code, instead of re-running the wizard.
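For readers who want to see what that expression is doing, here is a rough Python sketch of the same arithmetic: clamp the value to the saturation range, rescale it to 0..1, then blend between the two colors. The function names are hypothetical, and the straight linear blend of each ARGB channel is my assumption about how ColorMix1 behaves, not something confirmed from the product.

```python
def clamp_and_scale(value, lower=0.1, upper=0.8):
    """Mirror rangemin(upper, rangemax(value, lower)), then rescale to 0..1."""
    clamped = min(upper, max(value, lower))
    return (clamped - lower) / (upper - lower)


def color_mix(t, color_zero, color_one):
    """Blend two (A, R, G, B) tuples; t=0 gives color_zero, t=1 gives color_one.

    Assumption: a plain linear interpolation per channel.
    """
    return tuple(round(a + (b - a) * t) for a, b in zip(color_zero, color_one))


LIGHT = (255, 242, 242, 242)  # ARGB(255, 242, 242, 242)
DARK = (255, 128, 0, 0)       # ARGB(255, 128, 0, 0)


def bubble_color(value):
    """Color for a YES ratio, using the 0.1 and 0.8 saturation limits."""
    return color_mix(clamp_and_scale(value), LIGHT, DARK)
```

Anything at or below 0.1 comes out as the light gray, anything at or above 0.8 as the dark red, and the values in between shade gradually from one to the other.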

NOTE: if you use Enhanced Colors you’ll get a much lengthier piece of code!

So now I have a true heat map, with the darker red bubbles indicating more YES responses to Question 3:

Finishing Touch – The Legend

Finally, it would be nice to have a more quantitative explanation of the heat levels besides the text: “darker means more YES responses.” So, a simple legend can be easily implemented in a straight table chart. This table will have just three columns:

  • the legend label – like “> 10%”
  • the legend value – 0.10, 0.20, etc.
  • a numeric expression (required to use the Chart object)

The expression really doesn’t matter because we’ll “white” it out (i.e. white text on white background).

Here is the load statement for the inline table:

LegendSaturation:
LOAD * INLINE [
Legend Label, Legend Value
< 10%, 0.1
20%, 0.2
30%, 0.3
40%, 0.4
etc...
];

After reloading the script, create a Chart object and select the Straight Table option. (I like Style = Table 1 for this example.) Choose Legend Label and Legend Value as the two dimensions. The expression column doesn’t matter; you can use something simple like only([Legend Value]) or sum([Legend Value]).

Now apply the colors…

  1. Expand the formatting options on the Legend Value dimension and edit the Background Color – you can either use the Color Mix Wizard again, or easier still, just copy the Expression from the Grid Chart object.
  2. Edit the Text Color option and paste in the same color expression from Background Color (this will render the text “invisible”)
  3. Go to the Expressions tab and expand the formatting options on the metric and edit the Text Color expression setting it to white() (assuming you’re using a white background)

That should do it! Resize the columns a little bit to balance the shape and you should have something that looks like this:

We Have Arrived

Finally, putting it all together:

Great v.10 convenience feature – moving objects

November 10th, 2010

Anyone who has worked on front-end development tasks in QlikView knows what a drag it is trying to reposition objects without a caption bar. Well…those days are done! … with version 10.

In version 10, objects without (or with) captions can be easily moved by holding down the ALT key and dragging them with the mouse.

Highlighting a “special” dimension in bar charts

November 10th, 2010

A great way to provide your end-users added insight from a bar chart is to highlight a special dimension value.

What is a “special” dimension value?? For example, if you’re reporting on market share between your company and competitors, you should highlight “Our Company” on the chart. Stephen Few has a nice example on this topic in “Information Dashboard Design,” Chapter 6.

Another example would be if you’re a Product Manager and are looking at sales figures across various products in the line. By highlighting the specific product that you manage you can see where it stacks up against the rest … at a glance.

Imagine we really want to focus on Product = Alpha, and let’s highlight it with a saturated red bar:

So, naturally we need to figure out how to do this in QlikView, right?!

Controlling Background Color

Fortunately there is a nifty feature in the Expression Properties of the Chart object that allows you to dynamically control the Background Color of the bars. And because it itself is an expression you can use if-then logic or any other expression you can conjure.

Here I’ve chosen to use the red() function, but of course you can use any similar color function or rgb().

So this solution is just fine, except it requires that the dimension of interest be hard-coded into the expression in the chart properties. That may be okay for static analysis (like the market share example above – our company vs. our competitors). But for something that could change frequently (like Product) it would be a much more sustainable solution to be able to dynamically control the dimension of interest.

Dynamically controlling the dimension of interest

One way to make the dimension of interest dynamic is to store it in a variable. And then drop an Input Box object on the page. In this example we’ll call the variable varHighlightedProduct.

Now with a simple change to the Background Color expression –>

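if(Product = '$(varHighlightedProduct)', red(255))

we can use an input box to control the highlighted dimension. (The single quotes matter: the dollar-sign expansion pastes the variable’s text into the expression, and without quotes a value like Alpha would be read as a field name rather than a string.)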

Another way to dynamically control the dimension of interest

The variable method works well enough, but changing the value in the input box requires a little bit of typing. Wouldn’t it be easier if there were a list box object in which the end-user could just select the dimension value of interest?? (answer = “YES”)

To build a separate list box we obviously need a data field. That can be easily accomplished with a resident load. Let’s suppose the sales data are stored in a table MySales, and the dimension field for product name is simply called Product. Then our load statement looks like:

HighlightedProduct:
LOAD DISTINCT Product AS [Highlighted Product] RESIDENT MySales;

This will load the unique values of Product into a table called HighlightedProduct with one field, [Highlighted Product].

Next, modify the Background Color expression so that it’s comparing against the selection in [Highlighted Product] –>

if(Product = [Highlighted Product], red(255))

(You could get fancier with this code…allow multiple selections, restrict the list box to a single value, etc.)

So now we have a very convenient way for the end-user (the Product Manager in this example) to switch the dimension of interest using a simple list box.
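The logic behind that Background Color expression can be sketched in Python as follows. This is only an illustration under a couple of assumptions: the function name and the RED tuple are hypothetical stand-ins, and I am assuming that a bare field reference such as [Highlighted Product] only yields a value when exactly one value is possible, so multiple selections highlight nothing.

```python
RED = (255, 255, 0, 0)  # ARGB tuple standing in for red(255)


def bar_colors(products, highlighted):
    """Map each product to RED when it is the single highlighted value.

    None means "no override", i.e. the chart keeps its default bar color.
    """
    # Assumption: the field resolves to a value only when one value is possible
    chosen = next(iter(highlighted)) if len(highlighted) == 1 else None
    return {p: (RED if p == chosen else None) for p in products}


one_pick = bar_colors(["Alpha", "Beta", "Gamma"], {"Alpha"})
two_picks = bar_colors(["Alpha", "Beta", "Gamma"], {"Alpha", "Beta"})
```

With one value selected in the list box only that bar turns red; with two selected, the comparison has nothing single to match and every bar falls back to the default.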

I hope you find this simple example useful. Let me know if you have other ideas around this topic!

Associative search in QlikView v.10 list boxes

November 10th, 2010

Okay, I confess I didn’t really “get” this feature when I saw it in the webex demo. But after seeing it again at Q Days in San Diego last week, the lightbulb finally turned on for me.

I’ll do my best to explain it, but I highly recommend that you go check it out yourself at the QlikView v.10 demo site. The application “What’s New in QV10” is a good place to start.

Direct vs. Associative Search

In version 9 a “simple” search function can be achieved by dropping a search object on the page. And then you can configure the properties to search across all fields or narrow it down to specific fields. This is a very effective tool and end-users love it. But it’s not really associative. By that I mean, it’s a direct search; i.e. if I’ve set up the search object to search in the Salesperson field, then I enter some portion of the Salesperson name in the search box. For example, if I know the salesperson’s name is Sharon something, I type “*sha*” and a list of Salespeople with the substring “sha” in their name shows up. Make sense?

But what if I don’t remember any part of the Salesperson’s name, but I remember some other things associated to the salesperson. Like what they’ve sold. Or to whom they’ve sold …

In version 10 associative search, I can search for Salesperson by entering keywords about other fields that are associated to Salesperson (like Customer, Product, etc.) So in the example above, if I couldn’t even remember any part of the Salesperson’s name, BUT I did remember she works in Germany (Business Unit) and sells Milk (Product) and sells to ActiveDesign (Customer) then I can use associative search and hopefully zero-in on her…

A Use Case

This series of screen shots walks through the scenario I outlined above.

Step 1 – I open the search bar on the Salesperson list box (the magnifying glass) and type “germa”

Step 2 – I click on BusinessUnit_EN = GERMANY

Step 3 – I then type “milk” in the same search bar

Step 4 – I click on Product Sub Group = Milk

Step 5 & 6 – Finally I type “activ” in the search bar and click on Customer = ActiveDesign

Step 7 – with these associations, my list box has been reduced to one value – Sharon Carver –> I have my answer!

Step 8 – Optionally, I can explicitly select Sharon Carver and filter my whole model to her for further analysis
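The walkthrough above can be mimicked with a small Python sketch. It is a toy model, not how QlikView implements the feature: the function names are hypothetical, and apart from the Sharon Carver / GERMANY / Milk / ActiveDesign row, every row in the sample data is made up for illustration.

```python
# One row per sale; only the first row comes from the example in this post.
ROWS = [
    {"Salesperson": "Sharon Carver", "BusinessUnit_EN": "GERMANY",
     "Product Sub Group": "Milk", "Customer": "ActiveDesign"},
    {"Salesperson": "Person B", "BusinessUnit_EN": "GERMANY",
     "Product Sub Group": "Cheese", "Customer": "ActiveDesign"},
    {"Salesperson": "Person C", "BusinessUnit_EN": "FRANCE",
     "Product Sub Group": "Milk", "Customer": "Customer X"},
    {"Salesperson": "Person D", "BusinessUnit_EN": "GERMANY",
     "Product Sub Group": "Milk", "Customer": "Customer Y"},
]


def find_matches(rows, term, exclude_field):
    """List (field, value) pairs in the other fields that contain the term."""
    term = term.lower()
    return sorted({(f, v) for r in rows for f, v in r.items()
                   if f != exclude_field and term in v.lower()})


def narrow(rows, target_field, picks):
    """Values of target_field on rows that agree with every picked value."""
    return sorted({r[target_field] for r in rows
                   if all(r[f] == v for f, v in picks.items())})


picks = {}
for typed in ("germa", "milk", "activ"):
    field, value = find_matches(ROWS, typed, "Salesperson")[0]
    picks[field] = value  # "clicking" the first suggestion
remaining = narrow(ROWS, "Salesperson", picks)
```

Each typed term is matched against the associated fields rather than against Salesperson itself, and each pick shrinks the list of salespeople until only one is left.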

A Couple of Points

  • IMPORTANT – the selections you make in the search list do not affect the rest of the model. In other words, when I selected BusinessUnit_EN = GERMANY in the search box, my dashboard didn’t change. (You can watch the Current Selections to verify that.) I suppose that depending on the scenario this could be slightly inconvenient. But I really find it to be useful, because I don’t want to change my model selections…I just want to find Salesperson = Sharon Carver, in this example.
  • I found myself having to type more slowly than normal in the search box. Not a big deal, but it took a while to get used to. (Maybe it will be more natural in the final v.10 release.)

So…another great feature for end-users coming in version 10!

QlikView under the hood

November 10th, 2010

One of the great sessions at Q Days 2010 San Diego (yesterday!) was “QlikView Under the Hood,” presented by Daniel English of the Product team.

I’m a scientist at heart and really like to know how things tick. Sure, it’s rewarding to deliver a stunning dashboard to my business partner and hear them say “wow” or “we didn’t have this type of intell before” … but, in the back of my mind I’m wondering “how much RAM is this really consuming?”  :)

So, this session addressed some questions. The two big takeaways for me were:

  1. QV Server (QVS) caching – if you happen to be sitting around, watching the Perf Mon on your QVS machine, you will likely notice RAM usage growing and growing. And then freak out that QVS is leaky. Well, it turns out, this is exactly the correct behavior for QVS. What happens is that when a document is opened, and the user starts clicking or changing tabs, each of those results (aggregations) gets cached. And QVS will continue to cache until all the available RAM is consumed or the document times out from disuse. (Or if the QVS is rebooted – duh.) If another document comes into scope, or another user starts asking different questions of the same document, then QVS starts dropping cached aggregates to make room for the new stuff. (How it drops them was not specified – oldest first? or least re-used first?)
  2. “Associative” – what does this really mean? Is it just marketing hype? Without debating the semantics, QlikView’s model is a genuinely novel concept. Here’s how I think of it, in good old-fashioned SQL terms… Imagine I write a SQL query with every table in my data mart – every fact, every dimension. And I have join conditions to join every table (using a full outer join). And then I start dynamically changing bits of the WHERE clause. That is what associative analysis means. And we change the WHERE clause in QlikView, figuratively speaking, by clicking on selectors.
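The caching behavior from the first takeaway can be pictured with a short Python sketch. Since the session did not say how QVS chooses what to drop, the least-recently-used policy here is purely my assumption, and the class and method names are hypothetical.

```python
from collections import OrderedDict


class AggregateCache:
    """Toy cache: keep results until full, then drop the least recently used."""

    def __init__(self, capacity):
        self.capacity = capacity      # stands in for "available RAM"
        self._store = OrderedDict()

    def get_or_compute(self, key, compute):
        if key in self._store:
            self._store.move_to_end(key)   # a re-used result stays fresh
            return self._store[key]
        result = compute()                 # e.g. an aggregation after a click
        self._store[key] = result
        if len(self._store) > self.capacity:
            self._store.popitem(last=False)  # assumed policy: evict the stalest
        return result

    def keys(self):
        return list(self._store)


cache = AggregateCache(capacity=2)
cache.get_or_compute("sales by region", lambda: 100)
cache.get_or_compute("sales by product", lambda: 200)
cache.get_or_compute("sales by region", lambda: -1)   # served from cache
cache.get_or_compute("sales by month", lambda: 300)   # forces an eviction
```

The point is simply that memory filling up is the cache doing its job; results only get dropped when something new needs the room.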

So yeah, Associative really does mean something different than most SQL query tools. Unless of course your SQL tool allows you to query every table in your mart simultaneously :)
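Here is that mental model as a small Python sketch: full outer join everything once, then treat each selection as a change to the WHERE clause. It is an analogy only, not a description of QlikView internals, and the table contents, field names, and function names are all hypothetical.

```python
def full_outer_join(left, right, key):
    """Join two lists of dicts on key, keeping unmatched rows from both sides."""
    fields = {f for row in left + right for f in row}
    joined = []
    for l_row in left:
        matches = [r for r in right if r[key] == l_row[key]]
        for r_row in matches or [{}]:
            merged = {**l_row, **r_row}
            joined.append({f: merged.get(f) for f in fields})
    left_keys = {l_row[key] for l_row in left}
    for r_row in right:
        if r_row[key] not in left_keys:
            joined.append({f: r_row.get(f) for f in fields})
    return joined


def possible_values(rows, selections, field):
    """Values of field on rows satisfying every selection (the WHERE clause)."""
    return sorted({r[field] for r in rows
                   if r[field] is not None
                   and all(r[f] in allowed for f, allowed in selections.items())})


# Hypothetical mini data mart
sales = [{"Product": "Alpha", "Region": "US"},
         {"Product": "Beta", "Region": "UK"}]
products = [{"Product": "Alpha", "Category": "Hardware"},
            {"Product": "Gamma", "Category": "Software"}]

mart = full_outer_join(sales, products, "Product")
all_products = possible_values(mart, {}, "Product")
us_categories = possible_values(mart, {"Region": {"US"}}, "Category")
```

Clicking a selector corresponds to adding an entry to `selections`, and every other field immediately shows only the values that survive the filter.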

    QlikView version 10 – coming soon….

    November 10th, 2010

    I finally saw the demo of QlikView version 10 this week, presented by Anthony Deighton of QlikTech. And I am chomping at the proverbial bit to get my hands on it. The official launch date is 10/10/10 (cute).

    A number of impressive new features are included, and a few in particular stood out:

    • Metrics in list boxes – extending the existing feature of allowing a frequency metric in a list box, you can now put in any metric (presumably through the Expression Editor, which wasn’t shown). You can even do a microchart. I’m not entirely sure how this is conceptually different from the existing Straight Table object…perhaps the list boxes are quicker to configure? At a minimum though, I could see this being a handy feature when profiling new data sources. (I’ll take this up in a future blog topic.)
    • Container objects – a great way to better manage screen “real estate.” This will get around the hassle of setting up icon minimize/maximize cycles. And moreover I think the selection options presented in the Container object will be more obvious to business users who are new to any given application.
    • Mekko charts – hmmm… I’m not sure how useful these will be. I mean, I get the fact that now another “degree of freedom” can be encoded in a two-dimensional bar/column chart (by varying the bar width with an additional expression). I get it, bravo, well done. …But, I’m just not sure how easy they will be for the “average bear” to interpret. I’ll be interested to put these in front of some business users and get their feedback.
    • Extensions are a new feature that, well…extend the existing functionality. One of the extensions I could see using is the Gantt chart. The other ones I saw looked like more “eye candy” than anything. I’m sure Stephen Few will have some priceless and spot-on comments on these :)

    In addition to gobs of GUI goodness, there are also some impressive changes coming to the ETL scripting and server administration:

    • Multi-threaded reloads – yay! Now I can get my money’s worth on that 16-core processor!
    • Easier user administration in QV Server – my admin will love me!
    • Version control … although, before getting too excited on deploying this, it will certainly need a test drive (check out the commentary at Guerrilla BI).
    • Other improvements to the scripting tool – I heard “type-ahead completion” mentioned, but didn’t catch whether that was to be in v.10 or not

    SO…get ready…10/10/10 !

    Grand (re)opening

    November 10th, 2010

    Hi all -

    Some technical considerations with my hosting have prompted me to rebuild this blog. Rather than doing a full backup & restore of the database, I’m simply going to recreate the prior posts.