Monday, February 24, 2014

Home Energy Consumption Analysis with Tableau

Welcome to another entry in our guest blogger series, this time by Bronson Shonk on the Product Consulting team at Tableau. This exploration is so well-documented that I am going to just stop here. Except to say: great design work! I especially like the extensive help and thoughtful explanation.

Wednesday, February 19, 2014

Divvy Data Challenge Using Tableau

I had a couple of free hours the other day, so I decided to create and submit the following "mini application" to divvybikes.com and their recently announced data challenge. I am always reluctant to enter contests, lotteries, or what have you - because I never win, dammit! ... anyways, let's see what those Chicago startup folks think. Yet another fun waste of time using Tableau. Enjoy!


I want to:

See when and where riders are going

See station mapping info

Review rider demographics

Look at a calendar

Research the bike I am using

Look at station distances

Study capacity planning

Friday, February 14, 2014

Individual Query Auditing with Tableau

Tableau Server has great built-in audit tables for who has looked at what content and when. We discussed this briefly in a previous post related to performance monitoring. What is NOT built into Tableau Server, however, is the ability to audit detailed field-level usage. Put in plain terms: who filtered what? (In other words, what was the resulting WHERE clause and WHO ran it?)

It turns out that if you are clever you can get this to work. Here's the brief overview of steps required:

  • In your database, enable tracing - this is not covered in this blog post!
  • Be prepared to report on this tracing. This could be with Tableau Desktop, or some other tool. In the example below, we used SQL Server and Tableau Desktop and only those two tools.
  • On all of your visuals, create a calculated field that tracks the current Tableau Server "Username". You also need to add a unique string with which to grep the answers later on.
  • Optionally, you can add this calculated field to Tableau data source filters - read more about that feature here and here. This is super awesome because it means that you will have enabled detailed auditing not just for Tableau Server web users, but also for any Tableau Desktop users as well!
  • In the example shown below, because we have a calc which results in "SPECIALTRACKINGKEY"+Username, we can then filter out all of the noise in our tracing data, and focus only on the Tableau query usage. Nifty.

In order to publish this workbook to Tableau Public, we first had to delete the magic calculated "tracking field" (Tableau Public does not support user filtering, and nor should it). Here is the definition of that field:

"SPECIALTRACKINGKEY" + Username()

You will need to add this back into your example in order for any of this to work. This subject is generally not for the faint of heart, so take your time and best of luck. Enjoy!
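
For the reporting side, here is a minimal sketch of the "filter out all of the noise" step, done in Python rather than Tableau Desktop. It assumes you have already exported your trace into a list of SQL text strings, and that the tracking key shows up in that text immediately followed by the username. Both are assumptions; how the key actually appears depends on your database and on how Tableau writes the query. The function name and the sample rows are made up for illustration.

```python
import re

# Must match the string literal used in the calculated field.
TRACKING_KEY = "SPECIALTRACKINGKEY"

# Assumption: the username follows the key directly and ends at a quote,
# whitespace, or closing parenthesis.
_PATTERN = re.compile(re.escape(TRACKING_KEY) + r"([^'\"\s)]+)")


def tableau_queries(trace_lines):
    """Yield (username, sql_text) for trace rows that carry the tracking key."""
    for sql_text in trace_lines:
        match = _PATTERN.search(sql_text)
        if match:
            yield match.group(1), sql_text


# Invented trace rows, only to show the filtering.
sample_trace = [
    "SELECT name FROM sys.tables",
    "SELECT SUM(sales) FROM orders WHERE region = 'West' "
    "AND 'SPECIALTRACKINGKEYjsmith' = 'SPECIALTRACKINGKEYjsmith'",
    "EXEC sp_who2",
]

hits = list(tableau_queries(sample_trace))
for username, sql_text in hits:
    print(username, "ran:", sql_text)
```

Everything that does not contain the key is dropped, so what is left is the Tableau query usage along with who ran it.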

Individual database information for enabling SQL tracing (I'll try to add to this list as time permits):

Tuesday, February 11, 2014

Tableau Server Performance Monitoring

Analyzing Tableau Server performance is a complex beast, because no two installations are alike... makes sense. Tableau Inc. has some great built-in tools, including the performance recorder, as well as a white paper wherein they benchmark a given installation and its response times.

But now what? A benchmark of someone else's installation doesn't help us much because, as mentioned, all systems are different. A quick search on the web doesn't reveal much either. It turns out that there are a few disparate data sources that we can blend together into a unified view in order to analyze Tableau Server performance. They include:

  • Windows Performance Monitor
  • Tableau Server HTTP requests
  • Tableau Server audit events and
  • Tableau Server background tasks.

Here is a short list of steps to get this type of unified view. Once you have worked through them, you can download the workbook shown below and swap out the data sources using the replace data source feature.

  • Enable the Custom Administrative Views feature of Tableau Server.
  • Learn about and enable Windows Performance Monitor. Tableau has a KB article to read. The workbook shown below uses 5-second intervals.
  • Take note that Tableau Server uses GMT. The workbook shown below uses a Tableau calculation called "Tableau Timestamp" to offset all relevant dates by 8 hours to Pacific time in order to match my Windows perfmon timestamps. You will need to change this to your particular time zone.
  • The CSV output of your Windows Performance Monitor data collector will have really ugly field header names like "\\machineName\processName\PrivateBytes" - I changed these inside the workbook shown below - you will need to ensure that you can correctly swap the field names out with your own data.
  • Also note that the workbook shown below came from my laptop. I extracted all the data sources and inside the extract I filtered to "today only" which was Feb 6th, 2014. Obviously, you will want to disable extracts, or re-extract accordingly.
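
If you would rather prepare the data before it reaches Tableau, the header cleanup and the time offset from the steps above can be sketched in Python. This is only an illustration under assumptions: the function names are made up, the header format is the one quoted above, and the fixed 8-hour offset ignores daylight saving time, just like a fixed offset inside the workbook would.

```python
from datetime import datetime, timedelta


def clean_header(raw):
    """Drop the machine name from a perfmon counter path and join the rest."""
    parts = [p for p in raw.split("\\") if p]
    # When the path starts with two backslashes, the first part is the machine name.
    if raw.startswith("\\\\") and len(parts) > 1:
        parts = parts[1:]
    return " ".join(parts)


def to_local(server_timestamp, offset_hours=-8):
    """Shift a GMT timestamp by a fixed number of hours (-8 for Pacific standard time)."""
    return server_timestamp + timedelta(hours=offset_hours)


header = clean_header(r"\\machineName\processName\PrivateBytes")
local_time = to_local(datetime(2014, 2, 6, 20, 0, 0))

print(header)      # processName PrivateBytes
print(local_time)  # 2014-02-06 12:00:00
```

Change `offset_hours` to match your own time zone, and adjust `clean_header` if your counter paths are laid out differently.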

What's the point? Well here's the thing, folks ... instead of trying to come up with some repeatable statistic e.g.

  • "Total private VizQL RAM divided by number of unique users logged in" - this fails because you cannot predict how long a particular user sits on a viz, or what the viz looks like, or what the data source looks like, or
  • "Total RAM divided by HTTP requests" - this fails because there are lots of HTTP requests for a given viz, or
  • "Total RAM divided by distinct users" - this fails because not all users are doing the same thing at the same time, or
  • Any number of other wacko stats.

...Instead of trying to do any of that, I recommend simple "immersion analysis". Immersion analysis is covered from several different angles in Dick Heuer's seminal work Psychology of Intelligence Analysis. It's a great read for numerous reasons, and also free as a digital download. And no, I do not work for the CIA :) The basic premise of immersion analysis - specific to the Tableau Server platform and performance analysis - is that you should not be looking for a discrete and conclusive answer to the question of performance, e.g. "this thing times that thing must equal this other thing". Instead, you should be using the available data to gather hypotheses with which to perform further research. Specifically, you are looking for:

  • Time Patterns: is there a spike or peak at a recurring or periodic time? (this is achieved with any of the data sources shown below)
  • Content Patterns: is there a particularly painful workbook or dashboard? (this is achieved by cross referencing RAM spikes against the audit tables. You will want to see which workbooks or dashboards were being looked at when the RAM spiked)
  • Task Patterns: Same as above, but this time you are cross-referencing background task details against RAM spikes (e.g. is a particular "refresh extract" pinning down Tableau Server?)
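
The cross-referencing described in the content and task patterns above can also be sketched outside of Tableau. The Python below is a rough illustration, not the method used in the workbook: it flags perfmon samples above a RAM threshold, then lists the audit events (or background tasks) that started shortly before each spike. The function names, the threshold, the 30-second window, and all of the sample data are assumptions made up for the example.

```python
from datetime import datetime, timedelta


def find_spikes(samples, threshold_bytes):
    """Return timestamps of samples whose private bytes exceed the threshold."""
    return [ts for ts, private_bytes in samples if private_bytes > threshold_bytes]


def events_near(spike_times, events, window=timedelta(seconds=30)):
    """Map each spike time to the names of events that began within `window` before it."""
    result = {}
    for spike in spike_times:
        result[spike] = [
            name for ts, name in events
            if timedelta(0) <= spike - ts <= window
        ]
    return result


t0 = datetime(2014, 2, 6, 9, 0, 0)

# Invented perfmon samples at 5-second intervals: (timestamp, private bytes).
samples = [
    (t0, 1.0e9),
    (t0 + timedelta(seconds=5), 1.1e9),
    (t0 + timedelta(seconds=10), 2.6e9),
]

# Invented audit events: (timestamp, workbook name).
events = [
    (t0 - timedelta(seconds=60), "Old Workbook"),
    (t0 + timedelta(seconds=2), "Sales Dashboard"),
]

spikes = find_spikes(samples, threshold_bytes=2.0e9)
suspects = events_near(spikes, events)
for spike, names in suspects.items():
    print(spike, names)
```

A workbook that keeps turning up next to spikes is not proof of anything, but it is exactly the kind of hypothesis worth researching further.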

It might be considered a fruitless task to come up with known-good and reproducible performance benchmarks for the Tableau platform. Instead, you should gather information, come up with a hypothesis (e.g. a likely root cause for a performance spike), and then research the heck out of that hypothesis. Rinse, repeat, as needed. Download the workbook shown below to get started. Enjoy!

Thursday, February 6, 2014


Tableau - The Bachelor Viz

Next up in our series of guest publishers... I present to you something which would have been quite impossible for me to have designed and built myself. :) All I can say is WOW what a design effort!

Created by expert Tableau Product Consultant Alex S. with help from additional Tableau staffers in the marketing department Maureen, Aliana and Mike, I consider the dashboard shown below to be an exquisite work of art which still provides a fundamentally solid data analysis experience. A beautiful meld of form and function, to be sure. What an excellent viz, Alex and bravo to Team Tableau in general!