An interesting thing happened at work today. Well, interesting for me at least, since I dig bits of math that most people find dreadful.

A co-worker averaged some averages. Well, not exactly, but it’s close enough that the discussion here holds. (If you want to get super technical, he averaged P90 numbers.)

The end result looked questionable. Eyeballing the numbers would lead you to think the answer was around 1.5. The computed average was around 8. So, what went wrong?

The reason things broke is that an average (or a Pnn) is a function of the whole of the input data. For an average, you need to take the sum of all of the inputs and then divide by the count. If you average the averages instead, every group counts equally no matter how many samples it holds, so the weight of each sample can be skewed substantially.

Take for instance this set of lists:

  • 1, 2, 3, 4, 5
  • 3, 3, 3, 3, 3
  • 25

The averages of the lists are (3, 3, 25). If you average the averages, the answer is 10.333.

If, however, you average the numbers individually, the true average is 5.

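In the first case, 25 was effectively given five times the weight of any of the other numbers being averaged: it counted for a full third of the result, while each number in the five-element lists counted for only a fifteenth.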
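Here is a minimal Python sketch of the example above, using only the three lists from this post. The variable names are mine, picked for illustration. It also shows that weighting each list's average by its length recovers the true average, which is handy when you only have the per-list averages and counts.

```python
from statistics import mean

# The three lists from the example above.
lists = [
    [1, 2, 3, 4, 5],
    [3, 3, 3, 3, 3],
    [25],
]

# Average each list, then average those averages (the mistake).
per_list_means = [mean(xs) for xs in lists]
average_of_averages = mean(per_list_means)

# Pool every number into one list and average once (the true average).
pooled = [x for xs in lists for x in xs]
true_average = mean(pooled)

# Weighting each list's average by its length gives the true average back.
total_count = sum(len(xs) for xs in lists)
weighted_average = sum(mean(xs) * len(xs) for xs in lists) / total_count

print(round(average_of_averages, 3))  # 10.333
print(true_average)                   # 5
print(weighted_average)               # 5.0
```

The weighted version only works because an average can be rebuilt from a sum and a count.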

The P90 case is similar. A P90 effectively trims the top of the distribution, so when it is computed over the entirety of the dataset, any outliers get trimmed. If you simply average per-bucket P90s, the outliers in one bucket are no longer trimmed by the rest of the data, and they skew the result disproportionately.
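To make that concrete, here is a small sketch with made-up numbers. The buckets below are hypothetical, not my co-worker's data, and I am assuming P90 means the 90th percentile under the nearest-rank definition (other definitions interpolate and give slightly different values). The `p90` helper is my own, not a library function.

```python
def p90(values):
    """Nearest-rank 90th percentile (an assumed definition; others exist)."""
    ordered = sorted(values)
    # ceil(0.9 * n) in integer arithmetic, as a 1-based rank.
    rank = (9 * len(ordered) + 9) // 10
    return ordered[rank - 1]

# Hypothetical buckets: two well-populated ones and one holding a lone outlier.
bucket_a = [1, 1, 1, 1, 1, 1, 1, 1, 2, 2]
bucket_b = [1, 1, 1, 1, 1, 2, 2, 2, 2, 2]
bucket_c = [20]
buckets = [bucket_a, bucket_b, bucket_c]

# Averaging the per-bucket P90s lets the lone outlier count for a third.
per_bucket_p90s = [p90(b) for b in buckets]
average_of_p90s = sum(per_bucket_p90s) / len(per_bucket_p90s)

# P90 over all the data at once: the outlier lands above the cutoff.
pooled = [x for b in buckets for x in b]
pooled_p90 = p90(pooled)

print(per_bucket_p90s)  # [2, 2, 20]
print(average_of_p90s)  # 8.0
print(pooled_p90)       # 2
```

Unlike averages, percentiles cannot be rebuilt from per-bucket values and counts, so there is no weighting trick here. You need the raw data, or at least a mergeable summary of it such as a histogram.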

Just something to keep in mind in case you’re dealing with things like this.