I recently came across a statistical question in a stats forum that had a surprising relevance to my own work, and maybe the story will be of interest to more people struggling with a difficult domain of process improvement: IT.

The question was how best to model the distribution of resolution times for IT tickets. The statistical challenge is a nice one, for several reasons. The data will be severely skewed, with lots of very short resolution times and some that take extremely long. The skewed distributions we generally use – Weibull, Lognormal, Loglogistic … – will not describe this distribution correctly. In the standard distributions the mode (the peak, for non-statisticians) will definitely be lower and the extremes less pronounced than what we see in such data, so the fit will be bad. There is a family of methods called “zero-inflated regression” that can handle such distributions – but this is definitely a huge overkill for a Green Belt (GB) project.
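To make the fitting problem concrete, here is a minimal sketch – my own illustration with synthetic data, not the forum poster's – using scipy. A spike of quick fixes mixed with a heavy-tailed escalation component defeats each of the standard candidates on its own:

```python
import numpy as np
from scipy import stats

rng = np.random.default_rng(42)

# Hypothetical resolution times in hours: a spike of quick fixes mixed
# with a long-tailed component for escalated tickets.
quick = rng.exponential(scale=0.5, size=700)              # ~30-minute fixes
escalated = rng.lognormal(mean=3.0, sigma=1.0, size=300)  # multi-day cases
times = np.concatenate([quick, escalated])

for label, dist in [("Weibull", stats.weibull_min),
                    ("Lognormal", stats.lognorm),
                    ("Loglogistic", stats.fisk)]:
    params = dist.fit(times, floc=0)   # fix the location at zero
    ks = stats.kstest(times, dist.name, args=params)
    print(f"{label:12s} KS statistic = {ks.statistic:.3f}, p = {ks.pvalue:.2g}")
# In this synthetic example the large KS statistics (and tiny p-values)
# signal that no single standard distribution captures both the sharp
# peak and the heavy tail.
```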

The more interesting question, then, is WHY we even want to model these times. The answer I got when I asked was: “I am using Lean Thinking to reduce the resolution times and I am applying the DMAIC methodology,” said my discussion partner. My reaction was pretty much a surprised “WHAT??”, but then this got me thinking. It is not unusual, especially for manufacturing organizations with a successful Six Sigma initiative, to simply attempt to transfer the learnings from manufacturing to the IT world by enforcing the DMAIC methodology in this context. This might work, but in most cases it will not, and here is why:

In a manufacturing context we generally want to build a statistical model of the Y based on input factors (aka X-es). The input factors are in most cases physical quantities like pressure, temperature, concentrations etc. The effect of these factors on the particular process – machinery, reactor, whatever – is mostly unknown, even to the operators, so running different types of data analysis and DoEs makes absolute sense. To do this, we had better characterize the process well from the beginning, and this involves correct modelling of the output variable as well.
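For contrast, here is a minimal sketch (mine, purely illustrative) of what that manufacturing-style modelling looks like: a small two-factor factorial experiment, with a simulated yield regressed on coded temperature and pressure settings.

```python
import numpy as np
import pandas as pd
import statsmodels.formula.api as smf

rng = np.random.default_rng(1)

# Hypothetical 2^2 full factorial design, 5 replicates per corner,
# with temperature and pressure at coded levels -1 / +1.
design = pd.DataFrame(
    [(t, p) for t in (-1, 1) for p in (-1, 1) for _ in range(5)],
    columns=["temperature", "pressure"],
)
# Simulated yield: two main effects, an interaction, and random noise.
design["yield_pct"] = (80 + 3 * design.temperature + 2 * design.pressure
                       + 1.5 * design.temperature * design.pressure
                       + rng.normal(0, 1, len(design)))

fit = smf.ols("yield_pct ~ temperature * pressure", data=design).fit()
print(fit.summary())   # effect sizes and p-values for each factor
```

Here the data analysis genuinely tells us something the operators could not have known with confidence: how large each effect is and whether the factors interact.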

Now, think of trying to shorten the resolution times in the IT ticket resolution process. There are basically no physical factors with random variation that influence the outcome (like temperature or concentration or whatnot). This means that enforcing a statistical model will be a doable but pointless exercise – e.g. we can spend weeks building a zero-inflated regression model, just to prove that the number of times level-two support is involved has a statistically significant positive effect on resolution times (no joke). The question is: was this time well spent? More interestingly – will such an analysis, which would make a LOT of sense in a manufacturing context, help us enlist the support of the IT teams and IT management?
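For what it is worth, this is roughly what that exercise looks like in code – a hedged sketch with synthetic data and a hypothetical escalation-count predictor, using the zero-inflated Poisson model from statsmodels as one common flavour of zero-inflated regression:

```python
import numpy as np
import statsmodels.api as sm

rng = np.random.default_rng(7)
n = 1000

# Hypothetical predictor: how many times level-two support was pulled in.
level2_escalations = rng.poisson(lam=0.8, size=n)

# Synthetic outcome: many tickets close within the first hour (zero after
# rounding to whole hours), the rest grow with each escalation.
hours = rng.poisson(lam=4 + 6 * level2_escalations)
hours[rng.random(n) < 0.4] = 0

# Zero-inflated Poisson: a logit part for the "instant closure" zeros and
# a Poisson part for the rest (the inflation part defaults to a constant).
exog = sm.add_constant(level2_escalations)
result = sm.ZeroInflatedPoisson(hours, exog, inflation='logit').fit(
    method='bfgs', maxiter=200, disp=False)
print(result.summary())
# The escalation coefficient comes out large and "highly significant",
# confirming what every IT team member already knew.
```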

Having been on both sides of this fence, I can confidently say it will not; moreover, it will cause lasting tension between the IT and process improvement teams. The Six Sigma team will conclude that IT is a nest of arrogant resistors that needs to be cleansed, if need be, by management intervention. To the IT team, the Six Sigma team will look like dangerously ignorant meddlers, lacking all process knowledge and trying to impose a completely irrelevant methodology.

So, how can we avoid this mess? The process improvement teams need to be flexible here and recognize that the standard DMAIC is of very limited use in the IT context. Luckily, we have the Lean part in Lean Six Sigma, and applying it consistently will solve our problem. Lean is known to have little regard for statistics – which is helpful here. This does NOT mean that Lean Thinking involves giving up rigorous thinking – it just means that we rely much less on data and much more on the process knowledge of the IT teams. Just to illustrate my point: if the goal of the project is to reduce resolution times, should we spend even a minute agonizing over whether a Weibull or a loglogistic distribution is a better fit for our data? Over whether the Kruskal-Wallis test will be significant or not? My answer would be – as you probably guessed – NO. The answers for an IT improvement are, in 99% of the cases, in the process – and the IT team members know the process in all its details, so let us ask them and forget the statistical niceties.

We still have a LOT to contribute as process improvers: methods for process mapping, the seven types of waste, rigorous root cause analysis, PDCA cycles and visual management – all of this is a welcome addition to the toolset of an IT department. It also allows us and them to bring our relative strengths into the improvement process, based on mutual acceptance and respect. And this is what we want to achieve. It is a thousand times better an outcome than a beautiful zero-inflated regression model, though as a statistics fan, I do feel some residual pain in saying so.