The average result versus the calculation based on average inputs

One of the most common errors in risk analysis is to run a computation using average values for the uncertain inputs and then treat the result as the average outcome.

For example, suppose we have a fair coin, that is, one with a 50 percent chance of flipping heads and a 50 percent chance of flipping tails, with each flip independent of prior flips. The probability of flipping one head is, of course, 50 percent. The probability of flipping nine heads in a row is 1/512, or about 0.2 percent.

Suppose instead you hold a coin for which you have no idea of the probability of flipping heads. You think that probability is equally likely to be any number between 0 percent and 100 percent. The chance of flipping one head is still 50 percent. But the probability of flipping nine heads in a row is now 10 percent, not 0.2 percent. The reason is that if the coin has a high probability of flipping heads, say 95 percent, the chance of getting nine heads in a row is about 63 percent, while if the coin has a low probability of flipping heads, say 5 percent, the chance of getting nine heads in a row is about 0.0000000002 percent. The coins with a high probability of heads, like the 95 percent coin, therefore add far more to the fair coin's 0.2 percent figure than the coins with a low probability of heads, like the 5 percent coin, can take away.
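The coin numbers can be checked in a few lines. This is a minimal sketch, not part of the original argument: the function names are my own, and it relies on the fact that if the heads probability p is uniform on [0, 1], the chance of n heads in a row is the integral of p^n from 0 to 1, which is 1/(n+1).

```python
from fractions import Fraction


def prob_all_heads_fair(n):
    """Chance of n heads in a row with a coin known to be fair."""
    return Fraction(1, 2) ** n


def prob_all_heads_uniform_prior(n):
    """Chance of n heads in a row when the heads probability p is
    equally likely to be anything in [0, 1].

    Averaging p**n over that range gives the integral of p**n dp
    from 0 to 1, which equals 1 / (n + 1).
    """
    return Fraction(1, n + 1)


def prob_all_heads_uniform_numeric(n, steps=100_000):
    """Same quantity by a midpoint-rule average of p**n, as a sanity check."""
    width = 1.0 / steps
    return sum(((i + 0.5) * width) ** n for i in range(steps)) * width


if __name__ == "__main__":
    print(prob_all_heads_fair(9))             # 1/512
    print(prob_all_heads_uniform_prior(9))    # 1/10
    print(prob_all_heads_uniform_prior(1))    # 1/2
```

Plugging the average input (p = 50 percent) into the nine-flip formula gives 1/512, while averaging the nine-flip result over all possible coins gives 1/10. For a single flip the two approaches agree, which is why the error is easy to miss.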

For a practical example, consider a proposed government program that will tax 0.1 percent of gross domestic product (GDP) to fund some useful service. The real (that is, after inflation) tax revenues will increase with real GDP growth, and the real program costs will also increase at some rate. Let's suppose we project that the average real growth rate for both GDP and program costs is three percent per year.

If we assume both growth rates are exactly three percent per year, the program will cost 0.1 percent of GDP. But suppose we instead assume there is some future uncertainty about the growth rates: each month, each annual rate can be 0.05 percentage points higher or lower than in the previous month. So in the first month, the real GDP growth rate might be 2.95 percent / 12 or 3.05 percent / 12, and the same for the real program costs. Some factors will make the growth rates positively correlated; for example, an expanding population will generally increase both GDP and program costs. Other factors argue for negative correlation; for example, bad economic times mean low GDP growth and increased need for government expenditures. We assume the positive and negative correlations offset, so that the changes in the two growth rates are independent.

The expected cost of this program is almost 0.2 percent of GDP, not 0.1 percent. Both average growth rate assumptions were correct, but the projected total cost was wildly incorrect. As with the coin, the reason is the asymmetry in costs. If GDP growth is slow and program costs rise quickly, the cost can easily be one percent of GDP or more. In the reverse circumstance, rapid GDP growth and slow growth in program costs, the program costs will likely be something like 0.03 percent or 0.04 percent of GDP. The high scenarios add more to the 0.1 percent projected cost than the low scenarios can subtract. In this case, 11 percent of the time the program costs come in under half the projected 0.1 percent level, with an average of 0.043 percent. 23 percent of the time the program costs come in over twice the projection, with an average cost of 0.529 percent of GDP.
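A simulation along these lines can be sketched as follows. This is an illustration under assumptions, not the calculation behind the figures above: the post does not state the time horizon, so `years` is a hypothetical parameter, the function and parameter names are my own, and the results will vary with the horizon, the number of trials and the random seed.

```python
import random


def simulate_cost_share(years, rng, base_rate=0.03, step=0.0005,
                        initial_share=0.001):
    """Simulate one path and return program cost as a fraction of GDP.

    Both annual growth rates start at base_rate (3 percent). Each month,
    each rate independently moves up or down by `step` (0.05 percentage
    points), and that month's growth is the annual rate divided by 12.
    `years` is an assumed horizon; the source does not specify one.
    """
    gdp_rate = cost_rate = base_rate
    gdp = cost = 1.0
    for _ in range(12 * years):
        gdp_rate += rng.choice((-step, step))
        cost_rate += rng.choice((-step, step))
        gdp *= 1.0 + gdp_rate / 12.0
        cost *= 1.0 + cost_rate / 12.0
    return initial_share * cost / gdp


def mean_cost_share(years, trials, seed=0):
    """Average the simulated cost share over many independent paths."""
    rng = random.Random(seed)
    total = sum(simulate_cost_share(years, rng) for _ in range(trials))
    return total / trials


if __name__ == "__main__":
    # The 75-year horizon is an assumption chosen for illustration only.
    share = mean_cost_share(years=75, trials=2000)
    print(f"mean cost: {100 * share:.3f} percent of GDP")
```

With `step=0` every path returns exactly 0.1 percent of GDP, which is the calculation on average inputs. With uncertainty switched on, the average of the simulated outcomes comes out above 0.1 percent, because the ratio of cost to GDP gains more on the bad paths than it loses on the good ones.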

Or think about a project with a number of inter-related steps. Some will come in early and below budget; others will come in late and above budget. But the early steps won't reduce total project time much because we usually can't move up the scheduling of later steps. We know, however, that the late steps will delay things, often causing cascading delays, so that a week's delay in one step can mean months of delay to the final deliverable. Also, it's hard to save more than 10 or 20 percent in a step, but it's easy to go 100 percent or 200 percent over budget.
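The scheduling asymmetry can be shown with a toy model. This is a simplified sketch with hypothetical names: it assumes the steps run strictly in sequence and that a step can never start before its originally scheduled date, which is stronger than the "usually" in the text, and it does not model cascading delays or budgets.

```python
def project_finish(planned, actual):
    """Return the finish time of a chain of sequential steps.

    planned: planned duration of each step, used to fix scheduled starts.
    actual:  actual duration of each step.
    A step starts at the later of its scheduled start and the moment the
    previous step actually finished, so early finishes are not exploited.
    """
    finish = 0.0
    scheduled_start = 0.0
    for plan, real in zip(planned, actual):
        start = max(scheduled_start, finish)
        finish = start + real
        scheduled_start += plan
    return finish


if __name__ == "__main__":
    plan = [2, 2, 2]
    print(project_finish(plan, [2, 2, 2]))  # 6.0: everything on plan
    print(project_finish(plan, [1, 2, 2]))  # 6.0: early first step saves nothing
    print(project_finish(plan, [3, 2, 2]))  # 7.0: late first step delays the end
```

A first step that finishes one week early leaves the end date unchanged, while the same step finishing one week late pushes the end date out by a week. Durations that average out to the plan therefore do not produce a finish date that averages out to the plan.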

People often do good-expected-bad case analyses to account for these effects, but these seldom capture the effect of genuine uncertainty. Within each good-expected-bad scenario, everything is certain. Beware of any calculation that substitutes averages (or even good, expected and bad values) for uncertain inputs. Your actual results are likely to be worse than the projections.

2 thoughts on “The average result versus the calculation based on average inputs”

  1. Depends what kind of average you're talking about, doesn't it? Suppose you assume your best guess (for costs, tax revenue, etc.) is the geometric mean rather than the arithmetic mean. That in itself would tend to lead you to make your upper bound the same factor away from your best guess as the lower bound. This would overcome the problem you identify, wouldn't it?

  2. It might. In a theoretical situation you can compute mathematically the expectation of any action. I did that with the coin.

    In some practical cases, you can do this as well. Using a geometric mean corrects for situations in which a 10% increase in something hurts more than a 10% decrease helps. In some cases that gets close to the correct answer; in other cases it might significantly overcorrect or undercorrect.

    Or, using the project example, it is possible to design projects to be robust against problems in any step, and to be able to take advantage of positive surprises. Using a geometric mean for expected cost and completion time of each step is probably more realistic than using an arithmetic mean.

    But rather than putting your faith in one kind of average versus another, I prefer an explicit consideration of the range of possible outcomes. No single number conveys that properly for all purposes.
