50% = I have no idea

Benjamin’s recent post about differences in risk intelligence between users of various web browsers drew a lot of interest after it was featured on Slashdot. Among the many emails we received, one in particular caught my attention, because it articulated very clearly a common reaction to the whole idea of risk intelligence.

The email noted that our risk intelligence test presents users with a scale with one end marked 0% (false) and the other marked 100% (true), and objected that this implied there was no option for “don’t know.” The email went on to reason (correctly) that “therefore the only logical choice that can be made in the case of not knowing the answer is 50%.”

So, what’s the problem? The instructions for the test state clearly that if you have no idea at all whether a statement is true or false, you should click on the button marked 50%. So why did the author of this email state that there was no option for “don’t know”?

I think the problem may lie in the fact that, while 50% does indeed mean “I have no idea whether this statement is true or false,” it does not necessarily mean “I have no information.” There are in fact two reasons why you could estimate that a statement had 50% chance of being true:

1. You have absolutely no information that could help you evaluate the probability that this statement is true; OR

2. You have some information, but it is evenly balanced between supporting and undermining the statement.

So, maybe that’s what the email was getting at. But even if this interpretation is correct, it doesn’t justify the claim that there is no option for “don’t know.” There is. It’s the 50% option. That’s what 50% means in this context.

The email went on to add: “It’s very curious that you use a scale; surely someone either believes that they know the correct answer or they don’t know the correct answer. I can’t see that there is any point in using a scale. I would think it far more sensible to present three options of True, False or Pass.”

But this simply begs the question. As I pointed out in a previous post, one of the most revolutionary aspects of risk intelligence is that it challenges the widespread tendency to think of proof, knowledge, belief, and predictions in binary terms: either you prove/know/believe/predict something or you don’t, and there are no shades of gray in between. I call this “the all-or-nothing fallacy,” and I regard it as one of the most stupid and pernicious obstacles to clear thinking.

Why should proof, or knowledge, or belief require absolute certainty? Why should predictions have to be categorical, rather than probabilistic? Surely, if we adopt such an impossibly high standard, we would have to conclude that we can’t prove or know anything at all, except perhaps the truths of pure mathematics. Nor could we be said to believe anything unless we are fundamentalists, or predict anything unless we are clairvoyant. The all-or-nothing fallacy renders notions such as proof, belief, and knowledge unusable for everyday purposes.

In 1690, the English philosopher John Locke noted that “in the greatest part of our concernments, [God] has afforded us only the twilight, as I may so say, of probability.” Yet, as emails like this show, we are still remarkably ill equipped to operate in this twilight zone.

Internet Explorer users have low Risk Intelligence (RQ)

A hoax report earlier this year claimed that people who used Internet Explorer had a lower IQ than those using other browsers. Inspired by this bit of fun, Projection Point decided to carry out a poll to compare the risk intelligence (RQ) of people using different browsers. We found that Internet Explorer users performed worse than everyone else; they had lower RQ scores and were grossly overconfident.

We define Risk Intelligence as the ability to estimate probabilities accurately. Our Basic RQ Test consists of fifty statements—some true, some false—and your task is to say how likely you think it is that each statement is true. It’s a simple process; if you are absolutely sure that a statement is true, you assign a probability of 100 percent to it. If you are convinced that a statement is false, you should assign it a probability of 0 percent. If you have no idea at all whether it is true or false, you should rate it as 50 percent probable. If you are fairly sure that it is true but you aren’t completely sure, you would give it 60 percent, 70 percent, 80 percent, or 90 percent, depending on how sure you are. Conversely, if you are reasonably confident that it is false but you aren’t completely sure, you would give it 40 percent, 30 percent, 20 percent, or 10 percent.
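To make the idea of "estimating probabilities accurately" concrete, here is a minimal sketch in Python. It does not reproduce the RQ calculation, which is not described here. Instead it uses two generic measures as stand-ins: the Brier score (mean squared error of the estimates) and a simple calibration table. The function names and the sample answers are hypothetical.

```python
from collections import defaultdict


def brier_score(estimates, outcomes):
    """Mean squared gap between each stated probability (0.0 to 1.0)
    and what actually happened (1 = statement true, 0 = false).
    0.0 is perfect; answering 50% to everything always scores 0.25."""
    return sum((p - o) ** 2 for p, o in zip(estimates, outcomes)) / len(estimates)


def calibration_table(estimates, outcomes):
    """For each probability the test-taker used, the fraction of
    statements given that probability that turned out to be true.
    A well-calibrated person's 80% answers are true about 80% of the time."""
    buckets = defaultdict(list)
    for p, o in zip(estimates, outcomes):
        buckets[p].append(o)
    return {p: sum(v) / len(v) for p, v in sorted(buckets.items())}


# Hypothetical answers to ten statements (illustration only).
estimates = [1.0, 0.5, 0.5, 0.0, 0.8, 0.8, 0.8, 0.8, 0.8, 0.2]
outcomes = [1, 1, 0, 0, 1, 1, 1, 1, 0, 0]

print(brier_score(estimates, outcomes))
print(calibration_table(estimates, outcomes))
```

In this made-up example, the five statements rated 80 percent include four true ones, so the 80 percent answers are well calibrated. An overconfident test-taker would instead show, say, 100 percent answers that are true only 70 percent of the time.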

When you have estimated the likelihood of all fifty statements in the test, the website will calculate your risk intelligence quotient, or RQ, a number between 0 and 100. Although our small sample size of 351 participants does not permit strong conclusions, the results do suggest an interesting possibility: users of monopoly software (that historically has been responsible for many of the most severe software vulnerabilities) are not as good at estimating probabilities as their more adventurous counterparts. Perhaps the use of Microsoft Internet Explorer should be considered an indicator of poor risk intelligence. This would be consistent with studies showing that the computers of Internet Explorer users contain more malicious software than the machines of those using other browsers, that about 7% of downloads by Internet Explorer users are malicious and that the browser is amongst the most popular means of infecting Windows machines (this holds especially true for older versions). Although Microsoft’s efforts are slowly changing vulnerability trends for the better, these findings should come as no surprise given the company’s attention to security in the past: “Many of the products we designed […] have been less secure than they could have been because we were designing with features in mind rather than security. […] In the past we sold new applications on the strength of new features, most of which people didn’t use.” – Chief Research and Strategy Officer at Microsoft, Craig Mundie (2002).

Right now it looks like Apple users are the best when it comes to dealing with risk, a skill that should come in quite handy considering that Mac OS X was the first system to go down during the Pwn2Own hacking contest of 2011. But only time, a larger sample size and careful scrutiny may validate our observations.


The test can be found at: http://www.projectionpoint.com/
A mobile version of the test for Android and iPhones can be found here.

Baseball, sabermetrics and risk intelligence

I’ve just been to see Moneyball, a new film based on the eponymous 2003 book by Michael Lewis. It tells the story of how Billy Beane, the general manager of the Oakland Athletics, led the team to a series of 20 consecutive wins in the 2002 baseball season, an American League record. This feat was apparently due to Beane’s use of sabermetrics.

Sabermetrics is the application of statistical techniques to determining the value of baseball players. The term is derived from the acronym SABR, which stands for the Society for American Baseball Research. It was coined by Bill James, who began developing the approach while doing night shifts as a security guard at the Stokely Van Camp pork and beans cannery in the 1970s.

The drama revolves around the tension between Beane and the team’s scouts, who are first dismissive of, and then hostile towards, his statistical approach. Rather than relying on the scouts’ experience and intuition, Beane selects players based almost exclusively on their on-base percentage (OBP). By finding players with a high OBP but characteristics that lead scouts to dismiss them, Beane assembles a team of undervalued players with far more potential than the Athletics’ poor finances led people to expect.
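For readers unfamiliar with the statistic, on-base percentage is the share of a batter's plate appearances (as counted by the standard formula) in which he reaches base by a hit, a walk, or being hit by a pitch. The short Python sketch below uses that standard formula; the function name and the sample season line are hypothetical and are not taken from the film or the book.

```python
def on_base_percentage(hits, walks, hit_by_pitch, at_bats, sacrifice_flies):
    """Standard OBP: times on base divided by the plate appearances that count.

    OBP = (H + BB + HBP) / (AB + BB + HBP + SF)
    """
    times_on_base = hits + walks + hit_by_pitch
    chances = at_bats + walks + hit_by_pitch + sacrifice_flies
    if chances == 0:
        return 0.0  # no qualifying plate appearances yet
    return times_on_base / chances


# Hypothetical season line (illustration only):
# 150 hits, 70 walks, 5 hit-by-pitch, 500 at-bats, 5 sacrifice flies.
obp = on_base_percentage(150, 70, 5, 500, 5)
print(f"{obp:.3f}")
```

Note that walks count just as much as hits here, which is one reason a player can post a high OBP while looking unimpressive by more traditional measures.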

There’s something very satisfying about seeing the scouts’ boastful claims about their expertise being undermined by newcomers with a more evidence-based approach. The same thing is occurring in other fields too, such as wine-tasting. In the 1980s, the economist Orley Ashenfelter found that he could predict the price of Bordeaux wine vintages with a model containing just three variables: the average temperature over the growing season, the amount of rain during harvest-time, and the amount of winter rain. This did not go down well with the professional wine tasters who made a fine living by trading on their expert opinions. All of a sudden, Ashenfelter’s equation threatened to make them obsolete, just as sabermetrics did with the old-fashioned scouts.

It would be wrong to conclude, however, that we can do away with intuition altogether. For one thing, you need lots of data and time to build reliable statistical models, and in the absence of these resources you have to fall back on intuition. If you have low risk intelligence, you’ll be screwed.

Secondly, risk intelligence is required even when sophisticated models and supercrunching computers are in plentiful supply. An overreliance on computer models can drown out serious thinking about the big questions, such as why the financial system nearly collapsed in 2007–2008 and how a repeat can be avoided. According to the economist Robert Shiller, the accumulation of huge data sets in the 1990s led economists to believe that “finance had become scientific.” Conventional ideas about investing and financial markets—and about their vulnerabilities—seemed out of date to the new empiricists, says Shiller, who worries that academic departments are “creating idiot savants, who get a sense of authority from work that contains lots of data.” To have seen the financial crisis coming, he argues, it would have been better to “go back to old-fashioned readings of history, studying institutions and laws. We should have talked to grandpa.”

Risk management is a complex process that requires both technical solutions and human skill.  Mathematical models and computer algorithms are vital, but such technical solutions can be useless or even dangerous in the hands of those with low risk intelligence.

Scenario planning and probabilities

First, a caveat: I don’t know much about scenario planning, so the following comments may come across as rather simplistic to those well versed in this area. Also, it is probably rather presumptuous to be so critical of something I know so little about. So consider this post as an opening gambit rather than a considered conclusion.

I recently exchanged a few emails with a guy who does scenario planning for a non-profit organization.  When I asked him if he got people to attach numerical probabilities to each scenario, he replied: “We don’t do probabilities, but instead run workshops and interviews to get a sense of where people’s mental models are in terms of how things might turn out.” The problem with this is that weasel word, “might,” which could mean anything from “extremely unlikely” to “almost certain”.

For example, suppose the folks at the Pentagon are mapping out possible scenarios that might follow a US invasion of Syria, such as:

  • Invasion is successful with minimal human and financial cost, Syrians welcome the troops and quickly set up a prosperous, democratic and liberal society that becomes a strong US ally and force for positive change in the Islamic world.
  • Invasion is a complete disaster with massive cost and casualties, resulting in a devastated Syria split into violent fiefdoms, including one run by Assad and another by Al Qaeda. The US is humiliated both militarily and by revelation of major scandals and atrocities. Many US troops are prisoners.

Plus various intermediate possibilities.

So far, so good.  The precise details in each scenario are not that important, since the scenarios are really just placeholders for a set of outcomes arranged in order of preference.  The real problems begin when we go from scenarios to decisions. For unless we have some idea of how likely each scenario is, it will be impossible to assess the expected utility of various mitigating strategies.  There may be no point in spending billions of dollars to avert a worst-case scenario if the probability of that scenario occurring is very low.
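A minimal Python sketch of why the probabilities matter, assuming a deliberately simplified setup: a single worst-case scenario, a mitigation with a fixed price, and a loss that the mitigation reduces. All names and figures are hypothetical; costs are in arbitrary units (say, billions of dollars).

```python
def expected_loss(p_worst_case, loss_if_worst_case):
    """Expected loss when the worst case has the given probability
    and every other scenario is treated as costing nothing."""
    return p_worst_case * loss_if_worst_case


def mitigation_is_worth_it(p_worst_case, loss, mitigated_loss, mitigation_price):
    """Compare expected cost with and without paying for the mitigation."""
    without = expected_loss(p_worst_case, loss)
    with_mitigation = mitigation_price + expected_loss(p_worst_case, mitigated_loss)
    return with_mitigation < without


# Hypothetical figures: the worst case costs 100, the mitigation costs 5
# up front and cuts the worst-case loss to 20.
for p in (0.02, 0.30):
    decision = mitigation_is_worth_it(p, loss=100, mitigated_loss=20, mitigation_price=5)
    print(f"P(worst case) = {p:.2f}: mitigate? {decision}")
```

With these made-up numbers, the same mitigation is a waste of money if the worst case has a 2 percent chance and a bargain if it has a 30 percent chance. A scenario labelled only as something that "might" happen gives no way to tell the two situations apart.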

The Intergovernmental Panel on Climate Change (IPCC) attaches numerical probabilities to various scenarios it discusses in its reports. Everyone else who does scenario planning should do the same.

Uncertainty intolerance on the trading floor

When markets are uncertain of the outcome of an important scheduled event, like an election or the release of an economic statistic, expected volatility increases. Therefore, the implied volatility, i.e., the estimate of future variance implied by an option’s price, is a sensitive barometer of the market’s collective uncertainty.

It is interesting to note, therefore, that a study published in Philosophical Transactions of the Royal Society B: Biological Sciences in 2010 found that average cortisol levels in a group of 17 male traders on a midsized trading floor in the City of London correlated strongly with implied volatility. These traders experienced acutely raised cortisol in anticipation of higher volatility, which implies that they found uncertainty highly stressful. Psychologists would say that they have high levels of uncertainty intolerance.

I have a strong hunch that uncertainty intolerance tends to undermine risk intelligence by, for example, leading people to see things too starkly in black-and-white terms, and to react to ambiguity with feelings of uneasiness, discomfort, dislike, anger, and anxiety that intrude on rational assessment. Getting a fix on your own degree of uncertainty tolerance is, therefore, an important step in improving your risk intelligence. It’s worrying, then, that in the questionnaires they filled in, the traders in this study displayed no awareness of the rampant stress indicated by their cortisol measurements. Not only were they freaked out by uncertainty, but they didn’t even know they were freaked out.