---
title: The Failure of Risk Management
tags:
  - authorship/other
  - exclude-from-word-count
  - topic/risk
  - type/media/book
author: Douglas W. Hubbard
edition: Second
publisher: John Wiley & Sons
subtitle: Why It's Broken and How to Fix It
type: book
year: 2020
---
# The Failure of Risk Management

%%
This note, with the exception of comments like this one
(reserved for notes on transcription)
consists only of content from the text.
For commentary see the companion
[[the-failure-of-risk-management]].
%%

## Part One: An Introduction To The Crisis

### Chapter 1: Healthy Skepticism For Risk Management

#### A "Common Mode Failure"

#### Key Definitions: Risk Management And Some Related Terms

#### What Failure Means

#### Scope And Objectives Of This Book

#### Notes

### Chapter 2: A Summary Of The Current State Of Risk Management

#### A Short And Entirely-Too-Superficial History Of Risk

#### Current State Of Risk Management In The Organization

#### Current Risks And How They Are Assessed

#### Notes

### Chapter 3: How Do We Know What Works?

#### Anecdote: The Risk Of Outsourcing Drug Manufacturing

#### Why It's Hard To Know What Works

#### An Assessment Of Self-Assessments

#### Potential Objective Evaluations Of Risk Management

#### What We May Find

#### Notes

### Chapter 4: Getting Started: A Simple Straw Man Quantitative Model

#### A Simple One-For-One Substitution

#### The Expert As The Instrument

#### A Quick Overview Of "Uncertainty Math"

#### Establishing Risk Tolerance

#### Supporting The Decision: A Return On Mitigation

#### Making The Straw Man Better

#### Note

## Part Two: Why It's Broken

### Chapter 5: The "Four Horsemen" Of Risk Management: Some (Mostly) Sincere Attempts To Prevent An Apocalypse

#### Actuaries

#### War Quants: How World War II Changed Risk Analysis Forever

#### Economists

#### Management Consulting: How A Power Tie And A Good Pitch Changed Risk Management

#### Comparing The Horsemen

#### Major Risk Management Problems To Be Addressed

#### Notes

### Chapter 6: An Ivory Tower Of Babel: Fixing The Confusion About Risk

#### The Frank Knight Definition

#### Knight's Influence In Finance And Project Management

#### A Construction Engineering Definition

#### Risk As Expected Loss

#### Defining Risk Tolerance

#### Defining Probability

#### Enriching The Lexicon

#### Notes

### Chapter 7: The Limits Of Expert Knowledge: Why We Don't Know What We Think We Know About Uncertainty

#### The Right Stuff: How A Group Of Psychologists Might Save Risk Analysis

#### Mental Math: Why We Shouldn't Trust The Numbers In Our Heads

#### "Catastrophic" Overconfidence

#### The Mind Of "Aces": Possible Causes And Consequences Of Overconfidence

Unless managers take steps to offset overconfidence
in assessments of probabilities,
they will consistently underestimate various risks
(i.e., they will be more confident than they should be
that some disaster won't occur).
This may have had some bearing on very-high-profile disasters,
such as those of the Space Shuttle Orbiters *Challenger* and *Columbia*.

The Nobel Prize-winning physicist, Richard Feynman,
was asked to participate in the investigation of the first Space Shuttle accident
(involving *Challenger*).

What he found were some risk assessments that seemed at first glance to be obviously optimistic.
He noted the following in the *Rogers Commission Report on the Space Shuttle* Challenger *Accident*:

> It appears that there are enormous differences of opinion
> as to the probability of a failure with loss of vehicle and of human life.
> The estimates range from roughly 1 in 100 to 1 in 100,000.
> The higher figures \[1 in 100\] come from the working engineers
> and the very low figures \[1 in 100,000\] from management.
> What are the causes and consequences of this lack of agreement?
> Since 1 part in 100,000 would imply
> that one could put a Shuttle up each day for 300 years
> expecting to lose only one,
> we could properly ask
> "What is the cause of management's fantastic faith in the machinery?"[^7-10]

Feynman believed that if management decisions to launch
were based on such an extraordinary confidence in the Shuttle,
then these decisions were flawed.
As was Feynman's frequent practice,
he applied simple tests and reality checks
that would cast doubt on these claims.
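The reality check in the quotation can be reproduced with a few lines of arithmetic. The sketch below is only an illustration: the function name is hypothetical, and it assumes one launch per day and a constant, independent chance of loss on every flight.

```python
# Feynman-style reality check: how long would daily launches go on
# before the expected number of lost vehicles reaches one?
DAYS_PER_YEAR = 365


def years_per_expected_loss(p_loss_per_flight: float, flights_per_day: float = 1.0) -> float:
    """Years of flying at which the expected number of losses equals one.

    Assumes a constant, independent loss probability per flight.
    """
    flights_needed = 1.0 / p_loss_per_flight
    return flights_needed / (flights_per_day * DAYS_PER_YEAR)


management_years = years_per_expected_loss(1 / 100_000)
engineer_days = years_per_expected_loss(1 / 100) * DAYS_PER_YEAR

print(f"Management estimate (1 in 100,000): about {management_years:.0f} years of daily launches")
print(f"Engineer estimate (1 in 100): about {engineer_days:.0f} days of daily launches")
```

The management figure works out to centuries of daily flights per expected loss, which is the order of magnitude Feynman rounds to 300 years.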

Perhaps an obvious explanation is the conflict of interest.
Are managers really incentivized to be honest with themselves and others
about these risks?
No doubt, that is a factor,
just as it was probably a factor in the assessments of risks
taken by bank managers in 2008,
whether or not it was consciously considered.
However, individuals showed overconfidence
even in situations when they had no stake in the outcome (trivia tests, etc.).

JDM research has shown that both the incentives and the amount of effort put into identifying possible surprises will make a difference in overconfidence.[^7-11]
Some of the sources of overconfidence would affect
not only managers who depend on subjective estimates
but also those who believe they are using sound analysis of historical data.
Managers will fail to consider ways in which human errors affect systems and will fail to consider common mode and cascade system failures.[^7-12]

There may also be a tendency to relax our concerns
for infrequent but catastrophic events
when some time passes without experiencing the event.
Robin Dillon-Merrill,
a decision and risk analysis professor at Georgetown University,
noticed this tendency when she was studying
the risk perceptions of NASA engineers
prior to the Space Shuttle *Columbia* accident.
*The* Columbia *Accident Investigation Report* noted the following:

> The shedding of External Tank foam---
> the physical cause of the Columbia accident---had a long history.
> Damage caused by debris has occurred on every Space Shuttle flight,
> and most missions have had insulating foam shed during ascent.
> This raises an obvious question:
> Why did NASA continue flying the Shuttle
> with a known problem that violated design requirements?[^7-13]

Dillon-Merrill considers each time
that foam fell off the external tank of the Shuttle,
but where the Shuttle still had a successful mission,
to be a "near miss."
Her proposal was that near misses are an opportunity to learn
that is rarely exploited.
She interviewed NASA staff and contractors
about how they judged near misses
and found two very interesting phenomena
that in my opinion have important implications
for risk management in general.

Perhaps not surprisingly, she found that near misses and successes
were both judged much more favorably than failures.
But were these near-miss events being rated more like a failure
than a mission success?
Did engineers take each near miss as a red flag warning
about an impending problem?
Incredibly, just the opposite occurred.
The study included an experiment
where NASA staff and students
were asked to choose among various options
for hypothetical unmanned space missions.
The options included decisions like whether to skip a test
due to schedule constraints.
Some subjects were given near-miss data and some were not.
The study found that people with the near-miss information were *more* likely to choose the riskier alternative.[^7-14]

> *The near miss interpretation paradox*:
> People with near-miss information
> were *more* likely to make the riskier choice
> than people who did not have information about near misses.

It is possible that managers were looking at each near miss
and thinking that because nothing had happened yet,
perhaps the system was more robust than they thought.
Or it might be more subtle than that.
Dillon-Merrill found that when people have a known exposure
to some relatively unlikely risk,
their tolerance for that risk seems to increase
even though they may not be changing their estimate
of the probability of the risk.

Imagine that you are in an area exposed to hurricane risks.
Authorities confirm that there is a 3 percent chance of injury or death
each time you do not evacuate when ordered to for a hurricane warning.
If you happen to make it through two or three hurricanes without harm,
you will become more tolerant of that risk.

Note that you are not actually changing your estimate
of the probability of the harm
(that was provided by authorities);
you are simply becoming more numb to the risk as it is.
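A short calculation shows why coming through two or three hurricanes unharmed says very little about the 3 percent figure. The sketch below assumes each ignored evacuation order is an independent event with the same chance of harm; that independence is an assumption made for illustration, not something stated in the text.

```python
P_HARM = 0.03  # per-hurricane chance of injury or death, as given in the example


def p_unharmed(n_events: int, p_harm: float = P_HARM) -> float:
    """Probability of coming through n independent events with no harm."""
    return (1.0 - p_harm) ** n_events


for n in (1, 2, 3, 10, 25):
    survived = p_unharmed(n)
    print(f"{n:>2} orders ignored: P(no harm) = {survived:.3f}, "
          f"P(harm at least once) = {1 - survived:.3f}")
```

Under these assumptions, surviving three hurricanes is by far the most likely outcome even when the stated risk is exactly right, so the experience is weak evidence that the risk is lower. The cumulative chance of harm still keeps growing with each additional exposure.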

Now imagine the implications of this for Wall Street.
If they have a few good years,
everybody will start to become more "risk tolerant"
even if they are not changing their underlying forecasts
about the probabilities of a financial crisis.
Now that the mortgage uncertainty has settled for a decade or so,
will all managers, again, start to become more tolerant of risks?

There are other effects to consider
when examining the psyche of upper-level decision-makers.
Part of overestimating past performance
is due to the tendency to underestimate how much we learned
in the last big surprise.
This is what Slovic and Fischhoff called the *I-knew-it-all-along* phenomenon.
People will exaggerate how "inevitable" the event would have appeared
before the event occurred.
(News pundits talking about the mortgage crisis
certainly make it sound as if it were inevitable,
but where were they before the crisis occurred?)

They even remember their previous predictions in such a way that they,
as Slovic put it, "exaggerate in hindsight what they knew in foresight."
I hear the I-saw-that-coming claim so often that, if the claims were true,
there would be virtually no surprises anywhere in the world.
Two lines of dialog in the movie *Wall Street*
revealed Oliver Stone's grasp of this phenomenon.
After "Bud" (Charlie Sheen's character) had his initial big successes as a broker,
his boss said, "The minute I laid eyes on you, I knew you had what it took."
Later, when Bud was being arrested in the office
for the crimes he committed to get those early successes,
the same boss said, "The minute I laid eyes on you, I knew you were no good."
Kahneman sums it up:

> When they have made a decision,
> people don't even keep track of having made the decision or forecast.
> I mean, the thing that is absolutely the most striking
> is how seldom people change their minds.
> First, we're not aware of changing our minds
> even when we do change our minds.
> And most people, after they change their minds,
> reconstruct their past opinion---
> they believe they *always* thought that.[^7-15]

There is one other item about overconfidence
that might be more unique to upper management
or particularly successful traders.
Some managers can point to an impressive track record of successes
as evidence that a high level of confidence on virtually all matters
is entirely justified on their part.
Surely, if a portfolio manager can claim
she had above-average market returns for five years,
she must have some particularly useful insight into the market.
An IT security manager who has presided over a virus-free,
hacker-free environment much longer than his peers in other companies
must have great skill, right?

Actually, luck can have more to do with success
than we might be inclined to think.
For example, a statistical analysis of World War I aces
showed that Baron von Richthofen (aka The Red Baron)
might have been lucky but not necessarily skilled.[^7-16]
Two electrical engineering professors,
Mikhail Simkin and Vwani Roychowdhury
of the University of California at Los Angeles,
examined the victories and losses
for the 2,894 fighter pilots who flew for Germany.
Together, they tallied 6,759 victories and 810 defeats.
This is perhaps a suspiciously high win ratio
but these numbers include shooting down unarmed scout and delivery planes.
The Germans also had a technological advantage in the air during WWI.
Furthermore, not all kills could be confirmed
and the inflation of these numbers is certainly possible---
but there is no reason to assume
that the Baron was less prone to exaggeration than others.
Simkin and Roychowdhury showed that,
given the number of pilots and the win ratio,
there was about a 30 percent chance that, by luck alone,
one pilot would have gotten eighty kills,
the number Manfred von Richthofen is credited with.

This might describe a large number of "successful" executives
who write popular books on the special insight they brought to the table,
but who then sometimes find they are unable to repeat their success.
Given the large number of candidates
who spend their careers competing
for a small number of upper-management positions,
it is likely that some will have a string of successes just by chance alone.
No doubt, some of these will be more likely
to hold upper-management positions.
In the same manner,
some will also have a string of successes in a coin-flipping tournament
in which there are a large number of initial players.
But we know that the winners of this kind of contest
are not just better coin-flippers.
Sure, there is probably some skill in reaching upper management.
But how much of it was more like winning a coin-flipping contest?
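The coin-flipping comparison can be made concrete. The sketch below is not the model Simkin and Roychowdhury used for the aces; it is a simpler illustration with hypothetical player counts and streak lengths, assuming every player has the same chance of winning each independent trial.

```python
def p_any_streak(n_players: int, streak_len: int, p_win: float = 0.5) -> float:
    """Chance that at least one of n_players wins streak_len independent trials in a row."""
    p_one_player = p_win ** streak_len
    return 1.0 - (1.0 - p_one_player) ** n_players


def expected_streakers(n_players: int, streak_len: int, p_win: float = 0.5) -> float:
    """Expected number of players who win all streak_len trials by luck alone."""
    return n_players * p_win ** streak_len


STREAK = 10  # hypothetical: ten "good calls" in a row
for n_players in (100, 1_000, 10_000):
    print(f"{n_players:>6} players: "
          f"P(someone wins {STREAK} straight) = {p_any_streak(n_players, STREAK):.3f}, "
          f"expected such players = {expected_streakers(n_players, STREAK):.2f}")
```

With a large enough field, a perfect record becomes likely for someone even when no player has any skill, which is the sense in which a track record alone cannot separate skill from luck.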

#### Inconsistencies And Artifacts: What Shouldn't Matter Does

#### Answers To Calibration Tests

#### Notes

[^7-1]: D. Kahneman and G. Klein, "Conditions for Intuitive Expertise: A Failure to Disagree," *American Psychologist* 64, no. 6 (2009): 515--26.

[^7-2]: A. H Murphy and R. L Winkler, "Can Weather Forecasters Formulate Reliable Probability Forecasts of Precipitation and Temperature?," *National Weather Digest* 2 (1977): 2--9.

[^7-3]: D. Kahneman and A. Tversky, "Subjective Probability: A Judgment of Representativeness," *Cognitive Psychology* 3 (1972): 430--54.

[^7-4]: G. S Tune, "Response Preferences: A Review of Some Relevant Literature," *Psychological Bulletin* 61 (1964): 286--302.

[^7-5]: W. Feller, *An Introduction to Probability Theory and Its Applications* (New York: Wiley, 1968), 160.

[^7-6]: E. Johnson, "Framing, Probability Distortions and Insurance Decisions," *Journal of Risk and Uncertainty* 7 (1993): 35.

[^7-7]: D. Kahneman and A. Tversky, "Subjective Probability: A Judgment of Representativeness," *Cognitive Psychology* 3 (1972): 430--54.

[^7-8]: D. Kahneman and A. Tversky, "On the Psychology of Prediction," *Psychological Review* 80 (1973): 237--51.

[^7-9]: A. Tversky and D. Kahneman, "The Belief in the 'Law of Small Numbers,'" *Psychological Bulletin* 76 (1971): 105--10.

[^7-10]: R. Feynman, "Personal Observations on the Reliability of the Shuttle," Appendix IIF. In William Rogers et al., *Space Shuttle Accident Report* (Washington, DC: US GPO, 1986).

[^7-11]: A. Koriat, S. Lichtenstein, and B. Fischhoff, "Reasons for Confidence," *Journal of Experimental Psychology: Human Learning and Memory* 6 (1980): 107--18.

[^7-12]: P. Slovic, B. Fischhoff, S. Lichtenstein, *Societal Risk Assessment: How Safe Is Safe Enough?* (New York: Plenum Press, 1980).

[^7-13]: *The* Columbia *Accident Investigation Board Report*, Vol. I (Washington, DC: US GPO, 2003), 121.

[^7-14]: R. Dillon and C. Tinsley, "How Near-Misses Influence Decision Making under Risk: A Missed Opportunity for Learning," *Management Science* 54, no. 8 (January 2008): 1425--40.

[^7-15]: M. Schrage, "Daniel Kahneman: The Leader Interview," *strategy + business*, <https://www.strategy-business.com/article/03409?gko=7a903> (31.12.2003).

[^7-16]: M. Simkin and V. Roychowdhury, "Theory of Aces: Fame by Chance or Merit?," *Journal of Mathematical Sociology* 30, no. 1 (2006): 33--42.

[^7-17]: E. Brunswik, "Representative Design and Probabilistic Theory in a Functional Psychology," *Psychological Review* 62 (1955): 193--217.

[^7-18]: C. M Kuhnen and B. Knutson, "The Neural Basis of Financial Risk Taking," *Neuron* 47 (2005): 763--70.

[^7-19]: P. Aldhous, "Cheery Traders May Encourage Risk Taking," *New Scientist* (April 7, 2009).

[^7-20]: Paola Sapienza, Luigi Zingales, and Dario Maestripieri, "Gender Differences in Financial Risk Aversion and Career Choices Are Affected by Testosterone," *Proceedings of the National Academy of Sciences of the United States of America* 106, no. 36 (2009).

[^7-21]: J. S Lerner and D. Keltner, "Fear, Anger, and Risk," *Journal of Personality & Social Psychology* 81, no. 1 (2001): 146--59.

### Chapter 8: Worse Than Useless: The Most Popular Risk Assessment Method And Why It Doesn't Work

#### A Few Examples Of Scores And Matrices

#### Does That Come In "Medium"?: Why Ambiguity Does Not Offset Uncertainty

#### Unintended Effects Of Scales: What You Don't Know Can Hurt You

#### Different But Similar-Sounding Methods And Similar But Different-Sounding Methods

#### Notes

### Chapter 9: Bears, Swans And Other Obstacles To Improved Risk Management

#### Algorithm Aversion And A Key Fallacy

#### Algorithms Versus Experts: Generalizing The Findings

#### A Note About Black Swans

The _exsupero ursus_ fallacy is reinforced by authors of very popular books
who seem to depend heavily on some version of the fallacy.
One such author is former Wall Street trader and mathematician Nassim Taleb.
He wrote _The Black Swan_
and other books critical of common practice in risk management,
especially in (but not limited to) the financial world,
as well as the nonquantitative hubris of Wall Street.

A heretic of financial convention,
he argues that Nobel Prize-winning modern portfolio theory and options theory
(briefly mentioned in chapter 5)
are fundamentally flawed and are in fact no better than astrology.
In fact, Taleb considers this prize itself to be an intellectual fraud.
After all, as he rightly points out,
it was not established in the will of Alfred Nobel,
but by the Royal Bank of Sweden seventy-five years after Nobel's death.
He even claims that once, in a public forum,
he riled up one such prizewinner to the point of red-faced, fist-pounding anger.

Taleb bases a lot of his thesis on the fact that the impact of chance
is unappreciated by almost everyone.
He sees the most significant events in history as being completely unforeseeable.
He calls these events _black swans_ in reference to an old European expression
that went something like "That's about as likely as finding a black swan."
The expression was based on the fact that no European
had ever seen a swan that was black---until Europeans traveled to Australia.
Until the first black swans were sighted, black swans were a metaphor for impossibility.
Taleb puts September 11, 2001, stock market crashes, major scientific discoveries,
and the rise of Google in his set of black swans.
Each event, he argues, was not only unforeseen
but _utterly unforeseeable_ based on our previous experience.
People will routinely confuse luck with competence
and they will presume that the lack of seeing an unusual event to date
is somehow proof that the event cannot occur.

Managers, traders, and the media seem to be especially susceptible to these errors.
Out of a large number of managers,
some managers will have made several good choices in a row by chance alone.
This is what I called the Red Baron effect in a previous chapter.
Such managers will see their past success
as an indicator of competence and, unfortunately,
will act with high confidence on equally erroneous thinking in the future.
Taleb recognizes the problems of overconfidence researched by Kahneman and others.
Indeed, Taleb says Kahneman is the only Economics Nobel Prize winner he respects.

I think part of Taleb's skepticism is refreshing and on point.
I agree with many of Taleb's observations on the misplaced faith in some models
and will discuss this further in the next chapter.
I might even include Taleb as one source of inspiration
for identifying new categories of fallacies
(and giving it a Latin name in order to sound official).
Taleb coined a fallacy he refers to as the _ludic fallacy_,
derived from the Latin word for "games of chance."
Taleb defines the ludic fallacy as the assumption that the real world
necessarily follows the same rules as well-defined games of chance.

Now, here is where Taleb errs.
He doesn't just argue that risk management is flawed.
He argues that risk management itself is impossible
and that all we can do is make ourselves _antifragile_.
I think he is just using a very different definition of risk management---
which even he uses inconsistently.
No matter what he calls it, he is promoting a particular set of (vaguely defined) methods
that have the objective of reducing risk.
This reduction in risk will require resources.
Using the definition I propose in [[#Chapter 6 An Ivory Tower Of Babel Fixing The Confusion About Risk|chapter 6]],
determining how to use resources to reduce risk is part of risk management.
He actually contradicts himself on this point
when he promotes redundancy as a method of becoming antifragile
and refers to it as the "central risk management property of natural systems."
So, yes, we are both talking about risk management.
He focuses on particular approaches to it, but it is risk management just the same.

Confusion and inconsistency about whether managing fragility is, in practice,
part of managing risks is not the only problem in his thesis.
Taleb commits every form of the _exsupero ursus_ fallacy
throughout most of what he writes.

Specifically,
(1) he presumes the lack of perfection of one model
automatically necessitates use of the other regardless of relative performance,
(2) he commits the anecdotal fallacy
when looking for evidence of relative performance, and
(3) he presumes that a given model was even being used
when he identifies it as the culprit in major risk events.

In an interview for _Fortune_ Taleb claimed,
"No model is better than a faulty model."
Again, having no model is never an option.
One way or another, a model is being used.
Taleb's model is simply his common sense,
which is, as Albert Einstein defines it,
"merely the deposit of prejudice laid down in the human mind before the age of eighteen."
As with every other model, common sense has its own special errors.

We've seen the research that shows overwhelming evidence
of the flaws of unaided intuition compared to even simple statistical models,
and Taleb offers no empirical data to the contrary.
Taleb does briefly mention the work of Meehl but dismisses it.
Without making any mention of the huge numbers
of conclusive results by Meehl and his colleagues,
Taleb claims the entire body of research is invalid
by claiming "that these researchers did not have a clear idea
of where the burden of empirical evidence lies"
and goes on to suggest that they lacked "rigorous empiricism."
He offers no details about how more than one hundred peer-reviewed,
published studies by several researchers veer from the required rigorous empiricism.

Kahneman, who actually is a psychologist like Meehl,
would apparently disagree with Taleb on Meehl's methods.
Taleb considers Kahneman a significant influence on his work,
but who does Kahneman consider to be a significant influence on his work?
<u>Meehl</u>.
I wouldn't presume to speak for Kahneman
but I wonder if he might point out to Taleb
how the burden of proof was accepted and met overwhelmingly by Meehl,
whereas Taleb's evidence merely amounts to, at best,
selected anecdotes of shortcomings or entirely imagined straw man arguments.
Taleb even sometimes cites the work of Phil Tetlock
to support some other point he makes
but never references Tetlock's enormous twenty-year study
where he concluded that it was "impossible"
to find a domain where humans clearly outperformed algorithms.

Instead of relying on large controlled studies,
Taleb commits the error of arguing that single events
effectively disprove a probabilistic model.
He uses the apparent unforeseeability of specific events
as evidence of a flaw in risk analysis.
The implication is that if quantitative analysis worked,
then we could make exact predictions of specific and extraordinary events
such as 9/11 or the rise of Google.
When arguing against the use of various statistical models in economics
he states that "the simple argument that Black Swans and tail events
run the socioeconomic world---and these events cannot be predicted---
is sufficient to invalidate their statistics."[^09-12]
Yes, the rare events---black swans---
are individually impossible to predict precisely.
But unless he can show that his alternative model (apparently his intuition)
would also have predicted such events exactly,
then he commits _exsupero ursus_ when he says imperfection alone
is sufficient to prefer intuition over statistics.

In addition to Kahneman,
it is worth pointing out others whose work Taleb cites to make a point but who,
if you actually looked at what they are doing, would contradict Taleb.
For example, Taleb says he admires the mathematician Edward Thorp,
who developed a mathematically sound basis
for card counting in blackjack in the 1960s.
Now, if the objective of card counting was to predict every hand,
even the most extraordinarily rare combinations as Taleb would seem to require,
then Ed Thorp's method certainly fails.
But Ed Thorp's method works---that's why the casinos quit letting him play---
because his system resulted in better bets on average
after a large number of hands.
Taleb is also a fan of the mathematician Benoit Mandelbrot,
who used the mathematics of _fractals_ to model financial markets.
Similar to Thorp and Taleb,
Mandelbrot was equally unable to predict specific extraordinary events exactly,
but his models are preferred by some
because they seem to generate more realistic patterns
that look like they _could_ be from real data.

If anecdotal evidence were sufficient to compare model performance,
one could simply point out that Taleb's investment firm, Empirica Capital LLC,
closed in 2004 after several years of mediocre returns.[^09-13]
He had one very good year in 2000 (a 60 percent return)
because while everyone else was betting on dot-com, he bet on _dot-bomb_.
But the returns the following years were far enough below the market average
that the good times couldn't outweigh the bad for his fund.

Similar to the news pundits rejecting Nate Silver's findings
or the sportscasters rejecting the methods used by the Oakland A's,
Taleb merely shows that it is possible to find an error in a model
if one looks hard enough.
Again, the question is not whether to model (intuition is a model, too)
or whether one model is imperfect (both models are imperfect)
but which measurably outperforms the other
and does so in many trials not just single anecdotes.

Finally, Taleb makes the error of presuming what methods
were actually being used when he blames them for an event.
He argues, for example,
that the downfall of Long-Term Capital Management (LTCM)
disproves options theory.
Recall that options theory won the Nobel Prize
for Robert Merton and Myron Scholes,
both of whom were on the board of directors for LTCM.
The theory was presumably the basis of the trading strategy of the firm.
But an analysis of the failure of LTCM shows that a big reason for its downfall
was the excessive use of leverage in trades---
an issue that isn't even part of options theory.
That appeared to be based on intuition.

Taleb also states that the crash of 1987 disproved modern portfolio theory (MPT),
which would seem to presume
that at least some significant proportion of fund managers used the method.
I find fund managers to be tight-lipped about their specific methods,
but one fund manager did tell me how
"learning the theory is important as a foundation
but 'real-world' decisions have to be based on practical experience, too."
In fact, I found no fund managers who didn't rely partly, if not mostly, on intuition.
Finally, if we are looking for explanations of the mortgage crisis,
neither MPT nor options theory had anything to do with the practice
of giving out mortgages to large numbers of people
lacking the ability to pay them.
That was more of a function of a system
that incentivized banks to give risky loans without actually accepting the risk.

Finally, Taleb seems to make a variety of other points
that, similar to the previous points, seem so inconsistent
he ends up undermining the point he makes.
For example, explaining the outcomes
in terms of the narrative fallacy committed by others
is sometimes itself a narrative fallacy.
Arguing that "experts" don't know so much
is not supported by quoting other experts.
He argues that rare events defy quantitative models,
but then gives specific examples
of computing rare events with quantitative models
(he shows the odds of getting the same result in a coin flip many times in a row
and argues the benefits of Mandelbrot's mathematical models
in the analysis of market fluctuations).

Taleb criticizes the use of historical data in forecasts
but apparently sees no irony in his argument.
He looks at several examples in which history was a poor predictor.
In other words, he is assessing the validity of using historical examples
by using _historical examples_.
What Taleb and others prove with such examples
is merely that what I will call a _naive_ historical analysis can be very misleading.
Taleb demonstrates his point by using the example of a turkey.
The turkey had a great life right up until Thanksgiving.
So, for that turkey, history was a poor indicator.
So how is Taleb able to see this problem?
He simply looks at the larger history of turkeys.

All he is doing is using what we may call a _history of histories_,
or _meta-historical analysis_, to show how wrong naive historical analysis can be.
The error in historical analysis in a stock price, for example,
is to look only at the history of _that_ stock and only for recent history.
If we look at all historical analysis for a very long period of time,
we find how often naive historical analysis can be wrong.

Taleb's own "experience," as extensive as it might be (at least in finance),
is also just a historical analysis---just a very informal type
with lots of errors in both recall and analysis,
as shown in [[#Chapter 7 The Limits Of Expert Knowledge Why We Don't Know What We Think We Know About Uncertainty|chapter 7]].
No thinking person can ever honestly claim
to have formed any idea totally independent of previous observations.
It just doesn't happen.

Even Taleb's ludic fallacy seems to be a fallacy itself.
Sam Savage calls it the "ludic fallacy-fallacy."
As Savage describes it, we cannot rationally address real-world problems of uncertainty
"_without_ first understanding the simple arithmetic of dice, cards, and spinners."
Of course, Taleb is right when he says we shouldn't _assume_
that we have defined any problem perfectly.
That certainly would be an error, and if that were Taleb's point, that would be valid.
But, again, whether a particular model is perfect is not the right question.
The most relevant question is whether a probabilistic model---
even a simple one---
outperforms the alternative model, such as intuition.

#### Major Mathematical Misconceptions

#### We're Special: The Belief That Risk Analysis Might Work, But Not Here

#### Notes

### Chapter 10: Where Even The Quants Go Wrong: Common And Fundamental Errors In Quantitative Models

#### A Survey Of Analysts Using Monte Carlos

#### The Risk Paradox

#### Financial Models And The Shape Of Disaster: Why Normal Isn't So Normal

#### Following Your Inner Cow: The Problem With Correlations

#### The Measurement Inversion

Consider a decision analysis model for a new product.
You have uncertainties about the cost and duration of development, materials costs once production starts, demand in different markets, and so on.
This would be just like a typical cost-benefit analysis with a cash flow but instead of exact numbers, we use probability distributions to represent our uncertainty.
We can even include the probability of a development project failure (no viable product was developed and the project was cancelled) or even more disastrous scenarios such as a major product recall.
Any of these variables could be measured further with some cost and effort.
So, which one would you measure first and how much would you be willing to spend?
For years, I've been computing the value of additional information on every uncertain variable in a model.

Suppose we ran ten thousand scenarios in a simulation and determined that 1,500 of these scenarios resulted in a net loss.
If we decide to go ahead with this product development, and we get one of these undesirable scenarios, the amount of money we would lose is the _opportunity loss (OL)_---the cost of making the wrong choice.
If we didn't lose money, then the OL was zero.
We can also have an OL if we decide not to approve the product but then find out we _could_ have made money.
In the case of rejecting the product, the OL is the money we would have made if the product had turned a profit---zero if the product would not have made money (in which case we were right to reject the idea).

The _expected opportunity loss (EOL)_ is each possible opportunity loss times the chance of that loss---in other words, the chance of being wrong times the cost of being wrong.
In our Monte Carlo simulation, we simply average the OL for all of the scenarios.
For now, let's say that given the current level of uncertainty about this product, you still think approving it is a good idea.
So we average the 1,500 scenarios where the OL was positive (we lost money) and the 8,500 scenarios where the OL was zero (we made the right choice).
Suppose we find that the EOL is about $600,000.
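
The averaging step can be sketched in a few lines of Python.
This is a minimal illustration, not the book's spreadsheet: the function names are made up here, and the scenario values (every losing scenario loses \$4 million, every winning scenario makes \$2 million) are hypothetical numbers chosen only so that the result matches the \$600,000 figure above.
A real simulation would produce a different net profit in each scenario.

```python
def opportunity_loss_if_approved(net_profit):
    """OL when we approve: the money lost, or zero if the product made money."""
    return max(0.0, -net_profit)


def opportunity_loss_if_rejected(net_profit):
    """OL when we reject: the profit we passed up, or zero if it would have lost money."""
    return max(0.0, net_profit)


def expected_opportunity_loss(scenarios, approve=True):
    """Average the opportunity loss over all simulated scenarios."""
    loss = opportunity_loss_if_approved if approve else opportunity_loss_if_rejected
    return sum(loss(profit) for profit in scenarios) / len(scenarios)


# Hypothetical simulation output: 1,500 losing scenarios and 8,500 winning ones.
scenarios = [-4_000_000] * 1_500 + [2_000_000] * 8_500

eol_approve = expected_opportunity_loss(scenarios, approve=True)
eol_reject = expected_opportunity_loss(scenarios, approve=False)
print(f"EOL if approved: {eol_approve:,.0f}")  # 600,000
print(f"EOL if rejected: {eol_reject:,.0f}")   # 1,700,000
```

With these made-up numbers, approving has the lower EOL, which is consistent with approval being the default choice in the example.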

The EOL is equivalent to another term
called the _expected value of perfect information (EVPI)_.
The EVPI is the most you would reasonably be willing to pay if you could eliminate all uncertainty about this decision.
Although it is almost impossible to ever get perfect information and eliminate all uncertainty, this value is useful as an absolute upper bound.
If we can reduce the $600,000 EOL by half with a market survey that would cost $18,000, then the survey is probably a good deal.
If you want to see a spreadsheet calculation of this type of problem, go to this book's website at [www.howtomeasureanything.com/riskmanagement](http://www.howtomeasureanything.com/riskmanagement).
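
The survey comparison can be written out as rough arithmetic.
This sketch assumes that "reduce the EOL by half" means the survey is expected to remove about half of the \$600,000; the net benefit line is an illustration and not a figure from the text.

$$
\begin{aligned}
\text{EVPI} &= \text{EOL} = \$600{,}000 \\
\text{Value of the survey} &\approx 0.5 \times \$600{,}000 = \$300{,}000 \\
\text{Net benefit of the survey} &\approx \$300{,}000 - \$18{,}000 = \$282{,}000
\end{aligned}
$$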

This becomes more enlightening when we compute the value of information for each variable in a model, especially when the models get very large.
This way we not only get an idea for how much to spend on measurement but also which specific variables we need to measure and how much we might be willing to spend on them.
I have done this calculation for more than 150 quantitative decision models in which most had about fifty to one hundred variables (for a total of about 10,000 variables, conservatively).
From this, I've seen patterns that still persist every time I add more analysis to my library.
The two main findings are:

* Relatively few variables require further measurement---
    but there are almost always _some_.

* The uncertain variables with the highest EVPI
    (highest value for further measurement)
    tend to be those that the organization almost never measures,
    _and_ the variables they _have_ been measuring have, on average, the lowest EVPI.

I call this second finding the _measurement inversion_,
and I've seen it in IT portfolios, military logistics, environmental policy,
venture capital, market forecasts, and every other place I've looked.
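
A per-variable value of information can be sketched as follows.
This is a toy model under loud assumptions, not the author's method or spreadsheet: the three variables, their values, and all function names are hypothetical, each variable is assumed to take one of three equally likely values independently of the others, and the only decision is approve (earn the net profit) or reject (earn zero).
Under those assumptions the EVPI for one variable is the expected value of deciding after learning that variable, minus the expected value of deciding now.

```python
from itertools import product

# Hypothetical inputs: each variable has three equally likely values.
VARIABLES = {
    "units_sold": [10_000, 50_000, 120_000],
    "margin_per_unit": [45, 50, 55],
    "development_cost": [1_800_000, 2_000_000, 2_200_000],
}


def net_profit(scenario):
    return (scenario["units_sold"] * scenario["margin_per_unit"]
            - scenario["development_cost"])


def all_scenarios(variables):
    """Enumerate every combination of values (all equally likely)."""
    names = list(variables)
    return [dict(zip(names, combo)) for combo in product(*variables.values())]


def mean(values):
    values = list(values)
    return sum(values) / len(values)


def value_of_best_decision(scenarios):
    """Expected payoff of the better choice: approve (mean profit) or reject (zero)."""
    return max(mean(net_profit(s) for s in scenarios), 0)


def evpi_for_variable(name, variables):
    """Expected gain from learning one variable exactly before deciding."""
    scenarios = all_scenarios(variables)
    decide_now = value_of_best_decision(scenarios)
    decide_after_learning = mean(
        value_of_best_decision([s for s in scenarios if s[name] == value])
        for value in variables[name]
    )
    return decide_after_learning - decide_now


evpi_by_variable = {name: evpi_for_variable(name, VARIABLES) for name in VARIABLES}
for name, evpi in sorted(evpi_by_variable.items(), key=lambda item: -item[1]):
    print(f"{name}: {evpi:,.0f}")
```

In this made-up model only `units_sold` has a nonzero value of information (500,000); learning the margin or the development cost exactly would never change the decision, so their EVPI is zero.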

> [!info] The Measurement Inversion
> The persistent tendency to focus on the least valuable measurements
> at the expense of those more likely to improve decisions.

It seems that almost everybody, everywhere,
is systematically measuring all the wrong things.
It is so pervasive and impactful
that I have to wonder how much this affects the gross domestic product.
Organizations appear to measure what they know how to measure
without wondering whether they should learn new measurement methods for very-high-value uncertainties.

How does the tendency toward a measurement inversion
affect risk assessment and, in turn, risk management?
Highly uncertain and impactful risks
may tend to get much less analysis than the easier-to-list, mundane events.
The possibility of existential risks due to a major product recall,
corporate scandal, major project failure, or factory disaster
gets less attention than the listing of much more routine and less impactful events.
Conventional risk matrices are often populated with risks
that are estimated to be so likely that they should happen several times a year.
I've even seen risks estimated to be 80 percent, 90 percent,
or even 100 percent probable in the next twelve months.
At that level, that is more of a reliable cost of doing business.
Of course, cost control is also important but it's not the same as risk management.
If it is something you routinely _budget_ for, it might not be the kind of risk
upper management needs to see in a risk assessment.

Also, as an analyst myself as well as a manager of many analysts,
I can tell you that analysts are not immune to wanting to use a modeling method
because it uses the latest buzzwords.
Perhaps an analyst just recently learned about random forests, Bayesian networks, or deep learning.
If she finds it interesting and wants to use it,
she can find a way to make it part of the solution.
The measurement inversion shows that our intuition fails us
regarding where we need to spend more time reducing uncertainty in probabilistic models.
Unless we estimate the value of information,
we may go down the deep rabbit hole of adding more and more detail to a model
and trying to gather data on less relevant issues.

Periodically, we just need to back up and ask if we are really capturing the main risks
and if we are adding detail where it informs decisions most.

#### Is Monte Carlo Too Complicated?

#### Notes

## Part Three: How To Fix It

### Chapter 11: Starting With What Works

#### Speak The Language

##### Quantifying the Appetite for Risk

##### Break It Down, Then Do the Math

#### Getting Your Probabilities Calibrated

#### Using Data For Initial Benchmarks

##### It's Been Measured Before

##### You Have More Data Than You Think

##### You Need Less Data Than You Think

##### A Reference Class Error: Revisiting the Turkey

#### Checking The Substitution

#### Simple Risk Management

We've talked about introducing the very basics of quantitative risk analysis.
Now we need to turn that basic risk analysis
into a basic risk management framework.
When you've completed your initial list of risks,
you will want to compare your LEC to the risk tolerance curve.
This tells you whether your current risk is acceptable,
but it is not the whole story.
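
The comparison of an LEC to a risk tolerance curve might be sketched like this.
Everything specific here is an assumption for illustration: the risk list, the tolerance thresholds, and the function names are hypothetical, and the model assumes each risk occurs at most once a year with a lognormally distributed impact fitted to a 90 percent confidence interval.

```python
import math
import random

# Hypothetical risk list: (annual probability, 90% CI lower bound, upper bound) in dollars.
RISKS = [
    (0.10, 100_000, 2_000_000),
    (0.05, 500_000, 10_000_000),
    (0.20, 50_000, 500_000),
]

# Hypothetical risk tolerance curve:
# loss threshold -> largest acceptable chance of losing at least that much in a year.
RISK_TOLERANCE = {100_000: 0.40, 1_000_000: 0.10, 5_000_000: 0.02}


def simulate_annual_loss(risks, rng):
    """Total loss for one simulated year."""
    total = 0.0
    for probability, lower, upper in risks:
        if rng.random() < probability:
            # Lognormal whose 5th and 95th percentiles are the CI bounds
            # (the bounds sit 1.645 standard deviations either side of the mean in log space).
            mu = (math.log(lower) + math.log(upper)) / 2
            sigma = (math.log(upper) - math.log(lower)) / 3.29
            total += rng.lognormvariate(mu, sigma)
    return total


def loss_exceedance(losses, threshold):
    """Fraction of simulated years with a total loss of at least `threshold`."""
    return sum(1 for loss in losses if loss >= threshold) / len(losses)


rng = random.Random(0)
losses = [simulate_annual_loss(RISKS, rng) for _ in range(10_000)]
lec = {threshold: loss_exceedance(losses, threshold) for threshold in RISK_TOLERANCE}

for threshold, tolerated in RISK_TOLERANCE.items():
    status = "within tolerance" if lec[threshold] <= tolerated else "exceeds tolerance"
    print(f"{threshold:>10,}: {lec[threshold]:.1%} (tolerated {tolerated:.0%}) -> {status}")
```

Any threshold where the simulated chance of exceedance is above the tolerated chance marks a place where the current risk would not be acceptable under these assumed inputs.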

You also need to determine which risk mitigations to employ
to reduce risk further.
If a mitigation reduces a particular risk by about 50 percent
but costs \$200,000, is it worth it?
As mentioned in [[#Chapter 4 Getting Started A Simple Straw Man Quantitative Model|chapter 4]],
this requires knowing our return on mitigation (RoM).
RoM is similar to a return on investment but for risk mitigations.
To compute this we need to work out a monetized risk
so that we can monetize the reduction of risk.
Then we could use RoM together with the LEC and the risk tolerance curve
as shown in [Exhibit 11.2].
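
As a rough sketch of the arithmetic, assume RoM is computed like a return on investment: the monetized reduction in expected loss divided by the cost of the mitigation, minus one.
That formula is an assumption here because the chapter 4 definition is not reproduced in this note, and the expected-loss figures below are hypothetical.

```python
def return_on_mitigation(expected_loss_before, expected_loss_after, mitigation_cost):
    """Assumed definition: (reduction in expected loss / cost of mitigation) - 1."""
    reduction = expected_loss_before - expected_loss_after
    return reduction / mitigation_cost - 1


# A mitigation that halves the risk and costs $200,000, for two hypothetical risks.
large_risk = return_on_mitigation(1_000_000, 500_000, 200_000)
small_risk = return_on_mitigation(300_000, 150_000, 200_000)

print(f"Risk with $1,000,000 expected loss: RoM = {large_risk:.0%}")  # 150%
print(f"Risk with $300,000 expected loss: RoM = {small_risk:.0%}")    # -25%
```

Under this assumed definition, the same 50 percent reduction at the same price is worth buying for the larger risk and not for the smaller one, which is why the risk has to be monetized first.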

To *mitigate* a risk is to moderate or alleviate a risk---to lessen it in some way.
Higher risks may be deliberately accepted for bigger opportunities
but even in those cases
decision-makers will not want to accept more risk than is necessary.
It is common in risk management circles
to think of a choice among four basic alternatives
for managing a given risk:

* **Avoid:**
    We can choose not to take an action
    that would create an exposure of some kind.
    We can avoid the merger, the new technology investment,
    the subprime mortgage market, and so on.
    This effectively makes that particular risk zero
    but might increase risks in other areas
    (e.g., the lack of taking risks in R&D investments
    might make a firm less competitive).

* **Reduce:**
    The manager goes ahead with the investment
    or other endeavors that have some risks
    but takes steps to lessen them.
    The manager can decide to invest in the new chemical plant
    but implement better fire-safety systems
    to address a major safety risk.

* **Transfer:**
    The manager can give the risk to someone else.
    Insurance is the best example of this.
    The manager can buy insurance
    without necessarily taking other steps to lessen the risk of the event
    (e.g., buying fire insurance instead of investing
    in advanced fire-prevention systems).
    Risk can also be transferred to customers
    or other stakeholders by contract
    (e.g., a contract that states,
    "The customer agrees that the company is not responsible for...").

* **Retain:**
    This is the default choice for any risk management.
    You simply accept the risk as it is.

I, and some risk managers I know,
find the boundaries between these a little murky.
A transfer of risk is a reduction or avoidance of risk
to the person transferring it away.
A reduction in risk is really the avoidance of particular risks
that are components of a larger risk.
The ultimate objective of risk management should be, after all,
the reduction of the total risk to the firm for a given expected return,
whether through the transfer or avoidance of risks
or the reduction of specific risks.
If total risk is merely retained,
then it may be no different from not managing risks at all.

##### Risk Mitigation

Y. S. Kong was the treasurer and chief strategic planner
at the HAVI Group in Illinois,
a consortium of major distribution service companies
operating in forty countries.
Y. S. prefers to categorize risk management activities
by specific risk mitigation actions he calls *risk filters*.
"We have four sequential 'risk filters':
transference, operational, insurance, and retention," explains Y. S.
The first preference is to transfer risks to customers or suppliers
through their contracts.
The second filter---operational---
is to address risks through better systems, procedures, roles, and so on.
The third filter is to insure the risk (technically, this is also transferring risks).
Finally, the retention of risk is not so much a filter,
but where the other risks land if they don't get filtered out earlier.
Even so, Y. S. as the treasurer
is tasked with ensuring they have an adequate asset position
to absorb any risk that ends up in this final bucket.

In the following list, I added a couple of items to Y. S.'s list
and expanded on each of them to make it as general as possible.
Unlike HAVI's risk filters, the order of this list does not imply a prescribed priority.
Note that this is a long, but still partial, list of risk mitigation alternatives:

* **Selection processes for major exposures:**
    This is the analysis of decisions that create new sources of potential losses
    to ensure that the risk being taken is justified by the expected reward.
    For example:

    * Risk/return analysis of major investments in technology,
        new products, and so on

    * Selection of loan risks for banks;
        accounts receivable risks for other types of firms

* **Insurance:**
    This comes in dozens of specialized categories,
    but here are a few of the many general groups:

    * Insurance against loss of specific property and other assets,
        including fire, flood, and so on

    * Various liabilities, including product liability

    * Insurance for particular trades or transportation of goods,
        such as marine insurance or the launch of a communications satellite

    * Life insurance for key officers

    * Reinsurance, generally purchased by insurance companies,
        to help with risks that may be concentrated in certain areas
        (e.g., hurricane insurance in Florida,
        earthquake insurance in California, etc.)

* **Contractual risk transfer:**
    Business contracts include various clauses
    such as "*X* agrees the company is not responsible for *Y*,"
    including contracts with suppliers, customers, employees, partners,
    or other stakeholders.

* **Operational risk reduction:**
    This includes everything a firm might do internally
    through management initiatives to reduce risks, including the following:

    * Safety procedures
    * Training
    * Security procedures and systems
    * Emergency/contingency planning
    * Investments in redundant and/or high-reliability processes,
        such as multiple IT operations sites, new security systems, and so on

    * Organizational structures or roles
        defining clear responsibilities for and authority over
        certain types of risks
        (a shift safety officer, a chief information security officer, etc.)

* **Liquid asset position:**
    This is the approach to addressing the retention of risk
    but still attempting to absorb some consequences
    by using liquid reserves (i.e., cash, some inventory, etc.)
    to ensure losses would not be ruinous to the firm.

* **Compliance remediation:**
    This is not so much its own category of risk mitigation
    because it can involve any combination of the previously mentioned items.
    But it is worth mentioning
    simply because it is a key driver for so much of current risk mitigation.
    This is, in part, a matter of "crossing the *t*'s and dotting the *i*'s"
    in the growing volume of regulatory requirements.

* **Legal structure:**
    This is the classic example of limiting liability of owners
    by creating a corporation.
    But risk mitigation can take this further even for existing firms
    by compartmentalizing various risks
    into separate corporate entities as subsidiaries,
    or for even more effective insulation from legal liability,
    as completely independent spin-offs.

* **Activism:**
    This is probably the rarest form of risk mitigation
    because it is practical for relatively few firms, but it is important.
    Successful efforts to limit liabilities for companies in certain industries
    have been won by advocating new legislation.
    Examples are the Private Securities Litigation Reform Act of 1995,
    which limits damage claims against securities firms;
    Michigan's 1996 FDA Defense law,
    which limits product liability for drugs that were approved by the FDA;
    and the Digital Millennium Copyright Act of 1998,
    which limits the liability of firms that provide a conduit
    for the transmission of data from damages
    that may be caused by the sources of the data.

As always, an informed risk mitigation starts with an identification
and then some kind of assessment of risks.
Once a risk manager knows what the risks are,
steps can be taken to address them in some way.
It might seem that some extremely obvious risks
can be managed without much of an assessment effort
(e.g., implementing full backup and recovery
at a data center that doesn't have it,
installing security systems at a major jewelry store, etc.).
But in most environments, there are numerous risks,
each with one or more potential risk mitigation strategies
and a limited number of resources.
We have to assess not only the initial risks
but also how much the risk would change
if various precautions were taken.
Then those risk mitigation efforts, once chosen,
have to be monitored in the same fashion
and the risk management cycle can begin again (see [exhibit 11.3]).
Notice that the assessment of risks appears prior to
and as part of the selection of risk mitigation methods.

Now, getting this far should be an improvement over unaided gut feel
and a big improvement over methods such as the risk matrix
or qualitative scoring methods.
Of course, this approach still makes many big simplifying assumptions.
In [[#Chapter 12 Improving The Model|chapter 12]],
we will review some issues worth considering
when you are ready to add more realism to the model.

#### Notes

### Chapter 12: Improving The Model

#### Empirical Inputs

#### Adding Detail To The Model

#### Advanced Methods For Improving Expert's Subjective Estimates

#### Other Monte Carlo Tools

#### Self-Examinations For Modelers

#### Notes

### Chapter 13: The Risk Community: Intra- And Extra-Organizational Issues Of Risk Management

#### Getting Organized

#### Managing The Model

#### Incentives For A Calibrated Culture

#### Extraorganizational Issues: Solutions Beyond Your Office Building

##### Growing the Profession

Of all the professions in risk management,
that of the actuary is the only one
that is actually a legally recognized profession.
Becoming an actuary requires a demonstration of proficiency
through several standardized tests.
It also means adopting a code of professional ethics
enforced by some licensing body.
When actuaries sign their name
to the Statement of Actuarial Opinion of an insurance company,
they put their license on the line.
As with doctors and lawyers,
if they lose their license, they cannot just get another job next door.
The industry of modelers of uncertainties outside of insurance
could benefit greatly from this level of professional standards.

Standards organizations,
government affiliated and otherwise,
have always been a key part of what makes a profession a profession.
But standards organizations such as PMI, NIST, and others
are all guilty of explicitly promoting
the ineffectual methods previously debunked.
The scoring methods developed by these institutions
should be disposed of altogether.
These organizations should stay out of the business
of designing risk analysis methods
until they begin to involve people with quantitative decision analysis backgrounds
in their standards-development process.
Professionals should take charge of the direction their profession evolves
by insisting the standards move in this direction.

#### Practical Observations From Trustmark

#### Final Thoughts On Quantitative Models And Better Decisions

#### Notes

## Appendix: Additional Calibration Tests And Answers

### Calibration Test for Ranges: A

1. How many feet tall is the Hoover Dam?
2. How many inches long is a \$20 bill?
3. What percentage of aluminum is recycled in the United States?
4. When was Elvis Presley born?
5. What percentage of the atmosphere is oxygen by weight?
6. What is the latitude of New Orleans? \[_Hint_: Latitude is 0 degrees at the equator and 90 degrees at the North Pole.\]
7. In 1913, the United States military owned how many airplanes?
8. The first European printing press was invented in what year?
9. What percentage of all electricity consumed in US households was used by kitchen appliances in 2001?
10. How many miles tall is Mount Everest?
11. How long is Iraq's border with Iran in kilometers?
12. How many miles long is the Nile?
13. In what year was Harvard founded?
14. What is the wingspan (in feet) of a Boeing 747 jumbo jet?
15. How many soldiers were in a Roman legion?
16. What is the average temperature of the abyssal zone (where the oceans are more than 6,500 feet deep) in degrees F?
17. How many feet long is the Space Shuttle _Orbiter_ (excluding the external tank)?
18. In what year did Jules Verne publish _20,000 Leagues Under the Sea_?
19. How wide is the goal in field hockey (in feet)?
20. The Roman Coliseum held how many spectators?

### Answers to Calibration Test for Ranges: A

1. 726 feet
2. 6 3/16ths (6.1417) inches
3. 45 percent
4. 1935
5. 21 percent
6. 29.95
7. 23
8. 1450
9. 26.7 percent
10. 5.5 miles
11. 1,458 kilometers
12. 4,160 miles
13. 1636
14. 196 feet
15. 6,000
16. 39 F
17. 122 feet
18. 1870
19. 12 feet
20. 50,000

### Calibration Test for Ranges: B

1. The first probe to land on Mars, _Viking 1,_ landed there in what year?
2. How old was the youngest person to fly into space?
3. How many meters tall is the Sears Tower?
4. What was the maximum altitude of the _Breitling Orbiter 3,_ the first balloon to circumnavigate the globe, in miles?
5. On average, what percentage of the total software development project effort is spent in design?
6. How many people were permanently evacuated after the Chernobyl nuclear power plant accident?
7. How many feet long were the largest airships?
8. How many miles is the flying distance from San Francisco to Honolulu?
9. The fastest bird, the falcon, can fly at a speed of how many miles per hour in a dive?
10. In what year was the double helix structure of DNA discovered?
11. How many yards wide is a football field?
12. What was the percentage growth in internet hosts from 1996 to 1997?
13. How many calories are in 8 ounces of orange juice?
14. How fast would you have to travel at sea level to break the sound barrier (in mph)?
15. How many years was Nelson Mandela in prison?
16. What is the average daily calorie intake in developed countries?
17. In 1994, how many nations were members of the United Nations?
18. The Audubon Society was formed in the United States in what year?
19. How many feet high is the world's highest waterfall (Angel Falls, Venezuela)?
20. How deep beneath the sea was the Titanic found (in miles)?

### Answers to Calibration Test for Ranges: B

1. 1976
2. 26
3. 443 meters
4. 6.9 miles
5. 20 percent
6. 350,000
7. 803 feet
8. 2,394 miles
9. 200 mph
10. 1953
11. 53.3 yards
12. 70 percent
13. 120
14. 760 mph
15. 27
16. 3,300 calories
17. 184
18. 1905
19. 3,212 feet
20. 3.36 miles

### Calibration Test for Binary: A

1. The Lincoln Highway was the first paved road in the United States, and it ran from Chicago to San Francisco.
2. The Silk Road joined the two ancient kingdoms of China and Afghanistan.
3. More American homes have microwaves than telephones.
4. _Doric_ is an architectural term for a shape of roof.
5. The World Tourism Organization predicts that Europe will still be the most popular tourist destination in 2020.
6. Germany was the second country to develop atomic weapons.
7. A hockey puck will fit in a golf hole.
8. The Sioux were one of the Plains Indian tribes.
9. To a physicist, _plasma_ is a type of rock.
10. The Hundred Years' War was actually more than a century long.
11. Most of the fresh water on Earth is in the polar ice caps.
12. The Academy Awards ("Oscars") began over a century ago.
13. There are fewer than two hundred billionaires in the world.
14. In Excel, `^` means "take to the power of."
15. The average annual salary of airline captains is over \$150,000.
16. By 1997, Bill Gates was worth more than \$10 billion.
17. Cannons were used in European warfare by the eleventh century.
18. Anchorage is the capital of Alaska.
19. Washington, Jefferson, Lincoln, and Grant are the four presidents whose heads are sculpted into Mount Rushmore.
20. John Wiley & Sons is not the largest book publisher.

### Answers for Calibration Test Binary: A

1. FALSE
2. FALSE
3. FALSE
4. FALSE
5. TRUE
6. FALSE
7. TRUE
8. TRUE
9. FALSE
10. TRUE
11. TRUE
12. FALSE
13. FALSE
14. TRUE
15. FALSE
16. TRUE
17. FALSE
18. FALSE
19. FALSE
20. TRUE

### Calibration Test for Binary: B

1. Jupiter's "Great Red Spot" is larger than Earth.
2. The Brooklyn Dodgers' name was short for "trolley car dodgers."
3. _Hypersonic_ is faster than _subsonic_.
4. A _polygon_ is three-dimensional and a _polyhedron_ is two-dimensional.
5. A 1-watt electric motor produces 1 horsepower.
6. Chicago is more populous than Boston.
7. In 2005, WalMart sales dropped below \$100 billion.
8. Post-it Notes were invented by 3M.
9. Alfred Nobel, whose fortune endows the Nobel Peace Prize, made his fortune in oil and explosives.
10. A BTU is a measure of heat.
11. The winner of the first Indianapolis 500 clocked an average speed of under 100 mph.
12. Microsoft has more employees than IBM.
13. Romania borders Hungary.
14. Idaho is larger (in area) than Iraq.
15. Casablanca is on the African continent.
16. The first manmade plastic was invented in the nineteenth century.
17. A chamois is an alpine animal.
18. The base of a pyramid is in the shape of a square.
19. Stonehenge is located on the main British island.
20. Computer processors double in power every three months or less.

### Answers for Calibration Test Binary: B

1. TRUE
2. TRUE
3. TRUE
4. FALSE
5. FALSE
6. TRUE
7. FALSE
8. TRUE
9. TRUE
10. TRUE
11. TRUE
12. FALSE
13. TRUE
14. FALSE
15. TRUE
16. TRUE
17. TRUE
18. TRUE
19. TRUE
20. FALSE