
title: The Failure of Risk Management
tags: authorship/other, exclude-from-word-count, topic/risk, type/media/book
author: Douglas W. Hubbard
edition: Second
publisher: John Wiley & Sons
subtitle: Why It's Broken and How to Fix It
type: book
year: 2020

The Failure of Risk Management

%% This note, with the exception of comments like this one (reserved for notes on transcription) consists only of content from the text. For commentary see the companion the-failure-of-risk-management. %%

Part One: An Introduction To The Crisis

Chapter 1: Healthy Skepticism For Risk Management

A "Common Mode Failure"

What Failure Means

Scope And Objectives Of This Book

Notes

Chapter 2: A Summary Of The Current State Of Risk Management

A Short And Entirely-Too-Superficial History Of Risk

Current State Of Risk Management In The Organization

Current Risks And How They Are Assessed

Notes

Chapter 3: How Do We Know What Works?

Anecdote: The Risk Of Outsourcing Drug Manufacturing

Why It's Hard To Know What Works

An Assessment Of Self-Assessments

Potential Objective Evaluations Of Risk Management

What We May Find

Notes

Chapter 4: Getting Started: A Simple Straw Man Quantitative Model

A Simple One-For-One Substitution

The Expert As The Instrument

A Quick Overview Of "Uncertainty Math"

Establishing Risk Tolerance

Supporting The Decision: A Return On Mitigation

Making The Straw Man Better

Note

Part Two: Why It's Broken

Chapter 5: The "Four Horsemen" Of Risk Management: Some (Mostly) Sincere Attempts To Prevent An Apocalypse

Actuaries

War Quants: How World War II Changed Risk Analysis Forever

Economists

Management Consulting: How A Power Tie And A Good Pitch Changed Risk Management

Comparing The Horsemen

Major Risk Management Problems To Be Addressed

Notes

Chapter 6: An Ivory Tower Of Babel: Fixing The Confusion About Risk

The Frank Knight Definition

Knight's Influence In Finance And Project Management

A Construction Engineering Definition

Risk As Expected Loss

Defining Risk Tolerance

Defining Probability

Enriching The Lexicon

Notes

Chapter 7: The Limits Of Expert Knowledge: Why We Don't Know What We Think We Know About Uncertainty

The Right Stuff: How A Group Of Psychologists Might Save Risk Analysis

Mental Math: Why We Shouldn't Trust The Numbers In Our Heads

"Catastrophic" Overconfidence

The Mind Of "Aces": Possible Causes And Consequences Of Overconfidence

Unless managers take steps to offset overconfidence in assessments of probabilities, they will consistently underestimate various risks (i.e., they will be more confident than they should be that some disaster won't occur). This may have had some bearing on very-high-profile disasters, such as those of the Space Shuttle Orbiters Challenger and Columbia.

The Nobel Prize-winning physicist, Richard Feynman, was asked to participate in the investigation of the first Space Shuttle accident (involving Challenger).

What he found were some risk assessments that seemed at first glance to be obviously optimistic. He noted the following in the Rogers Commission Report on the Space Shuttle Challenger Accident:

It appears that there are enormous differences of opinion as to the probability of a failure with loss of vehicle and of human life. The estimates range from roughly 1 in 100 to 1 in 100,000. The higher figures [1 in 100] come from the working engineers and the very low figures [1 in 100,000] from management. What are the causes and consequences of this lack of agreement? Since 1 part in 100,000 would imply that one could put a Shuttle up each day for 300 years expecting to lose only one, we could properly ask "What is the cause of management's fantastic faith in the machinery?"1

Feynman believed that if management decisions to launch were based on such an extraordinary confidence in the Shuttle, then these decisions were flawed. As was Feynman's frequent practice, he applied simple tests and reality checks that would cast doubt on these claims.
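
The arithmetic behind this kind of reality check is easy to reproduce. The following is a minimal sketch added for illustration (it is not from the text): it converts "a Shuttle up each day for 300 years" into a launch count and compares the expected number of losses under the two estimates quoted above. The function name and the 365-day year are assumptions.

```python
# Reality check on the two failure estimates quoted above.
# Assumption: one launch per day, 365 days per year (leap days ignored).
DAYS_PER_YEAR = 365
YEARS = 300


def expected_losses(p_failure: float, launches: int) -> float:
    """Expected number of lost vehicles if each launch fails independently."""
    return p_failure * launches


launches = YEARS * DAYS_PER_YEAR  # daily launches for 300 years

management_view = expected_losses(1 / 100_000, launches)
engineer_view = expected_losses(1 / 100, launches)

print(f"launches in {YEARS} years: {launches:,}")
print(f"expected losses at 1 in 100,000: {management_view:.2f}")
print(f"expected losses at 1 in 100:     {engineer_view:.0f}")
```

At 1 in 100,000 the expected loss over roughly 110,000 daily launches is about one vehicle, which is the comparison Feynman draws; at 1 in 100 it is on the order of a thousand.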

Perhaps an obvious explanation is the conflict of interest. Are managers really incentivized to be honest with themselves and others about these risks? No doubt, that is a factor, just as it was probably a factor in the assessments of risks taken by bank managers in 2008, whether or not it was consciously considered. However, individuals showed overconfidence even in situations when they had no stake in the outcome (trivia tests, etc.).

JDM research has shown that both the incentives and the amount of effort put into identifying possible surprises will make a difference in overconfidence.2 Some of the sources of overconfidence would affect not only managers who depend on subjective estimates but also those who believe they are using sound analysis of historical data. Managers will fail to consider ways in which human errors affect systems and will fail to consider common mode and cascade system failures.3

There may also be a tendency to relax our concerns for infrequent but catastrophic events when some time passes without experiencing the event. Robin Dillon-Merrill, a decision and risk analysis professor at Georgetown University, noticed this tendency when she was studying the risk perceptions of NASA engineers prior to the Space Shuttle Columbia accident. The Columbia Accident Investigation Report noted the following:

The shedding of External Tank foam---the physical cause of the Columbia accident---had a long history. Damage caused by debris has occurred on every Space Shuttle flight, and most missions have had insulating foam shed during ascent. This raises an obvious question: Why did NASA continue flying the Shuttle with a known problem that violated design requirements?4

Dillon-Merrill considers each time that foam fell off the external tank of the Shuttle, but where the Shuttle still had a successful mission, to be a "near miss." Her proposal was that near misses are an opportunity to learn that is rarely exploited. She interviewed NASA staff and contractors about how they judged near misses and found two very interesting phenomena that in my opinion have important implications for risk management in general.

Perhaps not surprisingly, she found that near misses and successes were both judged much more favorably than failures. But were these near-miss events being rated more like a failure than a mission success? Did engineers take each near miss as a red flag warning about an impending problem? Incredibly, just the opposite occurred. The study included an experiment where NASA staff and students were asked to choose among various options for hypothetical unmanned space missions. The options included decisions like whether to skip a test due to schedule constraints. Some subjects were given near miss data and some were not. The study found that people with the near-miss information were more likely to choose the riskier alternative.5

The near miss interpretation paradox: People with near-miss information were more likely to make the riskier choice than people who did not have information about near misses.

It is possible that managers were looking at each near miss and thinking that because nothing had happened yet, perhaps the system was more robust than they thought. Or it might be more subtle than that. Dillon-Merrill found that when people have a known exposure to some relatively unlikely risk, their tolerance for that risk seems to increase even though they may not be changing their estimate of the probability of the risk.

Imagine that you are in an area exposed to hurricane risks. Authorities confirm that there is a 3 percent chance of injury or death each time you do not evacuate when ordered to for a hurricane warning. If you happen to make it through two or three hurricanes without harm, you will become more tolerant of that risk.

Note that you are not actually changing your estimate of the probability of the harm (that was provided by authorities); you are simply becoming more numb to the risk as it is.
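
A short calculation shows why getting through a few hurricanes unharmed says very little about the 3 percent figure. This sketch is an illustration added here, not part of the text; it assumes each ignored evacuation order is an independent event with the same 3 percent chance of harm.

```python
# Chance of coming through n ignored evacuation orders unharmed,
# assuming independent events with the 3 percent harm rate from the example.
P_HARM = 0.03


def p_unharmed(n_events: int, p_harm: float = P_HARM) -> float:
    """Probability of no injury or death across n_events independent exposures."""
    return (1 - p_harm) ** n_events


for n in (1, 2, 3):
    print(f"{n} hurricane(s): {p_unharmed(n):.1%} chance of no harm")
```

Under these assumptions, being unharmed after three hurricanes is what happens roughly 91 percent of the time, so the experience is almost exactly what the stated probability predicts. It offers little reason to revise the estimate, yet tolerance for the risk rises anyway.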

Now imagine the implications of this for Wall Street. If they have a few good years, everybody will start to become more "risk tolerant" even if they are not changing their underlying forecasts about the probabilities of a financial crisis. Now that the mortgage uncertainty has settled for a decade or so, will all managers, again, start to become more tolerant of risks?

There are other effects to consider when examining the psyche of upper-level decision-makers. Part of overestimating past performance is due to the tendency to underestimate how much we learned in the last big surprise. This is what Slovic and Fischhoff called the I-knew-it-all-along phenomenon. People will exaggerate how "inevitable" the event would have appeared before the event occurred. (News pundits talking about the mortgage crisis certainly make it sound as if it were inevitable, but where were they before the crisis occurred?)

They even remember their previous predictions in such a way that they, as Slovic put it, "exaggerate in hindsight what they knew in foresight." I hear the I-saw-that-coming claim so often that, if the claims were true, there would be virtually no surprises anywhere in the world. Two lines of dialog in the movie Wall Street revealed Oliver Stone's grasp of this phenomenon. After "Bud" (Charlie Sheen's character) had his initial big successes as a broker, his boss said, "The minute I laid eyes on you, I knew you had what it took." Later, when Bud was being arrested in the office for the crimes he committed to get those early successes, the same boss said, "The minute I laid eyes on you, I knew you were no good." Kahneman sums it up:

When they have made a decision, people don't even keep track of having made the decision or forecast. I mean, the thing that is absolutely the most striking is how seldom people change their minds. First, we're not aware of changing our minds even when we do change our minds. And most people, after they change their minds, reconstruct their past opinion---they believe they always thought that.6

There is one other item about overconfidence that might be more unique to upper management or particularly successful traders. Some managers can point to an impressive track record of successes as evidence that a high level of confidence on virtually all matters is entirely justified on their part. Surely, if a portfolio manager can claim she had above-average market returns for five years, she must have some particularly useful insight into the market. An IT security manager who has presided over a virus-free, hacker-free environment much longer than his peers in other companies must have great skill, right?

Actually, luck can have more to do with success than we might be inclined to think. For example, a statistical analysis of World War I aces showed that Baron von Richthofen (aka The Red Baron) might have been lucky but not necessarily skilled.7 Two electrical engineering professors, Mikhail Simkin and Vwani Roychowdhury of the University of California at Los Angeles, examined the victories and losses for the 2,894 fighter pilots who flew for Germany. Together, they tallied 6,759 victories and 810 defeats. This is perhaps a suspiciously high win ratio but these numbers include shooting down unarmed scout and delivery planes. The Germans also had a technological advantage in the air during WWI. Furthermore, not all kills could be confirmed and the inflation of these numbers is certainly possible--- but there is no reason to assume that the Baron was less prone to exaggeration than others. Simkin and Roychowdhury showed that, given the number of pilots and the win ratio, there was about a 30 percent chance that, by luck alone, one pilot would have gotten eighty kills, the number Manfred von Richthofen is credited for.
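
The 30 percent figure can be roughly reproduced from the numbers in the paragraph above. The sketch below is an added illustration under a deliberately simple model that is an assumption here, not necessarily the one Simkin and Roychowdhury used: every pilot has the same chance of winning each fight (the overall win ratio), fights are independent, and a pilot keeps flying until his first defeat.

```python
# Chance that pure luck produces at least one 80-victory pilot.
# Counts are from the text; the equal-skill, fly-until-first-defeat
# model is a simplifying assumption made for this illustration.
PILOTS = 2894
VICTORIES = 6759
DEFEATS = 810
TARGET_KILLS = 80

# Per-fight win probability if every pilot is equally skilled.
p_win = VICTORIES / (VICTORIES + DEFEATS)

# Probability that one given pilot wins 80 fights before losing any.
p_streak = p_win ** TARGET_KILLS

# Probability that at least one of the pilots manages such a streak.
p_any = 1 - (1 - p_streak) ** PILOTS

print(f"win ratio per fight:         {p_win:.3f}")
print(f"one pilot reaching 80 kills: {p_streak:.6f}")
print(f"at least one of {PILOTS:,}:     {p_any:.2f}")
```

Any single pilot is extremely unlikely to reach eighty victories by luck, but with nearly three thousand pilots trying, the chance that somebody does comes out close to the 30 percent cited above.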

This might describe a large number of "successful" executives who write popular books on the special insight they brought to the table, but who then sometimes find they are unable to repeat their success. Given the large number of candidates who spend their careers competing for a small number of upper-management positions, it is likely that some will have a string of successes just by chance alone. No doubt, some of these will be more likely to hold upper-management positions. In the same manner, some will also have a string of successes in a coin-flipping tournament in which there are a large number of initial players. But we know that the winners of this kind of contest are not just better coin-flippers. Sure, there is probably some skill in reaching upper management. But how much of it was more like winning a coin-flipping contest?

Inconsistencies And Artifacts: What Shouldn't Matter Does

Answers To Calibration Tests

Notes

A Few Examples Of Scores And Matrices

Does That Come In "Medium"?: Why Ambiguity Does Not Offset Uncertainty

Unintended Effects Of Scales: What You Don't Know Can Hurt You

Different But Similar-Sounding Methods And Similar But Different-Sounding Methods

Notes

Chapter 9: Bears, Swans And Other Obstacles To Improved Risk Management

Algorithm Aversion And A Key Fallacy

Algorithms Versus Experts: Generalizing The Findings

A Note About Black Swans

The exsupero ursus fallacy is reinforced by authors of very popular books who seem to depend heavily on some version of the fallacy. One such author is former Wall Street trader and mathematician Nassim Taleb. He wrote The Black Swan and other books critical of common practice in risk management, especially in (but not limited to) the financial world, as well as the nonquantitative hubris of Wall Street.

A heretic of financial convention, he argues that Nobel Prize-winning modern portfolio theory and options theory (briefly mentioned in chapter 5) are fundamentally flawed and are in fact no better than astrology. In fact, Taleb considers this prize itself to be an intellectual fraud. After all, as he rightly points out, it was not established in the will of Alfred Nobel, but by the Royal Bank of Sweden seventy-five years after Nobel's death. He even claims that once, in a public forum, he riled up one such prizewinner to the point of red-faced, fist-pounding anger.

Taleb bases a lot of his thesis on the fact that the impact of chance is unappreciated by mostly everyone. He sees the most significant events in history as being completely unforeseeable. He calls these events black swans in reference to an old European expression that went something like "That's about as likely as finding a black swan." The expression was based on the fact that no European had ever seen a swan that was black---until Europeans traveled to Australia. Until the first black swans were sighted, black swans were a metaphor for impossibility. Taleb puts September 11, 2001, stock market crashes, major scientific discoveries, and the rise of Google in his set of black swans. Each event, he argues, was not only unforeseen but utterly unforeseeable based on our previous experience. People will routinely confuse luck with competence and they will presume that the lack of seeing an unusual event to date is somehow proof that the event cannot occur.

Managers, traders, and the media seem to be especially susceptible to these errors. Out of a large number of managers, some managers will have made several good choices in a row by chance alone. This is what I called the Red Baron effect in a previous chapter. Such managers will see their past success as indicators of competence and, unfortunately, will act with high confidence on equally erroneous thinking in the future. Taleb recognizes the problems of overconfidence researched by Kahneman and others. Indeed, Taleb says Kahneman is the only Economics Nobel Prize winner he respects.

I think part of Taleb's skepticism is refreshing and on point. I agree with many of Taleb's observations on the misplaced faith in some models and will discuss this further in the next chapter. I might even include Taleb as one source of inspiration for identifying new categories of fallacies (and giving it a Latin name in order to sound official). Taleb coined a fallacy he refers to as the ludic fallacy, derived from the Latin word for "games of chance." Taleb defines the ludic fallacy as the assumption that the real world necessarily follows the same rules as well-defined games of chance.

Now, here is where Taleb errs. He doesn't just argue that risk management is flawed. He argues that risk management itself is impossible and that all we can do is make ourselves antifragile. I think he is just using a very different definition of risk management--- which even he uses inconsistently. No matter what he calls it, he is promoting a particular set of (vaguely defined) methods that have the objective of reducing risk. This reduction in risk will require resources. Using the definition I propose in [chapter 6], determining how to use resources to reduce risk is part of risk management. He actually contradicts himself on this point when he promotes redundancy as a method of becoming antifragile and refers to it as the "central risk management property of natural systems." So, yes, we are both talking about risk management. He focuses on particular approaches to it, but it is risk management just the same.

Confusion and inconsistency about whether managing fragility is, in practice, part of managing risks is not the only problem in his thesis. Taleb commits every form of the exsupero ursus fallacy throughout most of what he writes.

Specifically, (1) he presumes the lack of perfection of one model automatically necessitates use of the other regardless of relative performance, (2) he commits the anecdotal fallacy when looking for evidence of relative performance, and (3) he presumes that a given model was even being used when he identifies them as the culprit in major risk events.

In an interview for Fortune Taleb claimed, "No model is better than a faulty model." Again, having no model is never an option. One way or another, a model is being used. Taleb's model is simply his common sense, which is, as Albert Einstein defines it, "merely the deposit of prejudice laid down in the human mind before the age of eighteen." As with every other model, common sense has its own special errors.

We've seen the research that shows overwhelming evidence of the flaws of unaided intuition compared to even simple statistical models, and Taleb offers no empirical data to the contrary. Taleb does briefly mention the work of Meehl but dismisses it. Without making any mention of the huge numbers of conclusive results by Meehl and his colleagues, Taleb claims the entire body of research is invalid by claiming "that these researchers did not have a clear idea of where the burden of empirical evidence lies" and goes on to suggest that they lacked "rigorous empiricism." He offers no details about how more than one hundred peer-reviewed, published studies by several researchers veer from the required rigorous empiricism.

Kahneman, who actually is a psychologist like Meehl, would apparently disagree with Taleb on Meehl's methods. Taleb considers Kahneman a significant influence on his work, but who does Kahneman consider to be a significant influence on his work? Meehl. I wouldn't presume to speak for Kahneman but I wonder if he might point out to Taleb how the burden of proof was accepted and met overwhelmingly by Meehl, whereas Taleb's evidence merely amounts to, at best, selected anecdotes of shortcomings or entirely imagined straw man arguments. Taleb even sometimes cites the work of Phil Tetlock to support some other point he makes but never references Tetlock's enormous twenty-year study where he concluded that it was "impossible" to find a domain where humans clearly outperformed algorithms.

Instead of relying on large controlled studies, Taleb commits the error of arguing that single events effectively disprove a probabilistic model. He uses the apparent unforeseeability of specific events as evidence of a flaw in risk analysis. The implication is that if quantitative analysis worked, then we could make exact predictions of specific and extraordinary events such as 9/11 or the rise of Google. When arguing against the use of various statistical models in economics he states that "the simple argument that Black Swans and tail events run the socioeconomic world---and these events cannot be predicted--- is sufficient to invalidate their statistics."[^09-12] Yes, the rare events---black swans--- are individually impossible to predict precisely. But unless he can show that his alternative model (apparently his intuition) would also have predicted such events exactly, then he commits exsupero ursus when he says imperfection alone is sufficient to prefer intuition over statistics.

In addition to Kahneman, it is worth pointing out others whose work Taleb cites to make a point but who, if you actually looked at what they are doing, would contradict Taleb. For example, Taleb says he admires the mathematician Edward Thorp, who developed a mathematically sound basis for card counting in blackjack in the 1960s. Now, if the objective of card counting was to predict every hand, even the most extraordinarily rare combinations as Taleb would seem to require, then Ed Thorp's method certainly fails. But Ed Thorp's method works---that's why the casinos quit letting him play--- because his system resulted in better bets on average after a large number of hands. Taleb is also a fan of the mathematician Benoit Mandelbrot, who used the mathematics of fractals to model financial markets. Similar to Thorp and Taleb, Mandelbrot was equally unable to predict specific extraordinary events exactly, but his models are preferred by some because they seem to generate more realistic patterns that look like they could be from real data.

If anecdotal evidence were sufficient to compare model performance, one could simply point out that Taleb's investment firm, Empirica Capital LLC, closed in 2004 after several years of mediocre returns.[^09-13] He had one very good year in 2000 (a 60 percent return) because while everyone else was betting on dot-com, he bet on dot-bomb. But the returns the following years were far enough below the market average that the good times couldn't outweigh the bad for his fund.

Similar to the news pundits rejecting Nate Silver's findings or the sportscasters rejecting the methods used by the Oakland A's, Taleb merely shows that it is possible to find an error in a model if one looks hard enough. Again, the question is not whether to model (intuition is a model, too) or whether one model is imperfect (both models are imperfect) but which measurably outperforms the other and does so in many trials not just single anecdotes.

Finally, Taleb makes the error of presuming what methods were actually being used when he blames them for an event. He argues, for example, that the downfall of Long-Term Capital Management (LTCM) disproves options theory. Recall that options theory won the Nobel Prize for Robert Merton and Myron Scholes, both of whom were on the board of directors for LTCM. The theory was presumably the basis of the trading strategy of the firm. But an analysis of the failure of LTCM shows that a big reason for its downfall was the excessive use of leverage in trades---an issue that isn't even part of options theory. That use of leverage appeared to be based on intuition.

Taleb also states that the crash of 1987 disproved modern portfolio theory (MPT), which would seem to presume that at least some significant proportion of fund managers used the method. I find fund managers to be tight-lipped about their specific methods, but one fund manager did tell me how "learning the theory is important as a foundation but 'real-world' decisions have to be based on practical experience, too." In fact, I found no fund managers who didn't rely partly, if not mostly, on intuition. Finally, if we are looking for explanations of the mortgage crisis, neither MPT nor options theory had anything to do with the practice of giving out mortgages to large numbers of people lacking the ability to pay them. That was more of a function of a system that incentivized banks to give risky loans without actually accepting the risk.

Finally, Taleb seems to make a variety of other points that, similar to the previous points, seem so inconsistent he ends up undermining the point he makes. For example, explaining the outcomes in terms of the narrative fallacy committed by others is sometimes itself a narrative fallacy. Arguing that "experts" don't know so much is not supported by quoting other experts. He argues that rare events defy quantitative models, but then gives specific examples of computing rare events with quantitative models (he shows the odds of getting the same result in a coin flip many times in a row and argues the benefits of Mandelbrot's mathematical models in the analysis of market fluctuations).

Taleb criticizes the use of historical data in forecasts but apparently sees no irony in his argument. He looks at several examples in which history was a poor predictor. In other words, he is assessing the validity of using historical examples by using historical examples. What Taleb and others prove with such examples is merely that what I will call a naive historical analysis can be very misleading. Taleb demonstrates his point by using the example of a turkey. The turkey had a great life right up until Thanksgiving. So, for that turkey, history was a poor indicator. So how is Taleb able to see this problem? He simply looks at the larger history of turkeys.

All he is doing is using what we may call a history of histories, or meta-historical analysis, to show how wrong naive historical analysis can be. The error in historical analysis in a stock price, for example, is to look only at the history of that stock and only for recent history. If we look at all historical analysis for a very long period of time, we find how often naive historical analysis can be wrong.

Taleb's own "experience," as extensive as it might be (at least in finance), is also just a historical analysis---just a very informal type with lots of errors in both recall and analysis, as shown in chapter 7 (The Limits Of Expert Knowledge: Why We Don't Know What We Think We Know About Uncertainty). No thinking person can ever honestly claim to have formed any idea totally independent of previous observations. It just doesn't happen.

Even Taleb's ludic fallacy seems to be a fallacy itself. Sam Savage calls it the "ludic fallacy-fallacy." As Savage describes it, we cannot rationally address real-world problems of uncertainty "without first understanding the simple arithmetic of dice, cards, and spinners." Of course, Taleb is right when he says we shouldn't assume that we have defined any problem perfectly. That certainly would be an error, and if that were Taleb's point, that would be valid. But, again, whether a particular model is perfect is not the right question. The most relevant question is whether a probabilistic model--- even a simple one--- outperforms the alternative model, such as intuition.

Major Mathematical Misconceptions

We're Special: The Belief That Risk Analysis Might Work, But Not Here

Notes

Chapter 10: Where Even The Quants Go Wrong: Common And Fundamental Errors In Quantitative Models

A Survey Of Analysts Using Monte Carlos

The Risk Paradox

Financial Models And The Shape Of Disaster: Why Normal Isn't So Normal

Following Your Inner Cow: The Problem With Correlations

The Measurement Inversion

Consider a decision analysis model for a new product. You have uncertainties about the cost and duration of development, materials costs once production starts, demand in different markets, and so on. This would be just like a typical cost-benefit analysis with a cash flow but instead of exact numbers, we use probability distributions to represent our uncertainty. We can even include the probability of a development project failure (no viable product was developed and the project was cancelled) or even more disastrous scenarios such as a major product recall. Any of these variables could be measured further with some cost and effort. So, which one would you measure first and how much would you be willing to spend? For years, I've been computing the value of additional information on every uncertain variable in a model.

Suppose we ran ten thousand scenarios in a simulation and determined that 1,500 of these scenarios resulted in a net loss. If we decide to go ahead with this product development, and we get one of these undesirable scenarios, the amount of money we would lose is the opportunity loss (OL)---the cost of making the wrong choice. If we didn't lose money, then the OL was zero. We can also have an OL if we decide not to approve the product but then find out we could have made money. In the case of rejecting the product, the OL is the difference between the lease and the money we made on the widgets if we would have made money---zero if the equipment did not make money (in which case we were right to reject the idea).

The expected opportunity loss (EOL) is each possible opportunity loss times the chance of that loss---in other words, the chance of being wrong times the cost of being wrong. In our Monte Carlo simulation, we simply average the OL for all of the scenarios. For now, let's say that given the current level of uncertainty about this product, you still think the lease is a good idea. So we average across the 1,500 scenarios where the OL was positive (we lost money) and the 8,500 scenarios where the OL was zero (we made the right choice). Suppose we find that the EOL is about $600,000.

The EOL is equivalent to another term called the expected value of perfect information (EVPI). The EVPI is the most you would reasonably be willing to pay if you could eliminate all uncertainty about this decision. Although it is almost impossible to ever get perfect information and eliminate all uncertainty, this value is useful as an absolute upper bound. If we can reduce the $600,000 EOL by half with a market survey that would cost $18,000, then the survey is probably a good deal. If you want to see a spreadsheet calculation of this type of problem, go to this book's website at www.howtomeasureanything.com/riskmanagement.
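
The EOL calculation described above can be sketched in a few lines. This is an illustration added here, not the book's spreadsheet: the model, its three input distributions, and every parameter value are hypothetical, so the output will not reproduce the $600,000 figure. Only the last function uses numbers from the text (the $600,000 EOL, the halving of it, and the $18,000 survey).

```python
import random

N_SCENARIOS = 10_000


def simulate_net_outcomes(n: int, seed: int = 42) -> list[float]:
    """Simulate n net outcomes for a new product (profit > 0, loss < 0).

    All distributions and parameters below are hypothetical placeholders.
    """
    rng = random.Random(seed)
    outcomes = []
    for _ in range(n):
        units = rng.lognormvariate(11.0, 0.6)             # units sold
        margin = rng.uniform(20.0, 40.0)                  # profit per unit
        dev_cost = rng.normalvariate(1_200_000, 300_000)  # development cost
        outcomes.append(units * margin - dev_cost)
    return outcomes


def opportunity_loss(net: float, approve: bool) -> float:
    """Cost of having made the wrong choice in one scenario, else zero."""
    if approve:
        return max(0.0, -net)  # approved, and the product lost money
    return max(0.0, net)       # rejected, and the product would have paid off


def expected_opportunity_loss(outcomes: list[float], approve: bool) -> float:
    """Average the opportunity loss over all simulated scenarios."""
    return sum(opportunity_loss(x, approve) for x in outcomes) / len(outcomes)


def net_value_of_measurement(eol: float, fraction_removed: float, cost: float) -> float:
    """EOL reduction bought by a measurement, minus what the measurement costs."""
    return eol * fraction_removed - cost


outcomes = simulate_net_outcomes(N_SCENARIOS)
eol_approve = expected_opportunity_loss(outcomes, approve=True)
eol_reject = expected_opportunity_loss(outcomes, approve=False)

# The default decision is the one with the smaller EOL; that EOL is the EVPI.
should_approve = eol_approve <= eol_reject
evpi = min(eol_approve, eol_reject)

print(f"EOL if approved: ${eol_approve:,.0f}")
print(f"EOL if rejected: ${eol_reject:,.0f}")
print(f"default decision: {'approve' if should_approve else 'reject'}, EVPI ${evpi:,.0f}")

# The survey example from the text: halve a $600,000 EOL for $18,000.
print(f"net value of survey: ${net_value_of_measurement(600_000, 0.5, 18_000):,.0f}")
```

With the text's numbers, the survey removes $300,000 of expected opportunity loss for $18,000, which is why it is "probably a good deal." The comparison assumes the survey really does cut the EOL in half; estimating that reduction is the harder part in practice.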

This becomes more enlightening when we compute the value of information for each variable in a model, especially when the models get very large. This way we not only get an idea for how much to spend on measurement but also which specific variables we need to measure and how much we might be willing to spend on them. I have done this calculation for more than 150 quantitative decision models in which most had about fifty to one hundred variables (for a total of about 10,000 variables, conservatively). From this, I've seen patterns that still persist every time I add more analysis to my library. The two main findings are:

  • Relatively few variables require further measurement---but there are almost always some.

  • The uncertain variables with the highest EVPI (highest value for further measurement) tend to be those that the organization almost never measures, and the variables they have been measuring have, on average, the lowest EVPI.

I call this second finding the measurement inversion, and I've seen it in IT portfolios, military logistics, environmental policy, venture capital, market forecasts, and every other place I've looked.

[!info] The Measurement Inversion
The persistent tendency to focus on the least valuable measurements at the expense of those more likely to improve decisions.
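
Computing the value of information separately for each variable can be approximated directly from simulation output. The sketch below is an added illustration, not the author's method: it uses a hypothetical three-variable product model and a crude binning approximation (sort the scenarios by one variable, split them into equal bins, and let the decision be made separately within each bin). All names, distributions, and parameters are assumptions.

```python
import random


def simulate(n: int, seed: int = 7) -> list[dict]:
    """Simulate n scenarios of a hypothetical product model."""
    rng = random.Random(seed)
    rows = []
    for _ in range(n):
        units = rng.lognormvariate(11.0, 0.6)             # wide uncertainty
        margin = rng.uniform(28.0, 32.0)                  # narrow uncertainty
        dev_cost = rng.normalvariate(1_200_000, 300_000)
        rows.append({
            "units": units,
            "margin": margin,
            "dev_cost": dev_cost,
            "net": units * margin - dev_cost,
        })
    return rows


def value_of_best_decision(nets: list[float]) -> float:
    """Expected payoff of the better choice: approve (mean net) or reject (0)."""
    return max(sum(nets) / len(nets), 0.0)


def information_value(rows: list[dict], name: str, n_bins: int = 50) -> float:
    """Approximate value of learning variable `name` before deciding.

    Binning is a rough estimator and tends to overstate small values,
    because noise within a bin can look like useful information.
    """
    if len(rows) % n_bins != 0:
        raise ValueError("number of scenarios must be a multiple of n_bins")
    ordered = sorted(rows, key=lambda r: r[name])
    size = len(ordered) // n_bins
    informed = 0.0
    for b in range(n_bins):
        chunk = ordered[b * size:(b + 1) * size]
        informed += value_of_best_decision([r["net"] for r in chunk])
    informed /= n_bins
    baseline = value_of_best_decision([r["net"] for r in rows])
    return informed - baseline


rows = simulate(10_000)
values = {name: information_value(rows, name) for name in ("units", "margin", "dev_cost")}
for name, value in sorted(values.items(), key=lambda kv: kv[1], reverse=True):
    print(f"{name:>8}: ${value:,.0f}")
```

In this made-up model the variable with the widest uncertainty relative to the decision threshold (units sold) carries nearly all of the information value, while a tightly bounded variable such as margin carries almost none. Ranking variables this way is what reveals whether measurement effort is going to the high-value ones.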

It seems that almost everybody, everywhere, is systematically measuring all the wrong things. It is so pervasive and impactful that I have to wonder how much this affects the gross domestic product. Organizations appear to measure what they know how to measure without wondering whether they should learn new measurement methods for very-high-value uncertainties.

How does the tendency toward a measurement inversion affect risk assessment and, in turn, risk management? Highly uncertain and impactful risks may tend to get much less analysis than the easier-to-list, mundane events. The possibility of existential risks due to a major product recall, corporate scandal, major project failure, or factory disaster gets less attention than the listing of much more routine and less impactful events. Conventional risk matrices are often populated with risks that are estimated to be so likely that they should happen several times a year. I've even seen risks estimated to be 80 percent, 90 percent, or even 100 percent probable in the next twelve months. At that level, the event is less a risk than a reliable cost of doing business. Of course, cost control is also important, but it's not the same as risk management. If it is something you routinely budget for, it might not be the kind of risk upper management needs to see in a risk assessment.

Also, as an analyst myself as well as a manager of many analysts, I can tell you that analysts are not immune to wanting to use a modeling method because it uses the latest buzzwords. Perhaps an analyst just recently learned about random forests, Bayesian networks, or deep learning. If she finds it interesting and wants to use it, she can find a way to make it part of the solution. The measurement inversion shows that our intuition fails us regarding where we need to spend more time reducing uncertainty in probabilistic models. Unless we estimate the value of information, we may go down the deep rabbit hole of adding more and more detail to a model and trying to gather data on less relevant issues.

Periodically, we just need to back up and ask if we are really capturing the main risks and if we are adding detail where it informs decisions most.

Is Monte Carlo Too Complicated?

Notes

Part Three: How To Fix It

Chapter 11: Starting With What Works

Speak The Language

Quantifying the Appetite for Risk
Break It Down, Then Do the Math

Getting Your Probabilities Calibrated

Using Data For Initial Benchmarks

It's Been Measured Before
You Have More Data Than You Think
You Need Less Data Than You Think
A Reference Class Error: Revisiting the Turkey

Checking The Substitution

Simple Risk Management

We've talked about introducing the very basics of quantitative risk analysis. Now we need to turn that basic risk analysis into a basic risk management framework. When you've completed your initial list of risks, you will want to compare your LEC to the risk tolerance curve. This tells you whether your current risk is acceptable, but it is not the whole story.
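A minimal sketch of comparing a loss exceedance curve (LEC) to a risk tolerance curve follows. The risk register, the tolerance points, and the trial count are all hypothetical. The sketch also assumes each loss is lognormal with the stated bounds treated as a 90 percent confidence interval, which is why the spread is divided by 3.29 (twice 1.645).

```python
import math
import random

random.seed(7)  # fixed seed so the sketch is reproducible

# Hypothetical risk register: annual probability and a 90% confidence interval for the loss
risks = [
    {"name": "data breach",     "p": 0.05, "low": 100_000, "high": 5_000_000},
    {"name": "supplier outage", "p": 0.10, "low": 50_000,  "high": 1_000_000},
    {"name": "product recall",  "p": 0.02, "low": 500_000, "high": 20_000_000},
]

# Hypothetical risk tolerance curve: (loss level, highest acceptable chance of exceeding it)
tolerance = [(100_000, 0.20), (1_000_000, 0.05), (10_000_000, 0.01)]


def simulate_year():
    """Total loss for one simulated year across all listed risks."""
    total = 0.0
    for r in risks:
        if random.random() < r["p"]:  # does the event occur this year?
            mu = (math.log(r["low"]) + math.log(r["high"])) / 2
            sigma = (math.log(r["high"]) - math.log(r["low"])) / 3.29
            total += random.lognormvariate(mu, sigma)
    return total


trials = [simulate_year() for _ in range(50_000)]


def exceedance(loss_level):
    """Share of simulated years whose total loss exceeds loss_level (a point on the LEC)."""
    return sum(t > loss_level for t in trials) / len(trials)


lec_check = [(level, exceedance(level), limit) for level, limit in tolerance]
for level, chance, limit in lec_check:
    status = "within tolerance" if chance <= limit else "EXCEEDS tolerance"
    print(f"P(loss > ${level:,.0f}) = {chance:.3f} (limit {limit:.2f}): {status}")
```

Wherever the simulated curve sits above the tolerance curve, the current risk is more than management has said it will accept at that loss level.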

You also need to determine which risk mitigations to employ to reduce risk further. If a mitigation reduces a particular risk by about 50 percent but costs $200,000, is it worth it? As mentioned in #Chapter 4 Getting Started A Simple Straw Man Quantitative Model, this requires knowing our return on mitigation (RoM). RoM is similar to a return on investment but for risk mitigations. To compute this we need to work out a monetized risk so that we can monetize the reduction of risk. Then we could use RoM together with the LEC and the risk tolerance curve as shown in [Exhibit 11.2].
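To make the question concrete, here is a small sketch of a return on mitigation calculation. It assumes RoM takes the same form as a simple return on investment (monetized risk reduction divided by cost, minus one) and uses a hypothetical expected loss of $1,000,000 per year. Only the 50 percent reduction and the $200,000 cost come from the text.

```python
def return_on_mitigation(expected_loss_before, expected_loss_after, cost):
    """ROI-style ratio: monetized risk reduction relative to mitigation cost, minus 1."""
    reduction = expected_loss_before - expected_loss_after
    return reduction / cost - 1


expected_loss = 1_000_000            # hypothetical monetized risk per year
after = expected_loss * 0.5          # the mitigation cuts the risk by about 50 percent
rom = return_on_mitigation(expected_loss, after, 200_000)
print(f"RoM = {rom:.0%}")
```

Under these assumptions the mitigation breaks even (RoM of zero) when the monetized risk before mitigation is $400,000, so whether it is worth $200,000 depends entirely on how large the risk is in monetary terms.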

To mitigate a risk is to moderate or alleviate a risk---to lessen it in some way. Higher risks may be deliberately accepted for bigger opportunities but even in those cases decision-makers will not want to accept more risk than is necessary. It is common in risk management circles to think of a choice among four basic alternatives for managing a given risk:

  • Avoid: We can choose not to take an action that would create an exposure of some kind. We can avoid the merger, the new technology investment, the subprime mortgage market, and so on. This effectively makes that particular risk zero but might increase risks in other areas (e.g., the lack of taking risks in R&D investments might make a firm less competitive).

  • Reduce: The manager goes ahead with the investment or other endeavors that have some risks but takes steps to lessen them. The manager can decide to invest in the new chemical plant but implement better fire-safety systems to address a major safety risk.

  • Transfer: The manager can give the risk to someone else. Insurance is the best example of this. The manager can buy insurance without necessarily taking other steps to lessen the risk of the event (e.g., buying fire insurance instead of investing in advanced fire-prevention systems). Risk can also be transferred to customers or other stakeholders by contract (e.g., a contract that states, "The customer agrees that the company is not responsible for...").

  • Retain: This is the default choice for any risk management. You simply accept the risk as it is.

I, and some risk managers I know, find the boundaries between these a little murky. A transfer of risk is a reduction or avoidance of risk to the person transferring it away. A reduction in risk is really the avoidance of particular risks that are components of a larger risk. The ultimate objective of risk management should be, after all, the reduction of the total risk to the firm for a given expected return, whether through the transfer or avoidance of risks or the reduction of specific risks. If total risk is merely retained, then it may be no different from not managing risks at all.

Risk Mitigation

Y. S. Kong was the treasurer and chief strategic planner at the HAVI Group in Illinois, a consortium of major distribution service companies operating in forty countries. Y. S. prefers to categorize risk management activities by specific risk mitigation actions he calls risk filters. "We have four sequential 'risk filters': transference, operational, insurance, and retention," explains Y. S. The first preference is to transfer risks to customers or suppliers through their contracts. The second filter, operational, is to address risks through better systems, procedures, roles, and so on. The third filter is to insure the risk (technically, this is also transferring risks). Finally, the retention of risk is not so much a filter, but where the other risks land if they don't get filtered out earlier. Even so, Y. S. as the treasurer is tasked with ensuring they have an adequate asset position to absorb any risk that ends up in this final bucket.

In the following list, I added a couple of items to Y. S.'s list and expanded on each of them to make it as general as possible. Unlike HAVI's risk filters, the order of this list does not imply a prescribed priority. Note that this is a long, but still partial, list of risk mitigation alternatives:

  • Selection processes for major exposures: This is the analysis of decisions that create new sources of potential losses to ensure that the risk being taken is justified by the expected reward. For example:

    • Risk/return analysis of major investments in technology, new products, and so on

    • Selection of loan risks for banks; accounts receivable risks for other types of firms

  • Insurance: This comes in dozens of specialized categories, but here are a few of the many general groups:

    • Insurance against loss of specific property and other assets, including fire, flood, and so on

    • Various liabilities, including product liability

    • Insurance for particular trades or transportation of goods, such as marine insurance or the launch of a communications satellite

    • Life insurance for key officers

    • Reinsurance, generally purchased by insurance companies, to help with risks that may be concentrated in certain areas (e.g., hurricane insurance in Florida, earthquake insurance in California, etc.)

  • Contractual risk transfer: Business contracts include various clauses such as "X agrees the company is not responsible for Y," including contracts with suppliers, customers, employees, partners, or other stakeholders.

  • Operational risk reduction: This includes everything a firm might do internally through management initiatives to reduce risks, including the following:

    • Safety procedures

    • Training

    • Security procedures and systems

    • Emergency/contingency planning

    • Investments in redundant and/or high-reliability processes, such as multiple IT operations sites, new security systems, and so on

    • Organizational structures or roles defining clear responsibilities for and authority over certain types of risks (a shift safety officer, a chief information security officer, etc.)

  • Liquid asset position: This is the approach to addressing the retention of risk but still attempting to absorb some consequences by using liquid reserves (i.e., cash, some inventory, etc.) to ensure losses would not be ruinous to the firm.

  • Compliance remediation: This is not so much its own category of risk mitigation because it can involve any combination of the previously mentioned items. But it is worth mentioning simply because it is a key driver for so much of current risk mitigation. This is, in part, a matter of "crossing the t's and dotting the i's" in the growing volume of regulatory requirements.

  • Legal structure: This is the classic example of limiting liability of owners by creating a corporation. But risk mitigation can take this further even for existing firms by compartmentalizing various risks into separate corporate entities as subsidiaries, or for even more effective insulation from legal liability, as completely independent spin-offs.

  • Activism: This is probably the rarest form of risk mitigation because it is practical for relatively few firms, but it is important. Successful efforts to limit liabilities for companies in certain industries have been won by advocating new legislation. Examples are the Private Securities Litigation Reform Act of 1995, which limits damage claims against securities firms; Michigan's 1996 FDA Defense law, which limits product liability for drugs that were approved by the FDA; and the Digital Millennium Copyright Act of 1998, which limits the liability of firms that provide a conduit for the transmission of data from damages that may be caused by the sources of the data.

As always, an informed risk mitigation starts with an identification and then some kind of assessment of risks. Once a risk manager knows what the risks are, steps can be taken to address them in some way. It might seem that some extremely obvious risks can be managed without much of an assessment effort (e.g., implementing full backup and recovery at a data center that doesn't have it, installing security systems at a major jewelry store, etc.). But in most environments, there are numerous risks, each with one or more potential risk mitigation strategies and a limited number of resources. We have to assess not only the initial risks but also how much the risk would change if various precautions were taken. Then those risk mitigation efforts, once chosen, have to be monitored in the same fashion and the risk management cycle can begin again (see [Exhibit 11.3]). Notice that the assessment of risks appears prior to and as part of the selection of risk mitigation methods.

Now, getting this far should be an improvement over unaided gut feel and a big improvement over methods such as the risk matrix or qualitative scoring methods. Of course, this approach still makes many big simplifying assumptions. In #Chapter 12 Improving The Model, we will review some issues worth considering when you are ready to add more realism to the model.

Notes

Chapter 12: Improving The Model

Empirical Inputs

Adding Detail To The Model

Advanced Methods For Improving Expert's Subjective Estimates

Other Monte Carlo Tools

Self-Examinations For Modelers

Notes

Chapter 13: The Risk Community: Intra- And Extra-Organizational Issues Of Risk Management

Getting Organized

Managing The Model

Incentives For A Calibrated Culture

Extraorganizational Issues: Solutions Beyond Your Office Building

Growing the Profession

Of all the professions in risk management, that of the actuary is the only one that is actually a legally recognized profession. Becoming an actuary requires a demonstration of proficiency through several standardized tests. It also means adopting a code of professional ethics enforced by some licensing body. When actuaries sign their name to the Statement of Actuarial Opinion of an insurance company, they put their license on the line. As with doctors and lawyers, if they lose their license, they cannot just get another job next door. The industry of modelers of uncertainties outside of insurance could benefit greatly from this level of professional standards.

Standards organizations, government affiliated and otherwise, have always been a key part of what makes a profession a profession. But standards organizations such as PMI, NIST, and others are all guilty of explicitly promoting the ineffectual methods previously debunked. The scoring methods developed by these institutions should be disposed of altogether. These organizations should stay out of the business of designing risk analysis methods until they begin to involve people with quantitative decision analysis backgrounds in their standards-development process. Professionals should take charge of the direction in which their profession evolves by insisting that the standards move in this direction.

Practical Observations From Trustmark

Final Thoughts On Quantitative Models And Better Decisions

Notes

Appendix: Additional Calibration Tests And Answers

Calibration Test for Ranges: A

  1. How many feet tall is the Hoover Dam?
  2. How many inches long is a $20 bill?
  3. What percentage of aluminum is recycled in the United States?
  4. When was Elvis Presley born?
  5. What percentage of the atmosphere is oxygen by weight?
  6. What is the latitude of New Orleans? [Hint: Latitude is 0 degrees at the equator and 90 degrees at the North Pole.]
  7. In 1913, the United States military owned how many airplanes?
  8. The first European printing press was invented in what year?
  9. What percentage of all electricity consumed in US households was used by kitchen appliances in 2001?
  10. How many miles tall is Mount Everest?
  11. How long is Iraq's border with Iran in kilometers?
  12. How many miles long is the Nile?
  13. In what year was Harvard founded?
  14. What is the wingspan (in feet) of a Boeing 747 jumbo jet?
  15. How many soldiers were in a Roman legion?
  16. What is the average temperature of the abyssal zone (where the oceans are more than 6,500 feet deep) in degrees F?
  17. How many feet long is the Space Shuttle Orbiter (excluding the external tank)?
  18. In what year did Jules Verne publish 20,000 Leagues Under the Sea?
  19. How wide is the goal in field hockey (in feet)?
  20. The Roman Coliseum held how many spectators?

Answers to Calibration Test for Ranges: A

  1. 726 feet
  2. 6 3/16ths (6.1417) inches
  3. 45 percent
  4. 1935
  5. 21 percent
  6. 29.95
  7. 23
  8. 1450
  9. 26.7 percent
  10. 5.5 miles
  11. 1,458 kilometers
  12. 4,160 miles
  13. 1636
  14. 196 feet
  15. 6,000
  16. 39 F
  17. 122 feet
  18. 1870
  19. 12 feet
  20. 50,000

Calibration Test for Ranges: B

  1. The first probe to land on Mars, Viking 1, landed there in what year?
  2. How old was the youngest person to fly into space?
  3. How many meters tall is the Sears Tower?
  4. What was the maximum altitude of the Breitling Orbiter 3, the first balloon to circumnavigate the globe, in miles?
  5. On average, what percentage of the total software development project effort is spent in design?
  6. How many people were permanently evacuated after the Chernobyl nuclear power plant accident?
  7. How many feet long were the largest airships?
  8. How many miles is the flying distance from San Francisco to Honolulu?
  9. The fastest bird, the falcon, can fly at a speed of how many miles per hour in a dive?
  10. In what year was the double helix structure of DNA discovered?
  11. How many yards wide is a football field?
  12. What was the percentage growth in internet hosts from 1996 to 1997?
  13. How many calories are in 8 ounces of orange juice?
  14. How fast would you have to travel at sea level to break the sound barrier (in mph)?
  15. How many years was Nelson Mandela in prison?
  16. What is the average daily calorie intake in developed countries?
  17. In 1994, how many nations were members of the United Nations?
  18. The Audubon Society was formed in the United States in what year?
  19. How many feet high is the world's highest waterfall (Angel Falls, Venezuela)?
  20. How deep beneath the sea was the Titanic found (in miles)?

Answers to Calibration Test for Ranges: B

  1. 1976
  2. 26
  3. 443 meters
  4. 6.9 miles
  5. 20 percent
  6. 350,000
  7. 803 feet
  8. 2,394 miles
  9. 200 mph
  10. 1953
  11. 53.3 yards
  12. 70 percent
  13. 120
  14. 760 mph
  15. 27
  16. 3,300 calories
  17. 184
  18. 1905
  19. 3,212 feet
  20. 3.36 miles

Calibration Test for Binary: A

  1. The Lincoln Highway was the first paved road in the United States, and it ran from Chicago to San Francisco.
  2. The Silk Road joined the two ancient kingdoms of China and Afghanistan.
  3. More American homes have microwaves than telephones.
  4. Doric is an architectural term for a shape of roof.
  5. The World Tourism Organization predicts that Europe will still be the most popular tourist destination in 2020.
  6. Germany was the second country to develop atomic weapons.
  7. A hockey puck will fit in a golf hole.
  8. The Sioux were one of the Plains Indian tribes.
  9. To a physicist, plasma is a type of rock.
  10. The Hundred Years' War was actually more than a century long.
  11. Most of the fresh water on Earth is in the polar ice caps.
  12. The Academy Awards ("Oscars") began over a century ago.
  13. There are fewer than two hundred billionaires in the world.
  14. In Excel, ^ means "take to the power of."
  15. The average annual salary of airline captains is over $150,000.
  16. By 1997, Bill Gates was worth more than $10 billion.
  17. Cannons were used in European warfare by the eleventh century.
  18. Anchorage is the capital of Alaska.
  19. Washington, Jefferson, Lincoln, and Grant are the four presidents whose heads are sculpted into Mount Rushmore.
  20. John Wiley & Sons is not the largest book publisher.

Answers for Calibration Test Binary: A

  1. FALSE
  2. FALSE
  3. FALSE
  4. FALSE
  5. TRUE
  6. FALSE
  7. TRUE
  8. TRUE
  9. FALSE
  10. TRUE
  11. TRUE
  12. FALSE
  13. FALSE
  14. TRUE
  15. FALSE
  16. TRUE
  17. FALSE
  18. FALSE
  19. FALSE
  20. TRUE

Calibration Test for Binary: B

  1. Jupiter's "Great Red Spot" is larger than Earth.
  2. The Brooklyn Dodgers' name was short for "trolley car dodgers."
  3. Hypersonic is faster than subsonic.
  4. A polygon is three-dimensional and a polyhedron is two-dimensional.
  5. A 1-watt electric motor produces 1 horsepower.
  6. Chicago is more populous than Boston.
  7. In 2005, WalMart sales dropped below $100 billion.
  8. Post-it Notes were invented by 3M.
  9. Alfred Nobel, whose fortune endows the Nobel Peace Prize, made his fortune in oil and explosives.
  10. A BTU is a measure of heat.
  11. The winner of the first Indianapolis 500 clocked an average speed of under 100 mph.
  12. Microsoft has more employees than IBM.
  13. Romania borders Hungary.
  14. Idaho is larger (in area) than Iraq.
  15. Casablanca is on the African continent.
  16. The first manmade plastic was invented in the nineteenth century.
  17. A chamois is an alpine animal.
  18. The base of a pyramid is in the shape of a square.
  19. Stonehenge is located on the main British island.
  20. Computer processors double in power every three months or less.

Answers for Calibration Test Binary: B

  1. TRUE
  2. TRUE
  3. TRUE
  4. FALSE
  5. FALSE
  6. TRUE
  7. FALSE
  8. TRUE
  9. TRUE
  10. TRUE
  11. TRUE
  12. FALSE
  13. TRUE
  14. FALSE
  15. TRUE
  16. TRUE
  17. TRUE
  18. TRUE
  19. TRUE
  20. FALSE
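One way to score tests like these is sketched below. It assumes the usual format for such exercises: each range answer is a 90 percent confidence interval, and each true/false answer comes with a stated confidence between 50 and 100 percent. The test taker's ranges and responses are hypothetical. The correct answers are the first five from Calibration Test for Ranges: A and Calibration Test for Binary: A.

```python
# Answers to the first five questions of Calibration Test for Ranges: A
answers = [726, 6.1417, 45, 1935, 21]

# Hypothetical (lower, upper) bounds supplied by a test taker
ranges = [(500, 900), (5, 7), (20, 40), (1930, 1940), (15, 30)]

# Count how many stated ranges contain the correct answer
hits = sum(low <= answer <= high for (low, high), answer in zip(ranges, answers))
print(f"{hits} of {len(answers)} ranges contain the answer")

# Answers to the first five statements of Calibration Test for Binary: A
truth = [False, False, False, False, True]

# Hypothetical responses: (answer given, stated confidence in that answer)
responses = [(False, 0.9), (True, 0.6), (False, 0.7), (False, 0.8), (True, 1.0)]

# A calibrated person's stated confidences should add up to roughly the number correct
expected_correct = sum(conf for _, conf in responses)
actual_correct = sum(given == actual for (given, _), actual in zip(responses, truth))
print(f"Expected correct: {expected_correct:.1f}; actually correct: {actual_correct}")
```

With 90 percent intervals, a well-calibrated person should capture the answer in about nine of every ten ranges. A much lower hit rate, or an expected score well above the actual score, suggests overconfidence.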

  1. R. Feynman, "Personal Observations on the Reliability of the Shuttle," Appendix IIF. In William Rogers et al., Space Shuttle Accident Report (Washington, DC: US GPO, 1986).

  2. A. Koriat, S. Lichtenstein, and B. Fischhoff, "Reasons for Confidence," Journal of Experimental Psychology: Human Learning and Memory 6 (1980): 107--18.

  3. P. Slovic, B. Fischhoff, and S. Lichtenstein, Societal Risk Assessment: How Safe Is Safe Enough? (New York: Plenum Press, 1980).

  4. The Columbia Accident Investigation Board Report, Vol. I (Washington, DC: US GPO, 2003), 121.

  5. R. Dillon and C. Tinsley, "How Near-Misses Influence Decision Making under Risk: A Missed Opportunity for Learning," Management Science 54, no. 8 (January 2008): 1425--40.

  6. M. Schrage, "Daniel Kahneman: The Leader Interview," strategy + business, https://www.strategy-business.com/article/03409?gko=7a903%2031.12.2003.

  7. M. Simkin and V. Roychowdhury, "Theory of Aces: Fame by Chance or Merit?," Journal of Mathematical Sociology 30, no. 1 (2006): 33--42.