---
tags:
- topic/estimating
- topic/risk
- type/encyclopedia-entry
title: Estimator Calibration
---
# Estimator Calibration
Calibration is the process of learning to compensate for one's biases
in order to produce more accurate estimates.
Generally speaking, people tend to underestimate [[risk]]
and tend to be overconfident in their estimates.
> [!note] Confidence
> **Confidence** is an estimate of the accuracy of another estimate.
> To be "overconfident" is to consistently rate one's confidence
> above the observed accuracy of one's estimates.
An estimator who is well calibrated
can properly account for such bias.
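
As a rough illustration of the definition above, the sketch below groups answers by stated confidence and compares each group's stated confidence with its observed accuracy. The function name `calibration_table` and the sample responses are hypothetical, invented only for this example; a positive gap suggests overconfidence at that confidence level.

```python
from collections import defaultdict


def calibration_table(responses):
    """Compare stated confidence with observed accuracy.

    `responses` is an iterable of (stated_confidence, was_correct) pairs.
    Returns {stated_confidence: (observed_accuracy, gap)}, where
    gap = stated_confidence - observed_accuracy.
    """
    buckets = defaultdict(list)
    for confidence, was_correct in responses:
        buckets[confidence].append(was_correct)

    table = {}
    for confidence, outcomes in sorted(buckets.items()):
        accuracy = sum(outcomes) / len(outcomes)
        # Positive gap: overconfident. Negative gap: underconfident.
        table[confidence] = (accuracy, confidence - accuracy)
    return table


# Hypothetical responses, not real data.
responses = [
    (0.9, True), (0.9, False), (0.9, True), (0.9, False),
    (0.6, True), (0.6, True), (0.6, False), (0.6, True), (0.6, False),
]
table = calibration_table(responses)
```

With so few answers per bucket the gaps are noisy; in practice many more responses per confidence level would be needed before drawing conclusions.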
## Calibration Questions
Calibration is generally achieved
by having the estimator make many estimates,
then immediately observe their results.
This is repeated in rounds of question and review
until the desired results are achieved.
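
One way to review a round of interval questions is to count how often the true value fell inside the estimator's stated range. The sketch below assumes the estimator was asked for 90% confidence intervals, which is an assumption of this example and not something the text specifies; `interval_hit_rate` and the sample answers are likewise hypothetical.

```python
def interval_hit_rate(answers):
    """Return the fraction of intervals that contain the true value.

    `answers` is a list of (lower_bound, upper_bound, true_value) tuples.
    """
    if not answers:
        raise ValueError("at least one answer is required")
    hits = sum(1 for low, high, actual in answers if low <= actual <= high)
    return hits / len(answers)


# Hypothetical round of four interval answers: (low, high, actual).
round_one = [
    (10, 50, 36),
    (1, 3, 4),
    (100, 500, 442),
    (10, 20, 25),
]

TARGET = 0.90  # assumed target: 90% confidence intervals
rate = interval_hit_rate(round_one)
needs_wider_intervals = rate < TARGET
```

A hit rate well below the target suggests the intervals are too narrow, which is the typical signature of overconfidence; a hit rate well above it suggests they are wider than necessary.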
### Writing Good Calibration Questions
#### Ideal Difficulty
Calibration requires that the estimator have some confidence,
but not total certainty, in their response.
> [!failure] Bad
> T/F: When rolling 2 dice, a roll of 7 is more likely than a 3.
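
The dice statement can be settled by enumeration, which is presumably why it makes a poor calibration question: an estimator who knows the method has total certainty. A minimal sketch, assuming two fair six-sided dice:

```python
from collections import Counter
from fractions import Fraction
from itertools import product

# Count how many of the 36 equally likely outcomes produce each sum.
sums = Counter(a + b for a, b in product(range(1, 7), repeat=2))

p_seven = Fraction(sums[7], 36)  # six ways to roll a 7
p_three = Fraction(sums[3], 36)  # two ways to roll a 3
seven_more_likely = p_seven > p_three
```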
#### No "Trick" Questions
Questions should be unambiguously verifiable.
> [!failure] Bad
>
> > T/F: Any male pig is referred to as a hog.
>
> Referred to by whom?
> [!failure] Bad
>
> > T/F: In English, the word "quality" is more frequently used than the word "speed".
>
> Used more frequently where?
> [!success] Good
>
> > T/F: Pakistan shares a border with Russia.
> [!tip]
> Definitions, terminology, and language are _always_ contentious;
> questions based on them always feel deceptive.
#### Phrasing
Interval "questions" should describe the quantity
rather than phrase it as a question.
> [!failure] Bad
> Q: How many gold medals did Jesse Owens win at the 1936 Berlin Olympics?
> [!success] Good
> Q: Number of gold medals won by Jesse Owens in the 1936 Berlin Olympics
### Strategy for Answering Calibration Questions
Confidence should never be less than the probability of picking the correct answer at random
(50% for true/false questions).
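
The reasoning behind this rule is that a confidence below 50% in "true" is the same as a confidence above 50% in "false", so the estimator should simply switch answers. A minimal sketch of that adjustment follows; the helper `normalize_true_false` is hypothetical and not part of any tool referenced here.

```python
from typing import Tuple


def normalize_true_false(answer: bool, confidence: float) -> Tuple[bool, float]:
    """Restate a true/false answer so its confidence is at least 50%.

    If the stated confidence is below 0.5, flip the answer and use the
    complementary confidence instead.
    """
    if not 0.0 <= confidence <= 1.0:
        raise ValueError("confidence must be between 0 and 1")
    if confidence < 0.5:
        return (not answer, 1.0 - confidence)
    return (answer, confidence)


# "True, but I'm only 25% sure" is better stated as "False, 75% sure".
flipped = normalize_true_false(True, 0.25)
unchanged = normalize_true_false(False, 0.8)
```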
### Examples
#### Boolean
> The melting point of tin is higher than the melting point of aluminum.
> California's giant sequoia trees are named for an early 19th century leader of the Cherokee Indians.
reductive
> The Model T was the first car produced by Henry Ford.
reductive (Henry Ford didn't produce cars)
> No one has ever been reported to have been hit by any object that fell from space.
reductive (reported by whom?)
> Sir Christopher Wren was a British anthropologist.
> Pakistan does not border Russia.
unnecessary negative form, otherwise good.
> The Navy won the first Army-Navy football game.
perfect.
> The paperback version of the book "The Da Vinci Code",
> as of July 2007, still ranks in the top 500 bestselling books on Amazon.
obtuse phrasing, dated topic, otherwise good
> Italian has more words than any other language.
reductive (what is a word? what dialect?)
> The month of August is named after a Greek god.
borderline facile, reductive
> The deepest ocean trench is deeper than the Grand Canyon.
facile
> Abraham Lincoln was the first president born in a log cabin.
deceptive phrasing
> As of July of 2007, more people search Google for "Harry Potter" than "Hillary Clinton"
> (according to GoogleTrends).
obtuse phrasing, dated topic, otherwise good
> The population of Alabama is higher than the population of Arizona.
borderline facile, deceptive phrasing
> No category 5 hurricane hit the US in 2004.
> The UK is among the top 10 largest economies in the world (by GDP).
> The movie Forrest Gump has grossed more to date than E.T. The Extra Terrestrial.
obtuse phrasing, dated topic, otherwise good
#### Interval
> What percentage of bronze is typically made of copper?
As written, the correct answer is 100%.
All bronze alloys contain copper.
Being more generous,
there is no standard composition of bronze.
The subject can only guess at the interrogator's intent.
Average by weight produced?
Over what time frame?
In the U.S. or globally?
> How many countries have at least one McDonald's?
As of when?
> How many employees did eBay have in the first quarter of 2006?
> What was the population of Miami (within the city limits, not the entire metropolitan area) in 1990?
> How many casualties did the French suffer in the Battle of Waterloo?
> What is the range in miles of a Minuteman Missile?
> What percentage of IT jobs in the US were unfilled in 1997?
> The Supremes' (with Diana Ross) song "Stop! In the Name of Love" was how long? (minutes, seconds)
> How many undergraduates attended Cambridge in 1990?
> If you could jump 50 feet straight up into the air, how many seconds would you be airborne before you landed?
> How many gallons are in a bushel (they are both measures of volume)?
I wonder if Hubbard had the same thought I did
while reading [[macgregor_1994_judgmental-decomposition|MacGregor et al. (1994)]].
> How many sovereign rulers has England had in the last thousand years?
> If the air temperature was 5 degrees below zero (Fahrenheit) and the wind speed was 15 mph, what would the temperature adjusted for wind-chill be?
> Average cost of testing in software development is what percentage of total project costs?
> On average, if a software development project was projected to take 17 months, it actually takes how many months?
> How many meters tall is the Sears Tower?
> How many gold medals did Jesse Owens win at the 1936 Berlin Olympics?
> In 2005, the average combined MPG for all US cars and light trucks on the road was how much?
> The average house in the United States uses how many gallons of water per day?
> What was the average price in the United States of a house sold in 2001?