A Modern Guide to Thinking, Fast and Slow
Part III - Overconfidence
- The Illusion of Understanding
- The Illusion of Validity
- Intuitions vs. Formulas
- Expert Intuition: When Can We Trust It?
- The Outside View
- The Engine of Capitalism
Chapter 19: The Illusion of Understanding
Overview
We like stories that make the past look straightforward and inevitable. Once we know an outcome, we treat it as if it should have been predictable, giving too much credit to skill and too little to luck. This hindsight bias can make us overconfident about how much we can predict.
Replications & Reliability
The Nixon study (Fischhoff & Beyth, 1975) and the Minnesota flood study (Kamin & Rachlinski, 1995) are both tied to specific historical events, so we wouldn’t expect exact replications. However, both are illustrations of hindsight bias, which is a well-studied effect. A 2004 meta-analysis found a moderate, reliably nonzero effect (Guilbault et al., 2004). A 2012 paper detailed how new technologies for visualizing data can heighten the bias (Roese & Vohs, 2012).
Recommendation
Treat this chapter as a general warning about hindsight bias and a lesson on noticing how knowing the outcome distorts memory and judgment.
Chapter 20: The Illusion of Validity
Overview
Much of what looks like skill in stock picking and expert prediction is an illusion. Individuals who trade frequently tend to underperform, and professional stock pickers and political forecasters tend to do no better than chance. Even in the face of this evidence, people remain highly confident in their judgments.
Replications & Reliability
- Barber & Odean’s paper showing that active trading is hazardous to your wealth (Barber & Odean, 2000) is not a lab effect that needs an exact replication. Multiple markets show the same pattern. A follow-up study that looked at the complete trading history of investors in Taiwan found that "individual investor trading results in systematic and economically large losses" (Barber et al., 2008).
- Men trade more than women (Barber & Odean, 2001): The general claim that men trade substantially more than women and, after costs, earn lower net risk-adjusted returns is supported. According to XTB, men trade nine times more than women. A 2019 experimental study found that men and women trade at different rates; however, it did not support overconfidence as the cause (Cueva et al., 2019).
- The claim that wealth advisors do no better than chance is also supported. Fama and French found that "The aggregate portfolio of actively managed U.S. equity mutual funds is close to the market portfolio, but the high costs of active management show up intact as lower returns to investors" (Fama & French, 2010). SPIVA scorecards show that few funds beat their benchmarks over the long run, and those that do rarely stay on top.
- Tetlock's 20-year project showing that political experts' predictions were no better than chance is methodologically robust (Tetlock, 2005). However, later Intelligence Community forecasting tournaments showed that when top performers are identified and placed on "superforecaster" teams with training, collaboration, and strong incentives, they deliver consistently higher accuracy across hundreds of geopolitical questions for multiple years (Mellers et al., 2015).
Recommendation
Much of the chapter holds up well: frequent trading underperforms, most active funds still lag behind the market after fees, and expert political forecasts are weak. However, it's worth noting that there are genuinely good forecasters, and while men reliably trade more than women, overconfidence may not be the sole (or even primary) explanation for that gap.
Chapter 21: Intuitions vs. Formulas
Overview
Across various domains, simple formulas often lead to more accurate predictions than clinical judgment.
Replications & Reliability
- Meehl's finding that statistical predictions beat clinical ones across domains is robust (Meehl, 1954). It has been supported by multiple meta-analyses (Grove et al., 2000; Ægisdóttir et al., 2006; Kuncel et al., 2013).
- Ashenfelter's weather-only Bordeaux wine value forecasting model can be considered reliable due to its predictive accuracy (Ashenfelter, 2008).
- Radiology inconsistency: Experienced readers show substantial intra-/inter-observer variability on the same images; this has been documented by systematic reviews, such as Brady, 2016 and Schmid et al., 2021.
- Dawes' observation that complex statistical models add little to no value over simple ones: This has been supported by a review of 97 studies across 32 papers, which found that "none of the papers provide a balance of evidence that complexity improves forecast accuracy" (Green & Armstrong, 2015).
- The Apgar score is still a widely-used and accepted tool in determining the health of newborns. Many follow-up studies, including a 2020 study that looked at 113,300 preterm infants, support its accuracy (Cnattingius et al., 2020); however, its limitations should be acknowledged (see Watterberg et al., 2015).
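To make the idea of a "simple formula" concrete, here is a minimal Python sketch of an equal-weight (unit-weight) linear rule of the kind associated with Dawes: standardize each predictor, then add the standardized values together with a sign of +1 or -1. The applicant data, the choice of predictors, and the function names `standardize` and `unit_weight_scores` are hypothetical and chosen only for illustration; none of them come from the book or the cited studies.

```python
from statistics import mean, pstdev


def standardize(values):
    """Convert raw values to z-scores (mean 0, standard deviation 1)."""
    mu = mean(values)
    sigma = pstdev(values)
    if sigma == 0:
        # A predictor that never varies carries no information.
        return [0.0 for _ in values]
    return [(v - mu) / sigma for v in values]


def unit_weight_scores(rows, signs):
    """Score each case as the signed sum of its standardized predictors.

    rows  -- list of cases, each a list of predictor values
    signs -- +1 if a higher value should raise the score, -1 if it should lower it
    """
    columns = list(zip(*rows))
    z_columns = [standardize(col) for col in columns]
    return [
        sum(sign * z for sign, z in zip(signs, case))
        for case in zip(*z_columns)
    ]


# Hypothetical applicants: [test score, years of experience, prior errors]
applicants = [
    [82, 5, 1],
    [90, 2, 4],
    [70, 8, 0],
]
scores = unit_weight_scores(applicants, signs=[+1, +1, -1])

# Indices of applicants ordered from highest to lowest score.
ranking = sorted(range(len(applicants)), key=lambda i: scores[i], reverse=True)
print(ranking)
```

The rule has no fitted weights at all, which is the point: the only judgment it needs is which predictors to include and in which direction each one counts.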
Recommendation
This chapter is strong: Meehl's claim that mechanical rules beat clinical judgment is supported by multiple meta-analyses, Ashenfelter’s weather-only Bordeaux model shows predictive accuracy, inconsistencies in radiologist evaluations are well-documented, and the Apgar score remains a valuable, widely-used checklist. The key message holds up: Across fields, simple statistical rules regularly make better predictions than professionals relying on their intuition.
Chapter 22: Expert Intuition: When Can We Trust It?
Overview
Though the previous chapter showed that formulas often beat clinical judgment, this chapter demonstrates that expert intuition can be worth trusting when the world offers stable, learnable patterns, especially when feedback is fast and clear.
Replications & Reliability
The major source cited in this chapter is "Conditions for Intuitive Expertise: A Failure to Disagree" (Kahneman & Klein, 2009). It is not an experiment but an adversarial synthesis that reconciles two research traditions and proposes when expert intuition is likely to be reliable. Because it reports no new data or effect sizes, there is nothing to replicate in the usual sense; the claim should be judged by whether it predicts real-world performance across tasks. Forecasting tournaments, which identify "superforecasters," provide nuance: while geopolitics is a low-regularity domain where unaided expert intuition typically performs poorly, structure, frequent scoring, and training can still improve accuracy. That supports the framework’s emphasis on learning and feedback. In low-regularity settings, meta-analyses continue to find that mechanical combination outperforms expert intuitive judgment (see previous section).
Recommendation
This chapter adds important balance to the previous one. The point that intuition is worth heeding when cues are stable and feedback is fast and clear helps prevent readers from overgeneralizing to "intuition is always bad."
Chapter 23: The Outside View
Overview
We tend to plan projects from the inside view, focusing on our specific plan and current progress. That often produces best-case estimates and leads to the planning fallacy: overly optimistic timelines and risk judgments. Instead, we should base estimates first on the outside view—what usually happens in similar projects (the base rate)—then adjust for the specifics of the case.
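The "base rate first, then adjust" procedure can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the reference-class ratios (actual duration divided by planned duration for similar past projects) are made-up numbers, the function name `outside_view_estimate` is hypothetical, and capping the case-specific adjustment at 20% is an arbitrary choice meant to keep the adjustment modest, not a rule from the book.

```python
from statistics import median


def outside_view_estimate(inside_estimate, reference_ratios,
                          adjustment=0.0, max_adjustment=0.2):
    """Anchor on the reference class, then adjust modestly for this case.

    inside_estimate  -- the planner's own estimate (e.g. weeks)
    reference_ratios -- actual/planned ratios from similar past projects
    adjustment       -- fractional tweak for case specifics (e.g. -0.1)
    max_adjustment   -- assumed cap that keeps the tweak small
    """
    if not reference_ratios:
        raise ValueError("need at least one reference project")
    base_ratio = median(reference_ratios)
    # Clamp the adjustment so the base rate stays the dominant input.
    bounded = max(-max_adjustment, min(max_adjustment, adjustment))
    return inside_estimate * base_ratio * (1 + bounded)


# Hypothetical reference class: similar projects overran their plans
# by these factors (1.5 means the project took 50% longer than planned).
ratios = [1.2, 1.5, 2.0, 1.1, 1.6]

baseline = outside_view_estimate(10, ratios)  # 10 weeks * median ratio 1.5
print(baseline)  # 15.0
```

The median is used here rather than the mean so that one extreme overrun in the reference class does not dominate the anchor; that is a design choice for this sketch, and a planner with a larger reference class might prefer to look at the whole distribution.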
Replications & Reliability
This chapter relies on an autobiographical case and archival/project data rather than lab experiments, but the underlying claims about the planning fallacy are well supported. A chapter in Advances in Experimental Social Psychology called "The Planning Fallacy" documents many examples of it (Buehler et al., 2010).
The "2002 homeowners’ kitchen remodel" story is a widely repeated anecdote originating from a report in Remodeling Magazine. I could not locate the original report and am unsure of its reliability.
Recommendation
The planning fallacy is well documented. The recommendation to start with base rates and then make modest adjustments is useful practical advice to avoid overly optimistic predictions. Treat the 2002 kitchen remodel story as a helpful illustration only.
Chapter 24: The Engine of Capitalism
Overview
Optimism bias—seeing the world as kinder than it is, ourselves as more capable than we are, and our plans as more achievable than they are—powers the engine of capitalism since it motivates founders and executives to take bold risks. However, it often leads to losses. When we predict outcomes based on skill alone while discounting the role of luck and competitors' actions, we neglect base rates and fall into the illusion of control. Optimism is a gift in that it fuels persistence and action, but it can lead to costly mistakes.
Replications & Reliability
- The research showing that 47% of inventors continued with their projects even after they were given a highly reliable prediction of failure (Åstebro & Gerchak, 2001) is field evidence, so a useful follow-up would not be a lab replication. A re-analysis on new programs with preregistered methods would be interesting, but I'm not aware of any.
- The finding that mergers undertaken by highly optimistic CEOs resulted in more negative market reactions (Malmendier & Tate, 2007) has no direct follow-ups; however, a recent meta-analysis found that CEO overconfidence is on average slightly beneficial for firm performance, partly because it drives bolder, higher-variance strategies that sometimes pay off (Burkhard et al., 2022).
- The Duke CFO survey showing that CFOs had no ability to predict the market is robust: it draws on a large 10-year (40-quarter) panel of 13,300+ forecasts (Ben-David et al., originally published in 2010). A follow-up that looked at 28,400 S&P 500 return forecasts reached the same conclusion (Ben-David et al., 2024).
- The claim that the Royal Dutch Shell geologists became less overconfident in their assessments of possible drilling sites after learning about multiple past cases comes from a 1992 study (Russo & Schoemaker, 1992). While it's not a randomized controlled trial, it’s credible field evidence of case-based training helping to reduce overconfidence. A more recent study showed that calibration training helped reduce the bias of intelligence analysts, though the training did not rely on case studies (2024 study).
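One common way to put a number on overconfidence of this kind is to check how often a forecaster's stated confidence intervals actually contain the realized outcome. The Python sketch below assumes each forecast is an 80% interval, so a well-calibrated forecaster would land near a 0.8 hit rate. The intervals, the outcomes, and the function name `interval_hit_rate` are all hypothetical; they are not data from the Duke survey or the Shell study.

```python
def interval_hit_rate(intervals, outcomes):
    """Return the fraction of outcomes that fall inside their forecast interval.

    intervals -- list of (low, high) bounds, one per forecast
    outcomes  -- list of realized values, in the same order
    """
    if len(intervals) != len(outcomes):
        raise ValueError("intervals and outcomes must have the same length")
    if not intervals:
        raise ValueError("need at least one forecast")
    hits = sum(
        1
        for (low, high), actual in zip(intervals, outcomes)
        if low <= actual <= high
    )
    return hits / len(outcomes)


# Hypothetical 80% intervals for annual returns (in percent) and what happened.
forecast_intervals = [(-5, 10), (0, 8), (2, 12), (-3, 6), (1, 9)]
realized_returns = [12, 4, -7, 15, 5]

hit_rate = interval_hit_rate(forecast_intervals, realized_returns)
print(hit_rate)  # 0.4
```

A hit rate well below the stated confidence level, as in this made-up example, means the intervals were too narrow, which is the signature of overconfidence that calibration training aims to reduce.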
Recommendation
This chapter's overall message is reliable: optimism and overconfidence show up in various settings, and bias can be reduced with training. Some claims are context-bound or mixed: CEO overconfidence may be more beneficial than harmful. Treat the Shell story as an example of how training can help to reduce bias, but note that learning about past cases isn't the only way.