Quick Take: Simple Heuristics That Make Us Smart is a collection of academic essays demonstrating the comparative value of decision making based on good-enough information. The examples and anecdotes are good, but there is complex math to wade through. It isn’t a leisure read. However, each section can be consumed on its own. If you’re a student of decision making, whether in group dynamics or individual situations, then this book is a good heuristics reference.
Detail Review: Many of us have a comfortable chair that serves as our place to relax. It’s great for 40 winks. But why do we relish peacefully falling asleep in a chair? Most of the time it’s because we are mentally exhausted. Every day we are faced with an ever-changing list of choices to make, and each has a list of known variables and all kinds of factors that are unknown. We try to streamline choices that have worked so we don’t need to concentrate on them. I take the same route to work every day even though there are probably another ten ways to get there, for instance.
I wish I had a computer in my head to compute all the different inputs into making a decision. I could continually collect data and analyze it to practically 100% decision certainty. But I don’t have a computer or unlimited time; instead I rely on heuristics. Heuristics are simple methods that use particular cues and constraints to make a choice. Gerd Gigerenzer, Peter M. Todd, and the ABC Research Group authored this tome as a study of how accurate specific heuristics are.
Here are a few heuristics covered in the book:
Recognition
Definition – If one of two objects is recognized and the other is not, then infer that the recognized object has the higher value with respect to the criterion.
Example – If I ask 100 Americans which city in Germany is more populated, Berlin or Saarburg, the results will be close to 100% correct – Berlin is more populated. Of the 100 people, few, if any, will recognize Saarburg as a city, but practically all of them will have heard of Berlin. Because of that recognition they will answer Berlin, even though they know little about the actual number of people who live in either city.
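Expressed as code, the recognition heuristic is just one comparison against what you recognize. Here is a minimal Python sketch; the function name and the sample recognition set are my own illustration, not from the book:

```python
def recognition_heuristic(a, b, recognized):
    """If exactly one of the two objects is recognized, infer that the
    recognized one scores higher on the criterion (e.g., population)."""
    if a in recognized and b not in recognized:
        return a
    if b in recognized and a not in recognized:
        return b
    return None  # heuristic does not apply: both or neither recognized

# Illustrative use: a typical American respondent who has only heard of Berlin
recognized = {"Berlin"}
print(recognition_heuristic("Berlin", "Saarburg", recognized))  # -> Berlin
```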
Take the Best
Definition – When making a judgment based on multiple cues, the cues are tried one at a time in order of their validity, and a decision is made based on the first cue that discriminates between the alternatives.
Example – Suppose we ask the question about population again, but instead of Saarburg we use Frankfurt. Berlin and Frankfurt are both recognizable, so we must use other cues to discriminate population. We pose a list of usual indicators of a large population – historical relevance, being a capital, tourism, sports teams, and so on – and rank them by how strongly each usually indicates population. We compare Frankfurt and Berlin on tourism and realize that Berlin is much more of a destination than Frankfurt. We stop there and don’t review the other cues. We take the best separator – tourism – and invest no more time in evaluating. Berlin is the answer.
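A minimal Python sketch of Take the Best, assuming cues are already ordered by validity; the cue names, values, and validity order below are invented for illustration, not data from the book:

```python
def take_the_best(a, b, cues):
    """Cues are (name, value_fn) pairs ordered from highest to lowest validity.
    The first cue that discriminates between a and b decides."""
    for name, value in cues:
        if value(a) != value(b):
            return (a if value(a) > value(b) else b), name
    return None, None  # no cue discriminates

# Invented binary cue values (not data from the book)
facts = {
    "Berlin":    {"top_league_team": 1, "tourism": 1},
    "Frankfurt": {"top_league_team": 1, "tourism": 0},
}
cues = [  # assumed validity order, highest first
    ("top_league_team", lambda c: facts[c]["top_league_team"]),  # ties, so move on
    ("tourism",         lambda c: facts[c]["tourism"]),          # discriminates
]
print(take_the_best("Berlin", "Frankfurt", cues))  # -> ('Berlin', 'tourism')
```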
Take the Last
Definition – When making a judgment based on multiple cues, the cues are tried in the order of what worked last time. It uses memory of prior problem-solving instances and works from what was successful before.
Example – I’m now comparing Frankfurt and Munich on population. I’ve heard of both, so I can’t use Recognition. I try tourism first, since it worked for Berlin and Frankfurt. This time I go with Munich because it has hosted an Olympics and is more of a destination than Frankfurt. The answer is correct, and time and energy were saved because I didn’t need to sort through all the other cues.
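A similar sketch for Take the Last, which simply promotes whichever cue discriminated in the previous problem; again, the cue values are invented for illustration:

```python
def take_the_last(a, b, cues, last_used=None):
    """Try the cue that discriminated last time first, then fall back to the
    remaining cues in their stored order (a sketch, not the book's code)."""
    ordered = sorted(cues, key=lambda cue: cue[0] != last_used)  # last_used sorts first
    for name, value in ordered:
        if value(a) != value(b):
            return (a if value(a) > value(b) else b), name
    return None, None

# Invented binary cue values (not data from the book)
facts = {
    "Frankfurt": {"tourism": 0, "hosted_olympics": 0},
    "Munich":    {"tourism": 1, "hosted_olympics": 1},
}
cues = [
    ("hosted_olympics", lambda c: facts[c]["hosted_olympics"]),
    ("tourism",         lambda c: facts[c]["tourism"]),
]
# Tourism worked for Berlin vs. Frankfurt, so it is tried first here.
print(take_the_last("Frankfurt", "Munich", cues, last_used="tourism"))  # -> ('Munich', 'tourism')
```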
In addition to those, there are:
- Franklin’s Rule – calculates for each alternative the sum of the cue values multiplied by the corresponding cue weights (validities) and selects the alternative with the highest score (a code sketch of this rule and Dawes’s Rule appears after this list).
- Dawes’s Rule – calculates for each alternative the sum of the cue values (multiplied by a unit weight of 1) and selects the alternative with the highest score.
- Good Features (Alba & Marmorstein, 1987) selects the alternative with the highest number of good features. A good feature is a cue value that exceeds a specified cutoff.
- Weighted Pros (Huber, 1979) selects the alternative with the highest sum of weighted “pros.” A cue that has a higher value for one alternative than for the others is considered a pro for this alternative. The weight of each pro is defined by the validity of the particular cue.
- LEX or lexicographic (Fishburn, 1974) selects the alternative with the highest cue value on the cue with the highest validity. If more than one alternative has the same highest cue value, then for these alternatives the cue with the second highest validity is considered, and so on. LEX is a generalization of Take the Best.
- EBA or Elimination by Aspects (Tversky, 1972) eliminates all alternatives that do not exceed a specified value on the first cue examined. If more than one alternative remains, another cue is selected. This procedure is repeated until only one alternative is left. Each cue is selected with a probability proportional to its weight. In contrast to this probabilistic selection, in the chapter the order in which EBA examines cues is determined by their validity, so that in every case the cue with the highest validity is used first.
- Multiple Regression is a statistical analysis of how the typical value of the dependent variable changes when any one of the independent variables is varied while the other independent variables are held fixed. This is beyond the capacity of a normal human and usually requires resources like a computer.
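As referenced above, the weighted- and unit-weight rules reduce to a few lines of code. Here is a minimal Python sketch of Franklin’s Rule and Dawes’s Rule; the cue values and validities are invented for illustration, not data from the book:

```python
def franklins_rule(alternatives, cue_values, weights):
    """Score each alternative as the weighted sum of its cue values
    (weights = cue validities) and pick the highest scorer."""
    return max(alternatives,
               key=lambda alt: sum(w * v for w, v in zip(weights, cue_values[alt])))

def dawess_rule(alternatives, cue_values):
    """Dawes's Rule: the same sum with every weight set to 1."""
    n_cues = len(next(iter(cue_values.values())))
    return franklins_rule(alternatives, cue_values, [1] * n_cues)

# Invented binary cue values (capital?, tourism?, top-league team?) and validities
cue_values = {"Berlin": [1, 1, 1], "Frankfurt": [0, 1, 1]}
weights = [0.9, 0.8, 0.7]  # assumed validities, not from the book
print(franklins_rule(["Berlin", "Frankfurt"], cue_values, weights))  # -> Berlin
print(dawess_rule(["Berlin", "Frankfurt"], cue_values))              # -> Berlin
```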
The book uses the city example to run a test comparing a few heuristics against multiple regression (which is computing intensive). The results are startling when you consider the number of cues needed to reach the decision (a low number for Take the Best and Take the Last and a high number for the other three).
Here’s a chart showing relative performance for this particular case study:
As you can see, Take the Best and regression analysis perform very similarly. This means that if you pick the right heuristic for the situation, you can save time and resources and still get comparable performance for the trade-off (time and energy).
So what does this mean? Sometimes it’s the difference between life and death.
A man is rushed to a hospital in the throes of a heart attack. The doctor needs to decide quickly whether the victim should be treated as a low-risk or a high-risk patient. He is at high risk if his life is truly threatened, and should receive the most expensive and detailed care. Although this decision can save or cost a life, the doctor does not have the luxury of extensive deliberation: she or he must decide under time pressure using only the available cues, each of which is, at best, merely an uncertain predictor of the patient’s risk level. For instance, at the University of California, San Diego Medical Center, as many as 19 such cues, including blood pressure and age, are measured as soon as a heart attack patient is admitted. Common sense dictates that the best way to make the decision is to look at the results of each of those measurements, rank them according to their importance, and combine them somehow into a final conclusion, preferably using some fancy statistical software package.

Consider in contrast the simple decision tree below, which was designed by Breiman and colleagues to classify heart attack patients according to risk using only a maximum of three variables. A patient who has a systolic blood pressure of less than 91 is immediately classified as high risk – no further information is needed. Otherwise, the decision is left to the second cue, age. A patient under 62.5 years old is classified as low risk; if he or she is older, one more cue (sinus tachycardia) is needed to classify the patient as high or low risk. Thus, the tree requires the doctor to answer a maximum of three yes/no questions to reach a decision rather than to measure and consider 19 predictors, letting life-saving treatment proceed sooner.
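That tree translates directly into three nested yes/no questions. Here is a sketch of the Breiman et al. classification as described in the passage above; the thresholds are the ones quoted, while the function name and sample inputs are my own illustration:

```python
def classify_heart_attack_patient(systolic_bp, age, sinus_tachycardia):
    """Three-question decision tree: at most three cues yield
    a high- or low-risk classification."""
    if systolic_bp < 91:
        return "high risk"       # first cue alone decides
    if age < 62.5:
        return "low risk"        # second cue decides
    return "high risk" if sinus_tachycardia else "low risk"  # third cue

print(classify_heart_attack_patient(systolic_bp=85,  age=70, sinus_tachycardia=False))  # -> high risk
print(classify_heart_attack_patient(systolic_bp=120, age=55, sinus_tachycardia=True))   # -> low risk
```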

To wrap up, the book has many interesting essays as chapters, covering topics ranging from bicycle races and hindsight bias to ants, mate selection, and bargaining. It’s a solid 365 pages with a small font. The math and the science can be dense, but the applicability of the results is real. It doesn’t sugarcoat what goes into making heuristics worthwhile – a lot of up-front analysis. It does, however, show how powerful those paths or decision trees can be once they are implemented.
Gerd Gigerenzer has other books that are probably more digestible for the heuristically curious (Gut Feelings: The Intelligence of the Unconscious and Calculated Risks: How to Know When Numbers Deceive You), but if you’re into behavior and why particular decision paths are more economical than others, then this book is a good educational read.