Book summary: How not to be wrong by Jordan Ellenberg
The book focuses on the application of simple and profound maths to day-to-day life and how not to be deceived by mathematical traps.
Dividing one number by another is mere computation. Figuring out what to divide is mathematics.
The missing bullet holes
During the Second World War, the American military was trying to decide where to add more armor to its planes. Armor could not go everywhere, since the extra weight burns more fuel. The air force reasoned that the armor should go where the planes were hit most often, and to find those spots it examined the damaged planes that returned to base. Abraham Wald, a mathematician, argued the opposite. Few of the returning planes had damaged engines, and that was because planes hit in the engine mostly did not make it back, so the engines were exactly where the extra armor belonged. We still see this survivorship bias today. One study found that large blend funds grew by 10.8% annually between 1995 and 2004, but once the dead funds are included, the rate falls to a more realistic 8.9% per year.
Lesson: Challenge your assumptions, and watch especially for survivorship bias.
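The fund example can be sketched in a few lines of Python. The six funds and their returns below are invented for illustration (they are not the figures from the study); the point is only that averaging over survivors alone inflates the result.

```python
# Hypothetical annual returns (%) for six funds; all numbers are made up.
funds = [
    {"name": "A", "annual_return": 12.0, "survived": True},
    {"name": "B", "annual_return": 10.0, "survived": True},
    {"name": "C", "annual_return": 11.0, "survived": True},
    {"name": "D", "annual_return": 2.0, "survived": False},
    {"name": "E", "annual_return": -4.0, "survived": False},
    {"name": "F", "annual_return": 9.0, "survived": True},
]


def mean_return(rows):
    """Average annual return across the given funds."""
    return sum(r["annual_return"] for r in rows) / len(rows)


# What you see if you only look at funds that still exist.
survivors_only = mean_return([f for f in funds if f["survived"]])
# What you see once the closed ("dead") funds are counted too.
all_funds = mean_return(funds)

print(f"survivors only: {survivors_only:.2f}%")
print(f"all funds:      {all_funds:.2f}%")
```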
People mistakenly (and implicitly) assume that all lines are straight: if obesity increased by 1% last year, then the same will happen in each coming year. One paper did exactly that and concluded that 100% of Americans will be obese by 2048. Applying the same false linearity to the same data also yields the conclusion that only 80% of black men will be obese by then, and the two results cannot both be true.
Lesson: Don’t assume that the rate of change of a quantity will remain the same over time. It will if the relationship is linear, but linearity is an assumption that has to be justified.
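A minimal sketch of why a straight line eventually gives nonsense. The starting value (70% in 2008) and the slope (0.75 points per year) are assumptions chosen for illustration, not the numbers from the paper.

```python
def linear_extrapolation(start_year, start_value, slope_per_year, year):
    """Project a value forward by assuming it changes by a fixed amount each year."""
    return start_value + slope_per_year * (year - start_year)


# Hypothetical inputs: 70% in 2008, rising 0.75 percentage points per year.
for year in (2008, 2028, 2048, 2068):
    projected = linear_extrapolation(2008, 70.0, 0.75, year)
    flag = "  <- impossible, above 100%" if projected > 100 else ""
    print(f"{year}: {projected:.1f}%{flag}")
```

The line hits 100% in 2048 and keeps going, which is a sign that the linear model was never valid that far out.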
Noise in small data sets
Small datasets are noisier than large ones. When brain cancer rates per capita are listed for the 50 US states, South Dakota ranks among the highest while North Dakota ranks among the lowest. The same thing happened in the North Carolina school testing system where, as one paper concluded, small schools ended up being the best or the worst more often than bigger ones, since a few extremes (prodigies or slackers) caused a wide swing in the average.
Lesson: When looking at a conclusion drawn from a dataset, check how resistant that dataset is to noise.
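The school effect is easy to reproduce in a simulation. In this sketch every student, in every school, is drawn from the same score distribution (mean 50, standard deviation 10, both assumed), so any difference between schools is pure noise. The school sizes of 20 and 2000 are also assumptions.

```python
import random

random.seed(0)  # fixed seed so the run is repeatable


def school_average(num_students):
    """Average score of a school whose students all come from one distribution."""
    return sum(random.gauss(50, 10) for _ in range(num_students)) / num_students


def spread(values):
    """Gap between the best and the worst school average."""
    return max(values) - min(values)


small_schools = [school_average(20) for _ in range(200)]
large_schools = [school_average(2000) for _ in range(200)]

print(f"small schools: best-to-worst gap = {spread(small_schools):.2f}")
print(f"large schools: best-to-worst gap = {spread(large_schools):.2f}")
```

The small schools occupy both the top and the bottom of the ranking even though no school is actually better than any other.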
Percentages of Negative numbers
Consider a decade in which the technology sector gained 2 million jobs, finance gained 0.6 million, and manufacturing lost 2 million. One can add the numbers up, find a net gain of 0.6 million jobs, and claim that finance accounts for 100% of the total job growth!
Negative numbers don’t play well with percentages. The computation works, but the interpretation is wrong. Percentages are fine for expenses, population, and similar quantities that are usually positive, but not for quantities like profits or job counts, which swing between positive and negative values.
Lesson: Don’t use percentages with datasets that mix positive and negative numbers.
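Running the numbers from the jobs example shows how odd the "shares" become. This is a small sketch using only the three figures given above.

```python
# Job changes over the decade, in millions (figures from the example above).
changes = {"technology": 2.0, "finance": 0.6, "manufacturing": -2.0}

net = sum(changes.values())

# Each sector's "share" of the net change, as a percentage.
shares = {sector: 100 * delta / net for sector, delta in changes.items()}

for sector, share in shares.items():
    print(f"{sector}: {share:.0f}% of net job growth")
```

Finance comes out at 100%, but technology comes out at over 300% and manufacturing at below -300%. The shares still add up to 100%, yet none of them means what a percentage usually means.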
Patterns created by noise
Humans have an uncanny ability to see patterns generated purely by noise (ashishb’s note: See “Fooled by Randomness“). In 2009, a published paper demonstrated that even a dead salmon’s brain activity (noise) can correlate with human emotions, provided you divide the brain into enough parts and leave yourself the wiggle room to pick a particular part after recording the activity. Large random datasets are bound to exhibit some patterns generated by noise. Bible code believers made a similar mistake when they found patterns by selectively taking every fourth letter of certain passages to generate predictions. Financial companies incubate funds internally and, after a few years, close the failed ones and open the successful ones to public investment. Basketball fans believe in hot hands even though there is no statistical basis for them.
Human beings are quick to perceive patterns where none exist. One safeguard is the p-value. First, define what pure chance would look like (the “null hypothesis”), then perform the experiment. The p-value is the probability of seeing an outcome at least as extreme as the observed one if the null hypothesis were true, and if it is very low, the pattern is taken to be real. Generally, a p-value below 0.05 (5%) is considered “significant”. Do note that p below 0.05 can also happen by chance, and that is what leads to p-hacking in medical papers.
Expected value is the average value of a random variable over a large number of trials. MIT students figured out that a state lottery ticket had a higher expected value than its price, and they made a profit on it.
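As a rough sketch of both ideas, take a hypothetical coin-flip experiment (not from the book): the null hypothesis is that the coin is fair, and we observe 16 heads in 20 flips. The second part assumes 100 independent tests run on pure noise, in the spirit of the dead-salmon study.

```python
from math import comb


def binomial_p_value(heads, flips):
    """One-sided p-value: chance of at least `heads` heads in `flips` fair flips."""
    return sum(comb(flips, k) for k in range(heads, flips + 1)) / 2 ** flips


# Hypothetical observation: 16 heads out of 20 flips.
p = binomial_p_value(16, 20)
print(f"p-value for 16/20 heads: {p:.4f}")

# If you run many independent tests on pure noise, some will look significant.
num_tests = 100  # assumed number of brain regions, letter spacings, funds, ...
at_least_one = 1 - 0.95 ** num_tests
print(f"chance of at least one p < 0.05 in {num_tests} noise-only tests: {at_least_one:.3f}")
```

A single test with a low p-value is informative; the best result picked from a hundred tests is not, unless the threshold is adjusted for the number of tests.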
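Expected value is a one-line computation: weight each payoff by its probability and add them up. The ticket price, prizes, and probabilities below are invented and are not the actual odds of the lottery the MIT students played.

```python
def expected_value(outcomes):
    """Sum of probability * payoff over all possible outcomes."""
    return sum(prob * payoff for prob, payoff in outcomes)


# Hypothetical lottery ticket: (probability, payoff in dollars).
ticket_price = 2.0
outcomes = [
    (0.001, 2000.0),  # jackpot
    (0.050, 10.0),    # small prize
    (0.949, 0.0),     # nothing
]

ev = expected_value(outcomes)
print(f"expected value per ticket: ${ev:.2f} (price ${ticket_price:.2f})")
```

When the expected value exceeds the price, buying many tickets is profitable on average, though any single ticket is still most likely to win nothing.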
Risk vs Uncertainty
Risk is quantifiable; uncertainty is not. If an urn holds 90 balls, 30 of them red and the rest yellow or black in unknown proportion, and you pull one ball out, the risk of it not being red is 2/3, but the chance of pulling a black ball is unquantifiable. Decision and utility theory can work with risks. People prefer risky scenarios to uncertain ones.
Regression to Mean
Any outcome influenced by chance regresses to the mean. For example, really tall parents usually have kids who are not quite as tall as they are. The best businesses of an era, which are also lucky, lose their luck over time and regress towards mediocrity.
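Regression to the mean falls out of any model where results are part skill and part luck. In this sketch both parts are drawn from a standard normal distribution; that equal weighting, the population of 10,000, and the top-100 cutoff are all assumptions.

```python
import random

random.seed(1)  # fixed seed so the run is repeatable

N = 10_000
skill = [random.gauss(0, 1) for _ in range(N)]

# Each round's result = fixed skill + fresh luck.
round1 = [s + random.gauss(0, 1) for s in skill]
round2 = [s + random.gauss(0, 1) for s in skill]

# Pick the 100 best performers of round 1 and see how they do in round 2.
top = sorted(range(N), key=lambda i: round1[i], reverse=True)[:100]
avg1 = sum(round1[i] for i in top) / len(top)
avg2 = sum(round2[i] for i in top) / len(top)

print(f"top 100 in round 1: average {avg1:.2f}")
print(f"same 100 in round 2: average {avg2:.2f}")
```

The round 1 stars are still above average in round 2, because their skill is real, but they fall back towards the mean, because their round 1 luck does not carry over.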
Correlation is not transitive
Correlation is like a blood relation. A father is related to his son by blood, and the son is related to his mother by blood, but it is wrong to conclude that the father and mother are related to each other by blood. Niacin (vitamin B3) is correlated with higher HDL “good cholesterol”, and higher HDL is correlated with better heart health, but taking niacin has no noticeable impact on heart health. In fact, the net correlation can be negative. In women, while higher estrogen levels correlate with a lower risk of heart disease, hormone replacement therapy with estrogen and progestin increases the risk of heart disease. Rich Americans tend to live in rich states, and rich states tend to vote Democratic, but rich Americans, on the whole, do not vote for Democrats.
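A tiny constructed example makes the father-son-mother picture concrete. The three lists below are made up so that `b` (the "son") is the sum of `a` and `c` (the "parents"), and `a` and `c` are unrelated. The `pearson` helper is written out by hand so the sketch has no dependencies.

```python
from math import sqrt


def pearson(xs, ys):
    """Pearson correlation coefficient of two equal-length lists."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / sqrt(var_x * var_y)


a = [1, -1, 1, -1]                  # "father"
c = [1, 1, -1, -1]                  # "mother"
b = [x + y for x, y in zip(a, c)]   # "son" inherits from both

print(f"corr(a, b) = {pearson(a, b):.3f}")
print(f"corr(b, c) = {pearson(b, c):.3f}")
print(f"corr(a, c) = {pearson(a, c):.3f}")
```

Both `a` and `c` correlate positively with `b`, yet the correlation between `a` and `c` is zero.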
Uncorrelated does not imply unrelated
Uncorrelated variables simply lack the kind of relationship that correlation measures. They can still be related in other ways. A study designed to see whether better-informed American voters lean Democratic or Republican found no correlation between being informed and voting for either party, which is correct. It turns out, however, that the better-informed voters were more polarized in their beliefs: uncorrelated but related.
Berkson’s fallacy shows that a spurious correlation between independent events can appear because of selection bias in a study. For example, if 30% of people suffer from diabetes and 40% suffer from heart disease, and everyone with either condition ends up in the hospital, then a statistical analysis of hospital patients will conclude that diabetes is negatively correlated with heart disease even though no such correlation exists in the general population.
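The voter study can be mimicked with six made-up voters. Here `informed` is a hypothetical score for how informed a voter is, and `position` is a hypothetical left-right position (negative for one party, positive for the other) whose magnitude grows with how informed the voter is. Neither scale comes from the study.

```python
from math import sqrt


def pearson(xs, ys):
    """Pearson correlation coefficient of two equal-length lists."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / sqrt(var_x * var_y)


informed = [1, 1, 2, 2, 3, 3]        # hypothetical "how informed" score
position = [-1, 1, -2, 2, -3, 3]     # hypothetical left (-) / right (+) position
polarization = [abs(p) for p in position]

print(f"informed vs. position:     {pearson(informed, position):.3f}")
print(f"informed vs. polarization: {pearson(informed, polarization):.3f}")
```

Being informed tells you nothing about which side a voter is on, but everything (in this toy data) about how far from the center they sit.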
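The effect can be checked by counting. This sketch assumes a population of 1000, that the two conditions are independent, and that the hospital contains exactly the people with at least one of them; only the 30% and 40% rates come from the text.

```python
population = 1000
with_diabetes = population * 30 // 100        # 300 people
with_heart_disease = population * 40 // 100   # 400 people

# Independence: the overlap is 30% of 40% of the population.
both = with_diabetes * with_heart_disease // population       # 120
diabetes_only = with_diabetes - both                          # 180
heart_only = with_heart_disease - both                        # 280
neither = population - both - diabetes_only - heart_only      # 420


def heart_disease_rates(include_healthy):
    """Return (P(heart disease | diabetes), P(heart disease | no diabetes))."""
    no_diabetes = heart_only + (neither if include_healthy else 0)
    return both / with_diabetes, heart_only / no_diabetes


everyone = heart_disease_rates(include_healthy=True)
hospital = heart_disease_rates(include_healthy=False)

print(f"whole population: {everyone[0]:.0%} with diabetes, {everyone[1]:.0%} without")
print(f"hospital only:    {hospital[0]:.0%} with diabetes, {hospital[1]:.0%} without")
```

In the whole population the heart disease rate is the same with or without diabetes. Inside the hospital, every patient without diabetes must have heart disease (otherwise they would not be there), so diabetes appears to protect against it.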
Polling rarely works when there are more than two options. Anything short of a simple majority in favor of one outcome produces an incoherent overall result. When polled, most Americans prefer smaller government to higher taxes, yet when asked which social programs to cut, there is no consensus. The most extreme form of this is the Condorcet paradox, where among three candidates the electorate prefers A over B, B over C, and C over A. Some systems, like Australia’s instant run-off voting, try to solve this by asking for a ranked list of preferences, but such a system can lead to a loss for a centrist candidate who is rarely anyone’s first preference. This is what happened in the 2009 Burlington, Vermont election, which used instant run-off voting.
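The Condorcet paradox needs only three voters to appear. The three ranked ballots below are the textbook minimal case, chosen for illustration; each tuple lists one voter's candidates from most to least preferred.

```python
from itertools import combinations

# One ranked ballot per voter, most preferred candidate first.
ballots = [("A", "B", "C"), ("B", "C", "A"), ("C", "A", "B")]


def majority_winner(x, y):
    """Return whichever of x and y is ranked higher on more ballots."""
    x_votes = sum(1 for b in ballots if b.index(x) < b.index(y))
    return x if x_votes > len(ballots) - x_votes else y


for x, y in combinations("ABC", 2):
    print(f"{x} vs {y}: majority prefers {majority_winner(x, y)}")
```

Each voter's ranking is perfectly consistent, but the majority prefers A to B, B to C, and C to A, so there is no candidate the group as a whole prefers.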
How a new choice impacts our behavior
When we are choosing between two options, A and B, a new, less preferred option C can tilt our preference between A and B. This is known as the asymmetric domination effect. The effect exists not just in humans but has been observed even in animals. (ashishb’s note: The best example of this, not mentioned in the book, is the web-only pricing of Economist being really as web + print pricing of Economist)