Coins, Rain, and a COVID-19 Drug

Oct 21, 2021

It's hard to express statistical concepts in plain English. For instance, compare how the word "chance" is used in the following sentences:

1. When you toss a coin, there's a 50% chance of it landing heads up.

2. In Dallas tomorrow, between 6 and 10 a.m., there's a 50% chance of rain.

3. Molnupiravir reduces the chance of COVID-19 hospitalization by 50%.

What you see here are three different meanings of the word "chance". Each one is useful, but often misunderstood.

The Coin Toss

When we study probability in school, we hear a lot about people tossing coins, rolling dice, drawing cards from a deck, and so on. From an instructional perspective, these examples are useful, because they assume a tidy little universe where the probability calculations are never wrong. You can be sure of these calculations, because you know everything necessary to make them. Since a coin only has two sides, the chances of tossing it once and obtaining heads are always exactly 50%.

Or are they? Persi Diaconis, a Stanford mathematician, has shown that a tossed coin has a slightly higher chance of landing on whichever side was facing up when tossed (about 51%; the physics of this is complicated). Also, a Lincoln penny spinning on a flat surface has a nearly 80% chance of landing tails up, because the side with Lincoln's head is heavier. (Diaconis demonstrated this with pre-2010 Lincoln pennies that have the Lincoln Memorial on the back. I'm not sure what we'd find with newer pennies that have the Union Shield design.)

Diaconis's work demonstrates that once you step out of the classroom, probability calculations lose some of their accuracy, because the world is messy, and you almost never have all the information you need. Even for something as apparently simple as a coin toss, lots of real-world variables shift the chances of heads or tails slightly away from 50%. (These variables include the side facing up when tossed, differences in the natural weight of each side, differences in the amount of dirt accumulated on each side, the presence of beveled edges, and even methods of flipping. Diaconis, a former magician, has taught himself how to flip an ordinary coin and obtain heads or tails at a rate exceeding 80%.)
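To get a feel for how much biases of this size matter over repeated trials, here is a minimal Python sketch. It assumes every toss or spin is independent and has the same fixed probability, which is itself a tidy-universe simplification, and the function name prob_majority is my own invention. It computes the exact probability that the favored side comes up in more than half of a series of trials.

```python
from math import comb

def prob_majority(p: float, n: int) -> float:
    """Probability that an outcome with per-trial probability p occurs
    in more than half of n independent trials (simple binomial model)."""
    return sum(comb(n, k) * p**k * (1 - p)**(n - k)
               for k in range(n // 2 + 1, n + 1))

# A fair coin, the ~51% same-side bias, and the ~80% spinning penny
for p in (0.50, 0.51, 0.80):
    print(f"p = {p:.2f}: P(majority in 40 trials) = {prob_majority(p, 40):.4f}")
```

Under these assumptions, a 51% bias barely moves the result away from the fair-coin figure over 40 trials, while an 80% bias makes a majority very nearly certain.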

Here's the moral of this little tale: Probability stats grounded in the tidy universe of coin tosses have transformed our lives through their contributions to STEM, epidemiology, economics, education, etc., but we sometimes forget that they don't sufficiently reflect real-world conditions. Predicting coin toss outcomes is one of the simplest (and least consequential) illustrations of this limitation.

Forecasting Rain

When we talk about the outcome of a coin toss, the meaning of "chance" is clear; we're only led astray by the lack of attention to real-world conditions. In the case of weather forecasts, the opposite problem arises: The forecast is closely tied to real-world conditions, but people often misunderstand what "chance" means.

Consider again Sentence 2: "In Dallas tomorrow, between 6 and 10 a.m., there's a 50% chance of rain." What does that mean exactly? Should you bring an umbrella?

The first question is easy. Sentence 2 indicates a 50% chance of rain somewhere in Dallas between 6 and 10 tomorrow morning. ("Rain" means at least 1/100th of an inch of precipitation accumulating on a level surface.) In other words, Sentence 2 doesn't mean rain is expected 50% of the time, or that rain is expected in 50% of the city. It means there's a 50-50 chance that somewhere in the city, for some length of time, there will be rain. Maybe a light drizzle for a few minutes in one neighborhood. Or a steady downpour across the entire city.

As for whether you should bring an umbrella, we need to look at where that 50% stat came from. Briefly, the forecaster relied on complex atmospheric models to generate numbers such as the extent of confidence there will be rain somewhere in Dallas between 6 and 10 a.m., and the expected proportion of Dallas that will receive rain during this time. These two numbers are multiplied together in the process of generating the forecast. For example, the forecaster might've been 100% certain that 50% of Dallas will experience rain tomorrow. Converting these numbers to proportions, we get 1 x .5 = .5. In other words, a 50% chance of rain.

This is a crude approximation of the actual probability calculations, but it's enough to illustrate that the umbrella question is tricky, because the same forecast could mean different things. For instance, a 50% chance of rain will be predicted under each of the following conditions:

A. 100% certainty that it will rain in 50% of the city.

B. 70% certainty that it will rain in 71.4% of the city.

C. 50% certainty that it will rain in 100% of the city.
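The multiplication behind these three scenarios can be sketched in a few lines of Python. This follows only the crude approximation described above (confidence times the proportion of the city expected to get rain), not a real forecasting model, and the function name chance_of_rain is hypothetical.

```python
def chance_of_rain(confidence: float, coverage: float) -> float:
    """Simplified forecast probability: confidence that rain occurs
    somewhere in the area, times the fraction of the area expected to
    receive rain. Both inputs are proportions between 0 and 1."""
    if not (0.0 <= confidence <= 1.0 and 0.0 <= coverage <= 1.0):
        raise ValueError("confidence and coverage must be between 0 and 1")
    return confidence * coverage

# Scenarios A, B, and C: (confidence, proportion of the city)
scenarios = {"A": (1.00, 0.50), "B": (0.70, 0.714), "C": (0.50, 1.00)}
for label, (confidence, coverage) in scenarios.items():
    pct = 100 * chance_of_rain(confidence, coverage)
    print(f"Scenario {label}: {pct:.0f}% chance of rain")
```

All three scenarios come out at 50% after rounding, even though they describe quite different mornings.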

What can you do with this kind of information? Well, you know at least that the chance of rain identifies the minimum percentage of the city that is expected to get rain if it rains at all (as in scenario A, where the forecaster is completely certain). In other words, a 50% chance of rain means that if rain does arrive, 50% or more of the city should experience it. I suppose that's useful information, though it doesn't tell you how much rain or for how long. Again, you could get a momentary drizzle, or an ongoing torrent.

You can also see that the more extreme the chances of rain (either high or low values), the more useful the forecast. For example, I just read that tomorrow (October 22) there's a 3% chance of rain in Dallas between 8 and 10 p.m. If you happen to live in Dallas, I'll bet you a dollar you won't experience rain in your neighborhood during that time period. I say that because the forecast reflects anything from complete certainty that 3% of Dallas will experience rain, to 3% certainty that it will rain everywhere in Dallas. Either way, your "chance" of getting rained on seems low.

So, it should be clear now that the "chance" of rain is radically different from the "chance" of tossing a coin and getting heads. The chance of rain reflects both confidence in a prediction and the scope of that prediction. There's no sound bite or metaphor that captures this. If you want to know exactly what a certain chance of rain means, you have to spell it out.

Statistical modeling and ever more powerful computers have contributed to the increasing accuracy of weather forecasts in recent decades. Although there's a venerable tradition of bitching about weather reports that get it wrong, studies show a high degree of accuracy, particularly when forecasts cover relatively short periods (e.g., 24 hours - see here and here).

Studies also show that lay people misunderstand what "chance" means in statements like "there's a 20% chance of rain tomorrow." Some people think it means that 20% of the target area will receive rain, while others think it means rain 20% of the time (or, similarly, that they have a 20% chance of getting rained on each time they step outside). These misconceptions treat the "chance" of rain as something similar to the "chance" of a particular outcome when flipping a coin.

Why do people get it wrong? Probably because nobody taught them how chance is defined in weather forecasts. But I think another factor reinforces these misconceptions: The coin-toss type of probability is what we learn first in school, and it remains the simplest way to think about probability stats. Thus, when we hear words like "chance", we're biased to interpret them in this simple way. Weather forecasts show us that what's meant by "chance" may be much more complicated.

A New COVID-19 Drug

At the beginning of this month, Merck reported that its experimental antiviral drug, Molnupiravir, reduces the chances of hospitalization among COVID-19 patients by roughly 50%. In their Phase III trial, now under review by the FDA, 775 people with mild or moderate COVID-19 symptoms were randomly assigned to take Molnupiravir or a placebo for 5 days. One of the main findings was that compared to the Molnupiravir group, almost twice as many placebo group members ended up hospitalized. Thus, both Merck and media reports have noted that Molnupiravir reduces the "risk", or "chance" of COVID-19 hospitalization by about 50%.

You can probably tell that this 50% stat is very different from a 50% chance of rain, or a 50% chance of flipping a coin and getting heads. A key distinction is that Merck's 50% is a relative value (specifically, it compares the hospitalization rate in the Molnupiravir group to the hospitalization rate in the placebo group). Regardless of the absolute numbers of COVID-19 patients who become hospitalized, those who take Molnupiravir are half as likely to be hospitalized as those who don't take the drug.

Relative values are useful, insofar as they stay constant when absolute values change. At the same time, people sometimes misunderstand them. Some people are already claiming that the absolute difference in rate of hospitalizations between COVID-19 patients who do vs. don't take Molnupiravir is 50%. This is not correct. The absolute difference is actually about 7 percentage points. That is, 14.1% of Merck's placebo group were eventually hospitalized, versus 7.3% of the Molnupiravir group. This absolute difference of roughly 7 percentage points between groups corresponds to a relative reduction of about 50% (because 7.3% is about half of 14.1%).
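Here is the arithmetic as a short Python sketch, using only the two hospitalization rates quoted above. The function name risk_reductions is my own, and the calculation is the plain textbook one, not anything taken from Merck's analysis.

```python
def risk_reductions(control_rate: float, treated_rate: float) -> tuple:
    """Return (absolute reduction, relative reduction) as proportions."""
    absolute = control_rate - treated_rate   # difference between the two rates
    relative = absolute / control_rate       # that difference as a share of the control rate
    return absolute, relative

# Rates reported for the trial: placebo 14.1%, Molnupiravir 7.3%
absolute, relative = risk_reductions(0.141, 0.073)
print(f"Absolute reduction: {100 * absolute:.1f} percentage points")
print(f"Relative reduction: {100 * relative:.0f}%")
```

With these inputs the absolute reduction is 6.8 percentage points and the relative reduction is about 48%, which is where the "roughly 50%" headline comes from.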

To summarize: We talk about "chance" in the context of events like coin tosses, weather forecasts, and group differences in hospitalizations, but the term means something different in each case. And, in each case, misconceptions arise from a different source. For coin tosses, the definition of "chance" is clear, but we may not realize that the formulae ignore real-world factors. For weather forecasts, we may not understand how "chance" is defined in the first place. For group differences in hospitalizations, "chance" is understandable but we may fail to properly distinguish absolute versus relative differences.

Thanks for reading!

Appendix: Two practical applications of this newsletter

1. If you want to earn some money, go to a bar, ask someone for a penny, make sure it's clean and was minted before 2010, and then bet that if you spin this penny 40 times, it will land tails up more often than heads up. Make sure to spin it thoroughly. You'll win your bet. Feel free to send me half of your earnings.

2. If you get a mild case of COVID-19 (I hope you don't!), think carefully before spending $700 on a full course of Molnupiravir. I do recommend you take the drug. But here's why you should take a moment to think it over:

Every participant in Merck's trial had at least one risk factor for severe COVID-19 (heart disease, diabetes, age > 60, etc.). If you have one of these risk factors, the chances of being hospitalized for COVID-19 are 14.1%, according to Merck's data, and diminish to 7.3% if you take Molnupiravir. Based on these stats, it does seem prudent for at-risk people to take the drug. However, if you don't have any risk factors, the chances of being hospitalized for COVID-19 are much less than 14%. Exactly how much less is impossible to know, because some people with COVID-19 are asymptomatic or have mild symptoms and don't get tested, but available evidence suggests that a 1% hospitalization rate would be a conservative estimate. If Molnupiravir works the same way for everyone, regardless of whether risk factors are present or not, then taking the drug would reduce that 1% chance of hospitalization to 0.5%. Is a reduction of half a percentage point worth $700?
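The comparison between the two baselines can be sketched in Python. This assumes, as the paragraph above does, that the drug halves the risk of hospitalization whatever the starting risk; that is an assumption rather than a trial finding, and the function name risk_with_drug is hypothetical.

```python
def risk_with_drug(baseline_risk: float, relative_reduction: float = 0.5) -> float:
    """Risk of hospitalization after treatment, assuming the same relative
    reduction applies at every baseline risk (an assumption, not a finding)."""
    return baseline_risk * (1 - relative_reduction)

# Baselines: 14.1% (trial placebo group) and 1% (estimate for people without risk factors)
for baseline in (0.141, 0.01):
    treated = risk_with_drug(baseline)
    saved = baseline - treated
    print(f"Baseline {100 * baseline:.1f}% -> {100 * treated:.2f}% with the drug "
          f"({100 * saved:.2f} percentage points lower)")
```

The same 50% relative reduction buys about 7 percentage points for someone at the trial's baseline risk, but only half a percentage point for someone starting at 1%.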

Statisfied
