Anniversary Edition

May 19, 2022

Today is the first anniversary of this newsletter, so let me start by thanking you, dear reader, for your support throughout the year. In the Appendix, I describe my future plans for Statisfied, along with some exciting new projects.

This week, instead of focusing on just one topic, I briefly discuss three health-related items that made the news. I've chosen these items because they illustrate statistical themes I've touched on throughout the year, as well as some I'll get to in the coming months.

Mortality statistics

This week, the CDC announced that more than 107,600 Americans died of a drug overdose in 2021, a 15% increase over the previous year, and the highest annual number ever recorded. The majority of the fatalities were caused by fentanyl (71,238), followed by methamphetamine, cocaine, and prescription pain medications.

The CDC also announced this week that America reached – and exceeded – the tragic milestone of 1 million COVID-19 deaths.

We react to statistics like these in many ways. We mourn the victims. We malign drug companies and anti-vaxxers. We question the choices people make. But however we feel, we take the availability of the statistics for granted. That is, the actual numbers of drug overdoses or COVID-19 victims may be shocking, but the fact that the stats are reported doesn't seem noteworthy. And yet, population-level statistics as precise as these are relatively new, historically speaking. Technological developments such as computers may facilitate data collection, but we could not know what we know without statistical techniques developed during the 19th and 20th centuries.

In short, stats give us unprecedented quantitative knowledge about our fellow citizens. We know more about ourselves than humans have ever known. And what we learn from stats goes beyond mere description. For example, stats help illuminate the causes and consequences of the opioid epidemic. Take West Virginia, which continues to lead the nation in opioid mortality rates. Studies have linked opioid use in West Virginia to its economy (it's among the four U.S. states with the highest poverty rates and lowest median annual incomes), to shifting unemployment rates, and to a relatively high rate of prescription opioid distribution. The economic impact of opioid abuse in the state has even been calculated to the dollar ($7,247 per capita last year – surely false precision, but still informative). What we know about West Virginia's opioid problem represents a level of detail that's unmatched in, say, 18th-century accounts of alcoholism among European peasants and the impact on agricultural production.

One topic I haven't covered in these newsletters yet is the downside of population-level statistics. Governments have used, and continue to use, statistical data as a tool for social control. Some examples of this, like the use of census data to determine excessively burdensome taxation and conscription practices, are obvious. A famous example is China's one-child/one-family policy, implemented in 1980 owing to statistical evidence of overpopulation, then abandoned in 2015 owing to statistical evidence that the policy worked too well (i.e., the birth rate became too low to sustain the future economy). Statistics also contributes to governmental control in less obvious ways. I'm especially intrigued by facial-recognition algorithms, which rely on the application of several kinds of statistical procedures. (More on this in an upcoming newsletter.)

In sum, population-level statistics are a double-edged sword. They support public health, for instance, but they can also be used by governments to assist with social control.

Food and your brain

This week I read about a study linking the consumption of pro-inflammatory foods to smaller brain volume and other markers of dementia.

To put it crudely, the study showed that eating the wrong foods shrinks your brain. (Does that sound scary? Read on.)

This study, published in the journal Alzheimer's & Dementia, looked at the self-reported dietary habits and brain health of 1,897 adults (mean age 62 years). The focus was not on foods per se but rather on 31 components of foods. Pro-inflammatory components included carbohydrates, cholesterol, iron, protein, saturated fat, and vitamin B12. Anti-inflammatory components included alcohol, caffeine, fiber, garlic, monounsaturated fat, tea, and numerous vitamins. Participants described their eating habits on surveys administered several times over the course of a decade, then underwent MRI scans.

After statistically controlling for a number of key variables (body size, health, gender, etc.), the researchers found that by the end of the decade, greater pro-inflammatory consumption was associated with smaller total gray matter volume. (Yikes.)

Fortunately, the study is so weak, methodologically speaking, that I don't see any cause for alarm. Here are some of the more obvious limitations:

1. Flawed measurement.

Self-reported dietary habits are notoriously imprecise. People don't tend to recall very specifically what they ate or how much they consumed. In this particular study, people were asked to rate how often they typically eat certain foods on a scale ranging from never to < 1 time per day to > 6 times per day. You can imagine how imprecise this is when people characterize their eating habits over a period of years. (Ask yourself: How many times per day have you eaten bread on a typical day since 2019? How about beans? Etc.) The quantity of each serving wasn't considered, and the researchers had to estimate consumption of food components (protein, fats, cholesterol, vitamins, etc.) based on descriptions of foods consumed (e.g., bread). And some food components that affect inflammation could not be studied (e.g., turmeric, a known anti-inflammatory). In short, the approach to measurement was hopelessly flawed. This is exacerbated by a second major problem:

2. Tiny effects.

Associations between consumption of pro-inflammatory foods and brain volume were tiny – so tiny as to be meaningless given the measurement flaws described above. The sample was large, and a lot of analyses were run, and so any significant findings are most likely coincidental. (Fish around long enough and you may indeed catch something.)

3. Limited generalizability.

The study consisted of exclusively white participants, which of course limits the generalizability of the findings.

You might be wondering how it's possible, in these woke times, for a study with an exclusively white sample to be published in a reputable journal. This is because the participants are the offspring of the original Framingham Heart Study cohort, a multigenerational study that's been ongoing since 1948. Although newer cohorts sampled from the city of Framingham are racially diverse, the original participants were white, and their descendants will be white too (except when interracial marriages occur). In short, racial bias in 1948 led to a statistically biased sample in 2022. (Biased samples are, by definition, limited in generalizability.)
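Returning for a moment to the "fishing" problem under limitation 2, here's a rough Python sketch of why running many analyses makes a coincidental finding likely. It is not a reconstruction of the study's analysis: the 0.05 significance threshold, the test counts, and the function name `familywise_error_rate` are all assumptions chosen for illustration, and the formula assumes the tests are independent and that no real effect exists.

```python
def familywise_error_rate(num_tests: int, alpha: float = 0.05) -> float:
    """Chance of at least one false positive across `num_tests` independent tests.

    Assumes every null hypothesis is true (no real effects) and that each test
    uses the same significance threshold `alpha`. Both are simplifying assumptions.
    """
    # Each test avoids a false positive with probability (1 - alpha);
    # all of them avoid one with probability (1 - alpha) ** num_tests.
    return 1.0 - (1.0 - alpha) ** num_tests


if __name__ == "__main__":
    # Hypothetical numbers of analyses, not taken from the study.
    for num_tests in (1, 5, 20, 50):
        rate = familywise_error_rate(num_tests)
        print(f"{num_tests:>2} tests: {rate:.0%} chance of at least one spurious 'finding'")
```

Under these assumptions, 20 independent tests already carry roughly a 64% chance of at least one "significant" result that is pure coincidence.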

In sum, I don't believe the results of this study are informative. It's well-established that brain health is influenced by cardiovascular health, so if you don't want your brain to "shrink" or have other problems, I would recommend, along with physical and mental activity, a heart-healthy diet (see here, for example).

COVID-19 risk

I'm a great admirer of Anthony Fauci – in my opinion, he's one of the great heroes of the pandemic. At the same time, given that the pandemic is so complex, politicized, and rapidly changing, it's inevitable that Dr. Fauci will, on occasion, make a public statement that's less-than-ideally worded. (I'm not being polite here. Dr. Fauci makes a lot of public statements; it's inconceivable that they could all be perfectly phrased. Even Michael Jordan missed a shot now and then.)

This week, I read about fallout that medical professionals have experienced following statements that Dr. Fauci recently made about COVID-19 risk to ABC's This Week. Here are some key excerpts of what he said:

"...each individual is going to have to make their calculation of the amount of risk that they want to take in going to indoor dinners and in going to functions... This [COVID-19] is not going to be eradicated, and it’s not going to be eliminated... So you’re going to make a question and an answer for yourself. … What is my age? What is my status? Do I have people at home who are vulnerable that if I bring the virus home, there may be a problem?... Again, each individual will have to take their own determination of risk."

Informally speaking, what Dr. Fauci is saying is perfectly sensible. However, his reference to individual determination or "calculation" of risk exacerbated an existing problem: physicians, nurses, epidemiologists, and wellness professionals are deluged with questions about the risk of engaging in such-and-such an activity – a track meet, a birthday brunch, a dental appointment, etc. Apparently, health care professionals experienced an uptick in questions like these following the good doctor's remarks.

The problem with such questions is that you can't calculate, or determine, individual risk for each situation; you can only minimize risk.

The risk of some outcome occurring is the probability or "chances" that it will occur. If you flip a coin and bet on heads, the risk of getting tails is .5, or 50%. If we find that 84% of Americans caught cold at least once between 2014 and 2019, then the probability of any one randomly chosen American catching cold during that time period was 84%.
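For readers who like to see the arithmetic, here's a minimal Python sketch of both examples. It treats risk as nothing more than the share of cases in which the outcome occurred. The function name `estimate_risk`, the seed, the number of simulated flips, and the made-up population of 10,000 people are assumptions for illustration only; the 84% figure is just the hypothetical number from the paragraph above, not real data.

```python
import random


def estimate_risk(outcomes):
    """Risk as a simple proportion: cases where the outcome occurred / all cases."""
    outcomes = list(outcomes)
    return sum(outcomes) / len(outcomes)


# Simple situation: a fair coin. True means the flip came up tails.
rng = random.Random(2022)  # fixed seed so the simulation is repeatable
flips = [rng.random() < 0.5 for _ in range(100_000)]
tails_risk = estimate_risk(flips)

# Population-level data: a hypothetical group of 10,000 people,
# 8,400 of whom caught a cold during the period in question.
population = [True] * 8_400 + [False] * 1_600
cold_risk = estimate_risk(population)

print(f"Estimated risk of tails: {tails_risk:.3f}")
print(f"Risk of a cold for a randomly chosen person: {cold_risk:.2f}")
```

In both cases the calculation only works because every outcome is counted and the situation is simple. The simulated coin lands close to .5, and the population figure is exactly the share of people who caught a cold.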

As these examples illustrate, risk is easy to calculate when situations are simple (e.g., a coin-toss) or when they are presented simplistically (e.g., population-level data on colds for a finite time-period). However, in many situations, individual risk is impossible to calculate. Suppose, for example, you're considering a trip to the grocery store, and you want to take Dr. Fauci's advice and determine your risk of contracting COVID-19. The actual risk is determined by your age and health, the number and types of vaccines you've had, the density of people in the grocery store, the number of shoppers who are infected, the number of infected shoppers who are wearing masks, the extent of ventilation in the store, how close you come to any infected people, and whether or not you've chosen to wear a mask (not to mention what type of mask it is if you do wear one). Information about each of these variables is needed to calculate risk, but much of that information is lacking.

In short, we can't calculate individual risk of COVID-19 infection. All we can do is to minimize it. (Make sure you're vaccinated and boosted. Get a second booster, if possible. Wear a mask. Avoid the busiest times at the store, if possible. Try not to stand too close to anyone.)

This anecdote illustrates one of the limits on what statistics can tell us. We can be sure that certain behaviors diminish risk, without knowing by how much, or how much risk existed in the first place.

It may feel a little scary that we can't calculate risk, but I think that realizing this can reduce the decision fatigue many of us have experienced during the pandemic. There's no point in stressing about whether to go shopping when there are so many unknowns. Just make a decision, and if you choose to go, be as safe as you can be.

Appendix: Statisfied? present and future

Current status

Statisfied? now has 293 subscribers, plus others who access the newsletter each week via Substack, as well as search engines such as Bing and DuckDuckGo. (Substack is not directly accessible via Google.)

Substack and comparable sites (e.g., Patreon) represent a growing trend among professional writers and other content creators. Journalists in particular have been drawn to Substack as print journalism languishes and online news organizations lose readership to social media sites where "news" is provided by aggregators or well-meaning (or not-so-well-meaning) citizens who aren't held to editorial standards.

A typical model on Substack is for the journalist to build a readership via free content, then charge users a small monthly fee. I will never do that. If you're reading this, I guarantee you free content for as long as you're interested. (If I eventually acquire thousands of readers, I may start charging new subscribers a few dollars per month. But not you. Substack has your e-mail address and the date you subscribed, and I will not charge anyone who signed up prior to 2023.)

Plans for the coming year

1. Grow my readership.

Writers are supposed to tell everyone about their newsletter – their Twitter followers, their dentist, random strangers at the bus stop, etc. I haven't done much of this yet. I don't really want to accost random strangers, but I do plan on other strategies for getting the word out.

2. Create a website.

The purpose of this newsletter is to comment on the use and misuse of statistics that I encounter each week, while touching on broader themes. On a website, I'd like to include other things:

(a) Content that isn't tied to the current news cycle (e.g., general guidance on when studies can be trusted, explanations of "significance" and other statistical concepts, etc.).

(b) Guidance for journalists. (Editorial policy at many news organizations is vague or missing with respect to reportage on research methods and statistics.)

(c) Guidance for anyone who encounters stats in their work (e.g., educators, clinicians, policy-makers, graduate students, etc.).

(d) Short blurbs on one useful application of statistics I encountered during the week, as well as one deceptive usage. (Sad to say, it's easy to find at least one misused or misinterpreted statistic in any given week.)

3. Work on my book.

Both the website and the book I'm planning are meant to illustrate how our lives have been transformed by the development of statistics over the past century. Besides describing this transformation, my goal is to help people negotiate statistical information without being intimidated, and to recognize statistical misuse. Right now I'm trying to decide on the level of coverage. At one end of the spectrum, the book could be a scholarly work tracing the rise of statistics and its application in STEM fields as well as its impact on public consciousness. At the other end of the spectrum, the book could be a simple, concrete, practical guide for the general public on making sense out of statistical data, with specific sections written for journalists, clinicians, graduate students, sports fans, and so on. Whatever I do, I want the book to be accessible to everyone.

Thank you again for reading!