Diet and Longevity
"Changing your diet could add up to 13 years to your life, study says"
I'm a sucker for headlines like this. I will read the story. Because I want to live longer. And because I have statistical questions. (13 years longer on average? Or did just one person live 13 years longer? What if I'm already 90 – can I get 13 more years?) This particular story made both national and local news this week and, in most cases, the journalists didn't answer my questions.
I thought a lot about the media coverage, thanks to a comment one of you made on last week's reader survey. (Survey results are given in Appendix A.) In response to a question about improving these newsletters, one of you wrote: "Talk about how we can help people present stats more accurately."
This comment reminded me of the need for better science journalism. As traditional journalism evolves, the number of journalists who specialize in scientific research continues to decline. Meanwhile, studies show that research findings are often distorted in news reports. Thus, there's an increasing need for research-focused guidance in journalism education and professional development.
In this newsletter I will do three things. I'll critique the live-13-years-longer study. I'll discuss how it was reported in the news media. And, I'll use these reports to illustrate one way science journalism can be improved.
The live-13-years-longer study
This study was published on February 8 in a prominent online journal (PLOS Medicine). The researchers are all faculty at the University of Bergen in Norway, which is ranked in the top 1% of universities worldwide.
The researchers began by identifying three kinds of diet: a Typical Western Diet, a Feasible Diet, and an Optimal Diet. Each diet was defined in terms of typical daily consumption of 14 different food types (whole grains, vegetables, fruits, nuts, legumes, fish, eggs, milk/dairy, refined grains, red meat, processed meat, white meat, sugar-sweetened beverages, and added plant oils). The definitions were highly specific. For example, the Typical Western Diet was defined as 200 grams of fruit per day, 95 grams of red meat, 434 grams of milk/dairy, and so on for each of the 14 food types.
For each food type, the Optimal Diet value is the cut-off point beyond which a change in daily intake of that food no longer affects longevity. For example, although studies have shown that people who eat more fruit live longer, there's no evidence that consuming more than 400 grams per day adds further benefits. So, in this study, the Optimal Diet included 400 grams of fruit (and 0 grams of red meat, and so on).
Recognizing that people don't always make optimal food choices, the researchers created their so-called Feasible Diet by calculating the midpoint between the typical and optimal values for each food type. Thus, for fruits, the value is 300 grams per day, while for red meat it's 47.5 grams per day, and so on.
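If you like to see the arithmetic spelled out, here's a minimal sketch in Python. The fruit and red meat values come from the study as described above; the other 12 food types would be handled the same way.

```python
# Feasible Diet values are simply midpoints between the Typical Western
# and Optimal values for each food type. Fruit and red meat values are
# from the study as described above; the remaining food types (not shown)
# would work the same way.

typical = {"fruits": 200.0, "red_meat": 95.0}  # grams/day, Typical Western Diet
optimal = {"fruits": 400.0, "red_meat": 0.0}   # grams/day, Optimal Diet

feasible = {food: (typical[food] + optimal[food]) / 2 for food in typical}

print(feasible)  # {'fruits': 300.0, 'red_meat': 47.5}
```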
Finally, the researchers reviewed the literature and estimated how much a change in consumption of each food type affects life expectancy (LE). Using these data, the researchers estimated differences in LE across the three diets overall, as well as how LE is impacted by changes in consumption of each of the 14 food types. The estimates that the researchers created were differentiated by age, gender, and nationality.
Now let's look back at that headline "Changing your diet could add up to 13 years to your life, study says." What the study actually said is this: If you're a 20-year-old American male who eats a Typical Western Diet and you suddenly switch to an Optimal Diet, and you stick with it for 10 years, your life expectancy will increase by an estimated 13 years. (For 20-year-old American women, the increased LE would be just over 10 years.) Expected benefits diminished with age, but even 80-year-olds of both genders and all nationalities were expected to gain an estimated 3.1 to 3.5 years of life by switching from Typical Western to Optimal eating.
Awesome, right?
Study limitations
I really wanted to love this study. The topic is important. The researchers were meticulous. They write well. And, they took the trouble to create estimates for a Feasible Diet, which acknowledges, wisely, that people don't always choose optimal eating habits.
Sadly, I found the stats too deeply flawed to draw any conclusions from the study.
1. Unreliable consumption data.
Daily consumption of different food types was gleaned from sources ranging from extensive literature reviews to highly specific empirical studies (e.g., "Per capita consumption of beans in the United States from 2012 to 2016, by type"). Here are two of the reasons these data are messy and unreliable:
(i) Most of the research relies on self-report surveys, but people aren't very accurate when describing how much they eat, and the surveys are inherently limited in precision. For instance, you can ask people how many cups of milk they drink in a typical week, but the stats you get are misleading, because cup sizes differ, people don't always pour a full cup or finish what they pour, and the milk itself varies in fat content.
(ii) Existing reviews and studies were consulted in order to determine "typical daily consumption" for each food type. This might be feasible if "typical" pertains to a few days or weeks, but most studies focus on multi-year consumption. The current study, for example, assumes a 10-year period, but people's diets often change over the course of a decade, and I can't imagine how to incorporate those changes into a sensible definition of "typical daily consumption". (Stats can always provide definitions, but that doesn't mean they'll be sensible. A person who eats two eggs a day for 10 years consumes the same total amount of egg as a person who eats four eggs a day for 5 years and then gives up eggs, but I wouldn't call their typical daily consumption the same. The more varied your diet, the more challenging it may be to quantify "typical".)
Just for argument's sake, let's ignore the concerns I've raised here and assume that the consumption data were accurate.
2. Unreliable consumption-longevity estimates.
The researchers reviewed prior work and estimated, for each age group, gender, and nationality, how much life expectancy is affected by eating more or less of each of the 14 food types. Unfortunately, even if the estimates of quantities are accurate, other health-related aspects of consumption are misrepresented in the prior work, in the researchers' treatment of it, or both:
(a) The researchers didn't consider differences in body weight. For example, 90 grams of red meat was treated the same whether the person who eats it weighs 110 pounds or 220 pounds.
(b) The researchers collapsed foods into broad categories – e.g., "fruits", "vegetables", "legumes", and so on. Thus, 80 grams of avocado was treated the same as 80 grams of watermelon or banana, 50 grams of iceberg lettuce was treated the same as 50 grams of broccoli or kale, and so on. (Method of preparation was ignored too, so 50 grams of raw kale was treated the same as 50 grams that had been boiled for two hours.)
(c) The researchers looked at food categories but not food constituents such as salts, fats, and cholesterols. Thus, 90 grams of lean, lightly seasoned steak was treated the same as the 90 grams of fatty, salty beef in a Big Mac, and 40 grams of whole milk was treated the same as 40 grams of non-fat milk. Etc.
(d) To generate consumption-longevity estimates, the researchers examined sources ranging from meta-analyses to national databases to specific studies. Because many of these sources measured key variables in their own distinctive way, combining sources inevitably creates error. For example, in research on links between red meat consumption and health, some studies control for alcohol and tobacco consumption while others don't, some studies control for body mass index (BMI) while others don't, some studies control for overall fat and cholesterol consumption while others don't, and so on.
In sum, I don't think we can trust anything the researchers reported about the impact of diet on longevity.
3. Lack of model testing.
The whole point of this study was to create a model – i.e., a formal description of how variables are related to each other. Specifically, the researchers created a structural model that spits out estimates of life expectancy based on any data you enter for any of its variables. If you currently eat a "feasible" amount of legumes, the model can estimate how much longer you'll live if you start eating an "optimal" amount of legumes, and it will provide a different estimate depending on your age, your gender, and your nationality. Same for each of the other 13 food types.
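To make the idea of a model concrete, here's a toy version in Python. The per-gram effects, caps, and food types below are all invented for illustration; the study's actual estimates vary by age, gender, and nationality, and rest on the flawed inputs described above.

```python
# Toy sketch of a consumption-longevity model: each food type gets a
# (made-up) estimate of years of life gained per extra gram/day consumed,
# capped at the Optimal Diet value. All coefficients here are hypothetical;
# the study's real estimates differ by age, gender, and nationality.

YEARS_PER_GRAM = {"fruits": 0.004, "legumes": 0.008}  # hypothetical effects
OPTIMAL = {"fruits": 400.0, "legumes": 200.0}         # grams/day caps

def predicted_le_gain(current: dict, new: dict) -> float:
    """Estimated change in life expectancy from switching diets."""
    gain = 0.0
    for food, slope in YEARS_PER_GRAM.items():
        before = min(current.get(food, 0.0), OPTIMAL[food])
        after = min(new.get(food, 0.0), OPTIMAL[food])
        gain += slope * (after - before)
    return gain

# A "feasible" legume eater (100 g/day) moving to the "optimal" 200 g/day:
print(predicted_le_gain({"legumes": 100.0}, {"legumes": 200.0}))  # 0.8 years
```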
Models are only as good as the data used to construct them. Earlier, I argued that the researchers' consumption-longevity model is built on hopelessly inaccurate data. However, if the model had been tested, and it performed well, one might ignore, or at least downplay, my concerns. But there was no test.
What would such a test look like? One approach would be to track a sample of healthy young people over time, and observe how their diets are related to how long they live. That's called forecasting. Another approach, referred to as hindcasting, would involve looking at historical data on what people ate and how long they lived. As you can imagine, each approach presents unique challenges. A forecasting approach would require that you gather data from people over a period of decades. A hindcasting approach would require that you find people who kept extensive records on their daily eating habits before they died. So, no surprise that the researchers were unable to report a test of their model.
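Just to illustrate what a hindcast could look like in principle, here's a hypothetical sketch. The stub model, the cohort records, and every number below are invented; nothing like this test appears in the study (which is exactly my complaint).

```python
# Hypothetical sketch of a hindcast: compare the model's predicted lifespans
# against observed lifespans in a historical cohort with known diet records.
# The stub model and all data below are invented for illustration only.

def predicted_lifespan(fruit_g_per_day: float) -> float:
    """Stand-in for the study's model (hypothetical coefficients)."""
    return 72.0 + 0.02 * min(fruit_g_per_day, 400.0)

# Each record: average daily fruit intake (grams) and observed lifespan (years).
cohort = [(150.0, 74.0), (300.0, 79.5), (420.0, 81.0)]

errors = [abs(predicted_lifespan(fruit) - lived) for fruit, lived in cohort]
mean_abs_error = sum(errors) / len(errors)
print(f"Mean absolute error: {mean_abs_error:.1f} years")
```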
Media coverage
Looking back at the CNN headline that prompted this newsletter ("Changing your diet could add up to 13 years to your life, study says"), you can see now that it's accurate, but only if "could" is interpreted very, very charitably.
Most news organizations that ran the story used some version of this headline. Some got minor details wrong (e.g., "Changing your diet can add up to 10 years to your life expectancy, new study shows"). More importantly, some headlines failed to include that "could" (e.g., "Eating more of this adds 13 years of your life, new study says"), which wrongly implies that the study was empirical. The researchers estimated how much certain dietary changes could change life expectancy; they didn't observe anything.
CNN did a fairly good job of summarizing the study:
"The study created a model of what might happen to a man or woman's longevity if they replaced a "typical Western diet" focused on red meat and processed foods with an "optimized diet" focused on eating less red and processed meat and more fruits and vegetables, legumes, whole grains and nuts...If a woman began eating optimally at age 20, she could increase her lifespan by just over 10 years, according to the study... A man eating the healthier diet from age 20 could add 13 years to his life."
One small but interesting mistake in CNN's summary is that although the researchers considered red and processed meats, these two food types weren't the focus of the Typical Western Diet. They were merely two of the 14 food types under study, and neither received greater weight or attention than any other type. I assume CNN’s reportage was influenced by recent, highly publicized studies documenting the health risks of excessive consumption of red and processed meats.
Apart from this small mistake, CNN's summary is accurate, albeit incomplete. Other news organizations did fairly well too (although shorter articles seemed to present more distorted coverage). But none of these reports included a clear explanation of what a "model" is, and none of them, in my view, adequately warned readers about the limitations of the data. My concern is that two weeks from now, lay readers will remember "that study where people who ate healthy lived 13 years longer."
One could argue that the stakes are low here, because adjusting your diet to more closely resemble the optimal one probably won't hurt you. I agree, but why create pressure for people to do that? Given the flaws in the researchers' model, why tell people they'll live longer if they eat 200 grams of legumes per day as opposed to only 100? (If you take this study seriously enough to weigh your legumes on a kitchen scale, any gain in longevity might be offset by an increase in your stress levels.) In any case, there are many examples of journalists getting it wrong when much more is at stake (see here and here).
Improving science journalism: A concrete example
Most news organizations and professional societies for journalists have accuracy-related standards that apply to any nonfiction genre, including science writing. For example, the Code of Ethics for the Society of Professional Journalists advises journalists to "[p]rovide context. Take special care not to misrepresent or oversimplify in promoting, previewing or summarizing a story." Or, as my local PBS station puts it, "try to present the significant facts a viewer would need to understand what he or she is seeing".
A few organizations have standards specific to research or statistics. For example, the BBC's editorial guidelines include the following:
"We should reserve the same scepticism for statistics as we do for facts or quotes and not necessarily take numbers at face value... [We] should explain the numbers clearly, put them into context, weigh, interpret and, where appropriate, challenge them... The statistics must be accurate and verified where necessary, with important caveats and limitations explained. We should use a range of evidence to put statistical claims into context and help audiences to judge their magnitude and importance. Where claims are wrong or misleading, they should be challenged."
This is splendid, although the best journalists tend to do these things anyway on the basis of general principles of good journalism (e.g., "try to present the significant facts a viewer would need to understand what he or she is seeing"). At the same time, journalists sometimes miss, or misrepresent, key details and thereby mislead the public. This newsletter presents a case where the writers (a) didn't acknowledge, or clearly explain, that the data were modeled rather than observed, and (b) didn't note that the model was built from extremely flawed data.
What would help in this case? Well, in journalists' training and professional development, and perhaps even in editorial standards themselves, a standard like the following might be useful:
"Indicate when findings are merely predicted rather than observed. Evaluate the basis of the prediction. Note whether the prediction was tested."
Apart from studies on diet and longevity, how could a standard like this help? Here are some examples:
1. More relevant coverage.
Climate change models have turned out to be successful at both hindcasting and forecasting, but this isn't always noted in stories concerning their dire predictions. I think it's vitally important to mention that these models have performed well in independent tests, and the standard I described above would remind journalists to do so.
2. Greater clarity.
Journalists have sometimes confused vaccine immunogenicity, efficacy, and effectiveness. Immunogenicity stats describe how a vaccine affects immune system functioning. Efficacy and effectiveness stats both tell you how much vaccines reduce adverse outcomes (e.g., infections) among folks who've been vaccinated versus those who haven't; the difference is that efficacy is determined under controlled, ideal conditions, whereas effectiveness represents the vaccine's real-world performance. The standard I described would help remind journalists that immunogenicity stats only roughly predict efficacy and effectiveness for outcomes such as infections, and efficacy stats support only rough predictions about real-world performance.
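For efficacy and effectiveness, the underlying arithmetic is the same relative risk reduction; only the data source differs. Here's a minimal sketch with hypothetical trial numbers:

```python
# Standard efficacy/effectiveness calculation: the relative risk reduction
# among vaccinated vs. unvaccinated people. The formula is the same whether
# the data come from a controlled trial (efficacy) or from real-world
# observation (effectiveness). All numbers below are hypothetical.

def risk_reduction(cases_vax, n_vax, cases_unvax, n_unvax):
    attack_vax = cases_vax / n_vax          # attack rate, vaccinated group
    attack_unvax = cases_unvax / n_unvax    # attack rate, unvaccinated group
    return 1 - attack_vax / attack_unvax

# Hypothetical trial: 10 infections among 10,000 vaccinated,
# 100 among 10,000 unvaccinated -> 90% efficacy.
print(risk_reduction(10, 10_000, 100, 10_000))  # 0.9
```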
3. Less hyperbole.
Journalists want your attention, and these days they have to work hard for it. If the researchers themselves say that the right diet wins you 13 more years of life, or that eating one hot dog reduces your life by 35 minutes, then of course the journalist will want to echo those catchy phrases. Remembering that these are actually predictions from models, and then discovering the flawed assumptions underlying those models, might temper journalistic enthusiasm.
I will have more to say on this topic in a future newsletter. Enjoy your next meal (but don't weigh the legumes!).
Appendix A: Reader survey results
92 readers responded to my 5-question survey about Statisfied?. Below are the results and the changes I'll be making. (The number of people who chose each response option is given in parentheses next to the option.)
Survey question: Do you want to receive updates on topics covered in the newsletters?
Your responses: Yes (49). No (6). Only if I request an update (25). No response (12).
My response: Stats can be read in more than one way. Although more than half of those who responded want updates, it's also notable that 31 of 92 respondents (34%) wouldn't want unsolicited updates in their inbox. What I will do is start including brief updates as appendices to my regular newsletters, rather than e-mailing them separately. (In the future, as I continue to work on my book, I hope to create a website that will include updates, among other things.)
Survey question: How is the depth of coverage in the newsletters?
Your responses: About right (79). Not enough detail (0). Too much detail (7). No response (6).
My response: It's telling that 100% of those concerned about depth of coverage indicated too much detail rather than not enough. (This makes me wonder how that "about right" was interpreted. Maybe some folks who chose that option feel there's a bit too much detail at times, but overall it's "about" right.) I will continue to streamline the newsletters through concise writing and appendices.
Survey question: Do you want a discussion thread where readers can post and respond to each other?
Your responses: Yes (26). No (32). Unsure (34).
My response: Ack! That's a very evenly distributed set of responses. Although 26 of the respondents (28%) want a discussion thread, actual interest could be a lot higher or lower depending on what the "unsures" finally decide. I'm not sure yet what I'll do, but I'm leaning toward making discussion threads part of the website I'm planning.
The final two survey questions ("Are there particular topics you want to read about?" and "What would make the newsletters better for you?") were open-ended and yielded a small number of responses that didn't converge on any particular theme. The comments were helpful though and will inform future newsletters…