Poor Facebook. Yesterday, while the company was celebrating its first day as Meta, a congressional subcommittee spent over three hours scrutinizing why the company needs to be more accountable for the algorithms that influence what users see.
One of the witnesses testifying at this subcommittee hearing was ex-Facebook employee Frances Haugen, the whistleblower who leaked internal documents to the Wall Street Journal in September, and then to the SEC and Congress the following month. The WSJ's subsequent nine-part series, and several congressional hearings, have detailed how Facebook's algorithms promote political misinformation, hate speech, human trafficking, and other forms of social devastation.
This newsletter focuses specifically on allegations by Haugen, the WSJ, and others, that Facebook ignored its own internal research showing that Instagram harms teenage girls, an allegation that Haugen repeated at the hearing yesterday. I want to discuss exactly what Facebook's research shows, and whether the company's defense is credible. In the process, I will also touch on the broader theme of why caution is needed when translating statistics into plain English.
Context
On September 14, thanks to Haugen’s efforts, the WSJ ran a story entitled "Facebook knows Instagram is toxic for teen girls, company documents show". Facing enormous backlash, Facebook released multiple statements defending itself, and then, on September 29, the day before a congressional hearing on social media and mental health (and less than a week before Haugen was scheduled to testify), the company released the Instagram research reports Haugen had leaked.
These internal reports consist of two slide decks, one summarizing a large quantitative survey study, the other combining results from a small focus group study and survey. Almost all of the slides in these decks are accompanied by new annotations from Facebook elaborating on the content of the slides while responding to allegations from the WSJ and other critics. Facebook executives, particularly its head of research, have also responded separately to these allegations.
Here's the current state of the controversy:
Haugen, the WSJ, and others claim that Facebook intentionally ignored its own research showing that Instagram harms teenage girls. The statistic from this research most commonly cited is that about a third of teenage girls who already felt bad about their bodies reported that Instagram makes them feel worse.
Facebook claims that the WSJ and other critics have misrepresented the research. Viewed properly, the company says, the findings show that Instagram is mostly a good thing, and are inconclusive regarding teenage girls in general.
The translation problem
Before delving into the research, I want to mention that some of the controversy has arisen from the way both sides have been talking about the stats.
Statistics is a language, and as with any language, translation into plain English will be more or less successful. Bad translations often occur when verbs like "is" and "can" are used to summarize findings. Consider again WSJ’s original headline: "Facebook knows Instagram is toxic for teen girls, company documents show". Facebook's response, of course, has been that Instagram "is not" toxic.
Arguing whether Instagram "is" or "is not" toxic is vague and enables lengthy, unproductive discussion. Here's an analogy: Most over-the-counter oral medications are safe at low doses but toxic if you ingest enough of them. Thus, it's not helpful to argue whether Advil "is" or "is not" toxic. If experts disagree about the exact quantity at which it becomes toxic, fine, they can have a data-driven argument. But if the argument is just about whether or not Advil "is" toxic, progress will be impossible without the specifics, because there's evidence to support both views.
As for the verb "can", at a congressional hearing in March, Mark Zuckerberg commented that “the research that we've seen is that using social apps to connect with other people can have positive mental health benefits". Well, yes, of course social media "can" have benefits. So can jogging on the freeway in rush hour. But even though freeway jogging "can" benefit your cardiovascular health, nobody recommends it, because the risks clearly outweigh the benefits. Zuckerberg's comment is an empty statement unless you weigh the specific benefits and costs of social media use.
Frances Haugen, the news media, and various members of congress have been using the "is" verb a lot lately, saying, for example, that Instagram "is" harmful for teenage girls. I believe that their criticisms of Instagram would be more compelling if consistently framed in the more specific terms of what Facebook’s research showed. Let's have a look at that research now.
Instagram study methods: Study 1
Facebook's first slide deck describes an online survey study of 22,410 Instagram users in the US, Japan, Brazil, Indonesia, Turkey, and India.
The initial question of the survey asked whether respondents had experienced one or more of six problems in the last 30 days. The six problems were selected at random for each survey out of a pool of 23, including relatively serious problems like suicidal ideation, as well as problems that might be considered less serious, such as conflict with someone close to you, sleep disturbances, and anxiety.
Each time a respondent acknowledged experiencing a problem, the survey continued with "deep dive" follow-up questions about how the person felt, what impact Instagram had on the problem, how they wished Instagram could have helped, and so on. Facebook ultimately chose to present data on 12 of the 23 problems (with no clear explanation for why the other 11 were excluded).
As I mentioned, nearly every slide in the deck was annotated by Facebook before release to the public. The left side of the slide deck now has a clearly marked-off section where Facebook provides clarification, corrects errors, and defends Instagram from critics.
Facebook's first line of defense: We weren't really studying Instagram's impact
In the annotations to literally dozens of slides, Facebook argues that critics have misunderstood the purpose of the research. Facebook claims that its goal was to understand Instagram users' perspectives and to better support them, not to understand how Instagram affects users' mental health and well being. Here's a typical example:
"Contrary to how the objectives have been framed, this research was designed to understand user perceptions and not to provide measures of prevalence, statistical estimates for the correlation between Instagram and mental health or to evaluate causal claims between Instagram and health/well-being." (Slide 2.)
The last part of that statement almost made me spill my coffee. Researchers rarely summarize their own research this inaccurately. Facebook was certainly evaluating "causal claims" (i.e., claims of causal links) between Instagram and mental health/well-being. Here are just a few excerpts from their research that illustrate the point:
"Across 12 mental health and well-being issues, we tried to understand the reach, intensity, impact of Instagram and degree to which users wanted us to provide support." (Slide 2, subtitle.)
"Objectives: We wanted to understand Instagram's role (for better or worse) in the hard experiences in peoples' lives. How does Instagram magnify or reduce harm, pain and support during these moments?" (Slide 3, title, subtitle, first line.)
"But, we make body image issues worse for 1 in 3 teen girls." (Slide 14, title.)
"1 in 3 teen girls blame Instagram for making their body image issues and problematic social media use worse. Social comparison is a high reach, high intensity issue that 1 in 5 teens thought we made worse." (Slide 19, subtitle.)
Some of these excerpts show that Facebook was clearly trying to understand the causal impact of Instagram on its users. Others (e.g., "we make body image issues worse for 1 in 3 teen girls") show that they clearly recognized that Instagram has causal impacts.
Facebook's second line of defense: Our results aren't generalizable
Another line of defense, repeated multiple times by Facebook, is that their results don't generalize to all users of Instagram, or to all teenage girls. The following annotation is typical:
"The estimates that 30% of users felt that Instagram made problematic use worse and that approximately 30% of teen girls also felt that Instagram made dissatisfaction with their body worse can be clarified as only applying to the subset of survey takers who first reported experiencing an issue in the past 30 days and not all users or all teen girls." (Slide 2.)
It's true that survey takers only received the "deep dive" questions I mentioned earlier for problems they acknowledged experiencing. The 32% of teen girls who reported that Instagram exacerbated their body dissatisfaction was not 32% of the entire sample of teen girls, but rather 32% of the teen girls who were already experiencing body dissatisfaction. Thus, Facebook is correct, and the WSJ (and many others) were wrong to say that the data show that Instagram increases body dissatisfaction among 32% (or "about a third" or "approximately 30%") of teenage girls in general.
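The arithmetic behind that distinction is easy to see with a toy example. Here's a minimal Python sketch using hypothetical counts (the released decks don't give the exact denominators, so every number below is invented for illustration):

```python
# Hypothetical counts for illustration only: Facebook's released decks
# don't give exact denominators, so these numbers are invented.
surveyed = 1_000                 # teen girls who took the survey
affected = 550                   # assume ~55% acknowledged body dissatisfaction
said_worse = 176                 # 32% of the affected subset

# The WSJ-style (incorrect) reading: share of ALL teen girls surveyed.
print(f"{said_worse / surveyed:.0%} of all teen girls")       # 18%

# The correct reading: share of girls already experiencing the issue.
print(f"{said_worse / affected:.0%} of affected teen girls")  # 32%
```

The 32% figure is real; the toy example only shows that its denominator is the affected subset, not the whole sample, so the two readings can differ widely.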
All the same, it's callous and irresponsible that Facebook repeatedly corrects critics but never acknowledges a problem. The essence of Facebook's point here is this: Instagram doesn't hurt 32% of all teens; it hurts 32% of vulnerable teens. That's disturbing, particularly owing to the nature of the vulnerability. Studies routinely show that more than half of teenage girls are dissatisfied in some way with their bodies, and the incidence of Body Dysmorphic Disorder is roughly 2% among American women overall.
I find it not only troubling but surprising that in a politically sensitive era, one of the most profitable companies in the world doesn't even pretend to care that one of their products adversely affects about a third of a particular group of vulnerable teens.
Facebook's third line of defense: Instagram is mostly good for teen girls
Along with the annotations I've been discussing, Facebook's other public comments defending their research are pretty ugly too. Consider, for example, the following statement from Pratiti Raychoudhury, a vice president and head of research at Facebook:
"It is simply not accurate that this research demonstrates Instagram is “toxic” for teen girls. The research actually demonstrated that many teens we heard from feel that using Instagram helps them when they are struggling…. In fact, in 11 of 12 areas on the slide referenced by the Wall Street Journal — including serious areas like loneliness, anxiety, sadness and eating issues — more teenage girls who said they struggled with that issue also said that Instagram made those difficult times better rather than worse."
As discussed earlier, I agree with Ms. Raychoudhury that it's inaccurate to say that Instagram "is" (or "isn't") toxic for teen girls. I also agree with her that "many" teens find Instagram helpful, because "many" is a vague term. However, in the last sentence of the excerpt above, "more" only tells us that the girls who said Instagram made their pre-existing issues better outnumbered those who said it made them worse; it says nothing about how large that "worse" group actually is. Our concern, however, should be precisely those other girls, the ones who said that Instagram made their issues worse. In short, the usages of "many" and "more" are so vague here that they constitute mistranslation of the statistics.
Ms Raychoudhury continues:
"The one exception was body image... one in three of those teenage girls who told us they were experiencing body image issues reported that using Instagram made them feel worse — not one in three of all teenage girls. This is an important difference that is not explicit in the Journal’s reporting. And, among those same girls who said they were struggling with body image issues, 22% said that using Instagram made them feel better about their body image issues and 45.5% said that Instagram didn’t make it either better or worse (no impact)."
Here again I'm struck by the insensitivity of the remarks. Yes, the WSJ and others got it wrong. The one-in-three figure (32%) doesn't refer to teenage girls in general. But we should still be concerned that Instagram worsened body dissatisfaction among 32% of girls who were already experiencing issues.
Another way to read the statistics from this excerpt is that among teen girls who were dealing with body dissatisfaction, Instagram made 32% of them feel worse and only 22% of them feel better. That’s awful. And yet, in the slide deck, and in Ms. Raychoudhury's comments in multiple places, the emphasis is on the good that Instagram does for teen girls (and others). A whole lot of good doesn't necessarily outweigh the bad. In reference to an external study, she says:
"The research shows that 346 teens in the US said Instagram made them feel much or somewhat better about their life, while 137 said Instagram made them feel worse or somewhat worse about their life. [Thus] teens are more likely to perceive Instagram to have a more positive impact on how they feel about their lives than not."
Imagine if this logic were applied to a prescription medicine: "346 teens said that Prozac made them feel better, while 137 said that Prozac made them feel worse." You would be concerned, right? Although you could conclude that teens are more likely to view Prozac positively, you would also worry about the well-being of those 137 teens.
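To see why the favorable ratio alone is cold comfort, here's a quick Python sketch of those counts, assuming for illustration that the 346 and 137 figures are the only two groups being compared (the statement doesn't report a neutral count):

```python
better, worse = 346, 137          # counts quoted in Facebook's statement
total = better + worse

# The ratio does favor the positive side...
print(f"{better / total:.0%} said Instagram made them feel better")  # 72%
print(f"{worse / total:.0%} said it made them feel worse")           # 28%

# ...but a favorable ratio doesn't shrink the absolute number of teens
# reporting harm, which is the group the Prozac analogy asks about.
print(f"{worse} teens reported feeling worse")
```

A 72/28 split sounds reassuring until you remember that the 28% is not a rounding error but 137 actual teenagers.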
It's like Facebook is saying: When it comes to body image, Instagram doesn't hurt 32% of all teen girls. It only hurts 32% of the ones who are vulnerable. But that's OK, because the others are either helped (22%) or not affected at all (45.5%).
A deeper issue
So far I've argued that Facebook's rebuttals are both meritless and callous. The finding that Instagram exacerbates body dissatisfaction among 32% of teen girls should be cause for serious concern. And there's more.
Both scholars and casual observers have claimed that Instagram causes the very problems that teenagers seek relief from via Instagram. As Frances Haugen has noted, teen girls "develop these feedback cycles where they are using Instagram to self-soothe but then are exposed to more and more content that makes them hate themselves." Ms. Haugen, an expert in the algorithms that guide content to users based on their activity, commented during congressional appearances on how the algorithms result in increasingly extreme, potentially harmful content. (Even the girls are aware of it on some level: As noted in slide 35 of the second deck, "Teens called out ad targeting on Instagram as feeding insecurities, especially around weight and body image").
To the extent that such feedback cycles exist, Facebook's stats on the benefits of Instagram become less compelling. It's like saying that Prozac is among the causes of depression, but 22% of people who take Prozac report that it alleviates symptoms, and only 32% get worse. If that were the case, then people should clearly stop taking Prozac, because it causes depression, and it also makes depression worse in about one third of people.
I'm not suggesting that we should ban Instagram (or Prozac). I'm only saying that Facebook's defense of its research represents a callous misreading of its own statistics.
Instagram Studies 2a and 2b
The second slide deck released by Facebook combines (often without clear distinction) data from a qualitative focus group study of 40 teenagers (ages 13 through 17) in Los Angeles and London, and from an online survey of 2,503 teenagers of the same age range.
Here are a few slide titles highlighting negative impacts of Instagram:
"One in five teens say that Instagram makes them feel worse about themselves..." (Slide 21.)
"Teens blame Instagram for increases in the rates of anxiety and depression among teens." (Slide 24.)
"Teens who struggle with mental health say Instagram makes it worse." (Slide 25.)
"Teens called out ad targeting on Instagram as feeding insecurities, especially around weight and body image." (Slide 35, first bulleted item.)
Facebook's first line of defense: We couldn't study Instagram's impact, because we can't trust what our participants said
Facebook's annotations contain several versions of the following:
"All results are based entirely on the perceptions of participants and are not designed to evaluate causal claims between Instagram and health/well-being." (Slide 3.)
In other words, teens who blame Instagram for their problems might be wrong, thus our study isn't able to describe how Instagram impacts those problems. The reasoning is made more explicit in an annotation to Slide 25, whose title states that "teens who struggle with mental health say Instagram makes it worse." Facebook's annotation casts these as self-reported perceptions that may merely reflect lower life satisfaction among teens rather than a problem with Instagram.
I think Facebook is being especially mendacious here. Self-report is an established method both for identifying mental health issues and for better understanding their causes. It is so widely used because people tend to be pretty authoritative sources of information about themselves.
Of course, people are also fallible. A person might blame Instagram for problems actually caused by something or someone else. They might do so because they're embarrassed, or confused, about the true source of the problem. We don't assume, however, that this typifies all participants in a self-report study, just a few, and we take steps to minimize these possibilities (as Facebook did) by administering surveys anonymously, using clearly worded questions, and so on.
It's incredibly dismissive to suggest that because teens were reporting their own perceptions, nothing can be revealed about the impact Instagram has on them. The whole point of the research, according to Facebook, was "to understand user perceptions." Elsewhere Facebook is not shy about mentioning the good things teens say about Instagram. It's only the critical remarks that the company dismisses, while hiding behind claims about the limitations of self-report data.
Facebook's second line of defense: The results don't generalize
Because Facebook was interested in learning more about Instagram users, their annotations repeat more than a dozen times that the results are “not intended to be representative of the experience of all teens”. This is a very lawyerly phrase that conflates representativeness with generalizability. Imagine that 18% of teenagers who take a particular drug develop a rash. Of course that 18% statistic only represents the experience of teens who took the drug. One assumes that among teenagers who don't use the drug, the incidence of rashes is much lower. At the same time, one would expect that if more teenagers started taking the drug, roughly 18% of whoever uses it would develop rashes. That 18% statistic generalizes, in other words, and so do Facebook's findings.
Annotations to a section entitled "The categories of harm on Instagram" (Slide 28) go a step further. Facebook claims that data in this section is drawn from "a small set of personal interviews and is not generalizable to the Instagram user population". That's just not supportable, because Facebook states again and again that the whole point of their studies was to learn more about Instagram users. You can’t have it both ways. Either your study tells you something about the users or it doesn’t. You can’t question the generalizability of your findings only when they give you bad news.
Conclusions
Yesterday, Frances Haugen and the congresspeople she addressed were in agreement that Facebook has ignored the findings of its own research on Instagram. What they mean by "ignored" is that Facebook has refused to acknowledge the harm that Instagram causes to some users via the platform’s algorithms. However, thanks to the efforts of Ms. Haugen and others, Facebook has not ignored public outcry about the issue.
One thing Facebook has done, as I've shown in this newsletter, is to publicly misrepresent the focus and findings of its Instagram research. In addition, less than two weeks after the first WSJ article appeared, Facebook paused development of Instagram Kids, which they'd intended for 10-to-12-year-olds.
Adam Mosseri, the head of Instagram, officially announced the pause on September 27. It’s interesting that this announcement gives no clear reason for the pause. Here’s how Mr. Mosseri put it:
"We’ve decided to pause this project. This will give us time to work with parents, experts, policymakers and regulators, to listen to their concerns, and to demonstrate the value and importance of this project for younger teens online today."
That sounds reasonable…except that Instagram had already been working with stakeholders, listening to their concerns, etc. for over six months. What does it mean to pause in order to continue to do what you're already doing?
In my opinion, the project was "paused" owing to congressional hearings and bad press, including the WSJ series (which Mr. Mosseri references in his announcement). I assume Facebook's hope is that once the news cycle moves on, Instagram Kids will be more favorably received. But will it be safe for kids?
In her congressional testimony, Frances Haugen has repeatedly noted that Facebook won't change until its incentives change. I agree. Facebook never expected to defend its Instagram research. The company was only incentivized to do so by the prospect of lost revenue, and increasing legislative oversight, once key documents were leaked. The fact that Facebook's defense was mendacious, dismissive, and callous is merely a reflection of the character of company leadership. However, the weak logic of the defense doesn’t mean that they’re dummies, but rather that they were faced with an impossible task. Hopefully, people will remember the Instagram research, and keep the pressure on Facebook so that, at minimum, both Instagram and Instagram Kids can be a healthier part of the “metaverse”.
Thanks for reading!