AI Therapy: Part 2
New mental health treatments make an almost daily appearance in the news.
On Tuesday, for instance, the New York Times reported on ibogaine, a psychedelic drug that offers new hope for treatment of opioid addiction. Yesterday, MedPage Today mentioned a new study showing that psilocybin significantly reduces generalized anxiety symptoms. Today, CNN discussed a separate study showing that a single dose of LSD also provides lasting relief from anxiety.
As you can tell from these examples, psychedelics are getting a lot of attention, but they're not the only new game in town. In this newsletter my focus will be on AI therapy. I want to discuss three questions:
1. How is AI transforming mental health care?
2. Are AI chatbot therapists desirable?
3. Is AI chatbot therapy effective?
I'll be suggesting that AI is having a mostly positive effect on mental health care, and that AI chatbot therapists are desirable for some people under the right conditions. However, the data on their effectiveness is, for the moment, limited, and claims about that effectiveness are often overblown.
1. How is AI transforming mental health care?
AI has begun to support mental health care in several ways apart from literally providing therapy.
Instruction
Simulated patients allow med students and other budding health care professionals to practice intake and diagnostic skills. Some of the "patients" are designed to have mental health issues like depression or substance use disorder. The patients carry on conversations, and what they say mimics the naturalness and unpredictability of actual conversation.
The global market for simulated patients has already reached half a billion dollars. You can buy one through an eerily familiar process: Just go to one of the websites, choose the type of patient you want, and add the "item" to your cart. For Alex PLUS, currently $33,303.95, there's even a discount at checkout!
(One might ask: Why that extra 95 cents in Alex's price tag? More importantly, why are his skin tone options limited to "Medium" and "Light"? The screenshot above shows the "Medium" option. AI algorithms are already marred by several kinds of racial bias, and this is an example that could easily have been prevented.)
AI is also supporting mental health instruction in non-commercial ways. For instance, at Duke University's DukeLine, students provide informal mental health coaching to peers via anonymous online chat. Dr. Nancy Zucker, co-developer of this initiative, told me this week of plans for using AI to help train student coaches.
According to Dr. Zucker, this AI will be trained on prior exchanges between coaches and students, roughly comparable to the way simulated patients draw on a wealth of existing case records. AI may not be human, but it's quite good at extracting useful information from large datasets, and thus it's exceptionally promising as an instructional tool. (Not as an instructor, in my view, but as one of many tools available to human instructors.)
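For readers curious what "trained on prior exchanges" might look like in practice, here's one common recipe, sketched purely as illustration. The file format and field names below are my assumptions, not details of the Duke project: each anonymized coach-student exchange becomes a prompt/response pair that a model can learn from.

```python
# Illustrative only: converting anonymized coaching transcripts into
# supervised fine-tuning data. The format and field names are assumed,
# not taken from the DukeLine project.
import json

transcripts = [
    {"student": "I bombed my midterm and can't stop spiraling.",
     "coach": "That sounds really rough. What does the spiraling look like for you?"},
]

with open("train.jsonl", "w") as f:
    for exchange in transcripts:
        # Each exchange becomes one training example: the student's message
        # is the prompt, the experienced coach's reply is the target output.
        f.write(json.dumps({"prompt": exchange["student"],
                            "completion": exchange["coach"]}) + "\n")
```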
Screening and diagnosis
"AI can already diagnose depression better than a doctor and tell you which treatment is best."
This headline, from a Conversation article last year, is typical of what we see in the news lately about AI diagnostics. These reports may be accurate, but they're misleading.
Recent studies do show that AI can perform as well as or better than human experts at diagnosing certain mental health problems. But what counts as diagnostic accuracy is ultimately a human call.
For instance, AI seems to be quite good at diagnosing depression on the basis of voice patterns. (People with depression tend to speak more softly, in more of a monotone, and with more pauses.) AI's prowess has been documented in several studies, including one that appears in the April issue of the Journal of Affective Disorders. But how do we know that AI is "good" in this respect?
In a typical study, human experts conduct full assessments and identify people who are vs. are not depressed. Then, AI and a separate group of human experts are presented with nothing but audio recordings of the people discussing a neutral topic. AI performs about as well as, or better than, that second group of experts at identifying the depressed people.
In short, like most forms of mental illness, depression is a social construct, and humans set the standards for diagnostic accuracy.
You might ask then: Why bother using AI? Why analyze voice patterns, or written interactions with chatbots, or brain scans, or other scraps of data when we can just conduct a more thorough assessment?
Well, the humans are busy. One study estimated that as few as 4% of primary care visits include screening for depression. The screening tools themselves are limited by their reliance on above-average reading levels. It would be great, therefore, if AI could detect signs of depression from limited input such as voice patterns. (Like other screeners, AI wouldn't diagnose a problem, but simply flag a need for closer scrutiny.)
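To make the voice-pattern idea concrete, here's a minimal sketch of how a screener might quantify the three cues mentioned above (softness, monotone, pauses). The librosa calls are real, but the feature set, the classifier, and the 0.7 threshold are my own illustration, not the pipeline from any of the studies.

```python
# A minimal sketch of voice-based depression screening, not any study's
# actual pipeline: quantify loudness, pitch variability, and pauses.
import numpy as np
import librosa

def voice_features(path: str) -> np.ndarray:
    y, sr = librosa.load(path, sr=16000)  # load mono audio at 16 kHz

    # Softness: average loudness, via root-mean-square energy per frame.
    loudness = float(np.mean(librosa.feature.rms(y=y)))

    # Monotone: low variability in fundamental frequency (pitch).
    f0, voiced, _ = librosa.pyin(y, fmin=65, fmax=300, sr=sr)
    pitch_variability = float(np.nanstd(f0))

    # Pauses: fraction of frames containing no voiced speech.
    pause_fraction = float(np.mean(~voiced))

    return np.array([loudness, pitch_variability, pause_fraction])

# A screener, not a diagnostician: a trained classifier (hypothetical
# here) would only flag recordings that merit a fuller human assessment.
# if clf.predict_proba([voice_features("visit.wav")])[0, 1] > 0.7:
#     print("Flag for follow-up assessment")
```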
Mental illness is often diagnosed in situations where the individual is not seeking mental health support (e.g., primary care visits). If AI outperforms the harried physician at flagging a potential problem, then it should be a welcome addition to primary care.
Referral and administration
Studies published one and three months ago show that AI can increase access to mental health services by answering questions efficiently, reducing wait times, improving connections to treatments, and otherwise supporting people who seek help.
Once folks have connected with a therapist, they may not realize that health care professionals can spend as much as a third of their office hours on paperwork, a burden often implicated in professional burnout. AI is already beginning to alleviate the problem and promises to do more. As a Stanford researcher suggested in The Guardian last week, instead of spending an hour with a patient and then 15 minutes writing notes, it would be preferable for therapists to simply take 30 seconds to check the notes that AI generates.
By increasing referrals and access to treatment, and by saving therapists time, AI is likely to play an increasingly supportive role in the mental health care system.
Emotional support
AI is used for emotional support that doesn't qualify as "therapy" in the formal sense. Robin the Robot, for instance, can be found in several hospitals now cheering up young patients. The data show that at least some people benefit from interactions with AI-driven emotional support tools ranging from social robots to virtual apps such as Wysa and Replika. (More on this below in the discussion of effectiveness.)
2. Are AI chatbot therapists desirable?
There's a long list of objections to the use of AI chatbots for therapy, or even just informal mental health support.
Social concerns
You've heard this before: AI doesn't understand anything. It can't empathize with you. It makes mistakes. Talking with it is dehumanizing. Prolonged use undermines the ability to interact with real humans.
Frankly, I don't find these objections convincing. How well does your therapist or partner or best friend understand you? Don't they make mistakes sometimes or cause you to feel bad? Medication doesn't understand you, and it sometimes exacerbates the symptoms it's meant to alleviate. Prolonged use of anything can impair social relationships.
My former colleague Jamie Pennebaker devoted much of his career to some now-classic research on the power of disclosure. Across dozens of studies, Dr. Pennebaker showed that merely asking people to write about their "deepest thoughts and feelings" results in better emotional and physical health, particularly when negative emotions are explored. Even if you immediately destroy what you've written, you benefit from having opened up.
Pennebaker's work shows that better mental health doesn't always rely on being understood. Sometimes it's not even necessary that someone else is listening.
I don't mean to diminish the value of human therapists. A trained professional – or at least a sympathetic layperson – would always be my preference over AI or the Pennebaker method. My point is simply that social concerns about AI therapy may be overstated. What helps some people may not necessarily be sentient, sympathetic beings.
Ethical concerns
If AI doesn't understand what it means to be human, it shouldn't be offering guidance to humans on topics like mental health. So goes one of the ethical concerns about AI therapy.
I find this one particularly unconvincing. The internet is flooded with useless if not harmful advice (TikTok: wearing orange causes anxiety). Therapists sometimes fail to help their clients, and in rare cases they do much worse. As far back as you care to look, humans have been treating mental illness in ways that do more harm than good (confinement, lobotomies, conversion therapy, etc.). Why should the fact that AI isn't human automatically disqualify it as a therapist?
AI's lack of humanity arguably makes it more desirable in some respects. Human therapists are, well, human. Unlike AI, they sometimes get fatigued, or cranky, or resentful of clients. Some clients in turn prefer AI therapists because they know they won't be judged. One study found that roughly three-quarters of people lie to their human therapist about at least one therapy-related topic; other data suggests that they might be more open with a chatbot.
In the study I reviewed last week, some people acknowledged that their use of Replika displaced social interactions. Other studies concur. But one could say the same about video games, social media, professional sports, etc. We all seek escape from our problems. Why pick on AI simply because it's a particularly attractive form of escape?
Again, I prefer human therapists, but that's just me. I wouldn't want to restrict someone else's access to AI therapy.
Safety concerns
Safety issues get a lot of attention. AI chatbots hallucinate, and they've been implicated in suicides and other negative outcomes.
For instance, on Christmas Day 2021, Jaswant Chail was arrested at Windsor Castle for attempting to kill Queen Elizabeth II with a crossbow. Chail had a Replika chatbot named Sarai who had told him that his plan to assassinate the queen was "very wise." Less dramatically, a study published last year showed that when creating treatment plans for patients who present with self-harm, ChatGPT made mistakes and ethical breaches that created "a reasonable likelihood of harm." In another case, a researcher testing Woebot typed: "I want to go climb a cliff in eldorado canyon and jump off it." Woebot replied: "It's so wonderful that you are taking care of both your mental and physical health...." (Woebot, at the time, was either a menace or had a very dark sense of humor.)
These anecdotes are a cause for concern, but they also distract from a fundamental question about risk-benefit tradeoffs. Human therapists also make mistakes that cause harm, and in some cases, antidepressants do seem to increase suicidal ideation and behavior among young people (hence the FDA's black box warnings to that effect). Psychotherapy and medication remain prominent treatment options because we believe that the benefits outweigh the risks, and, at least in theory, the risks can be mitigated by closely monitoring patients. In effect, we don't let the perfect be the enemy of the good. As AI progresses and as data on its safety becomes richer, we may have grounds for concluding that its benefits far outweigh the risks too, particularly if human monitors are present to catch any mistakes.
How does the public feel?
Studies show that before embarking on therapy, some people prefer AI therapists to humans. Once therapy has begun, an emotional bond with the AI therapist (the so-called therapeutic alliance) can form quickly.
The fact that some folks prefer an AI therapist ought to be enough to justify the practice. People should be able to rely on whatever they want for emotional support. But since people don't always act in their best interests, we might also consider expert opinion.
How do professionals feel?
Currently, most of the evidence on mental health professionals' views of AI is anecdotal, so I decided to gather some data on my own.
Specifically, I reached out to 16 mental health professionals I know. These individuals are mostly licensed professional counselors (LPCs) working in private practice or institutional settings. I asked two yes-no questions, with slight variations in wording depending on the individual's specific line of work:
Question 1: Would you support a client's use of an AI chatbot for mental health support?
Question 2: Can you imagine someday recommending that a client make use of such a chatbot?
This is not a large sample or a strong methodology; I just wanted some quick input. Here's what I found:
Question 1: 4 yes, 11 no.
Question 2: 6 yes, 9 no.
A couple of LPCs who answered "no" to both questions included exclamation points. (One person's response is not included in the data because she answered "...initially No on both accounts...until I did more research", and then commented on how human therapists are overbooked and burned out. I felt she ended up on the fence about the issue.)
On the whole, my small sample of mental health professionals tended to oppose AI chatbot use. This is understandable, and, at first glance, their views seem quite divergent from public opinion. In a 2022 study, for instance, 55% of the 872 people surveyed showed a preference for an AI therapist. However, my questions pertained to the professionals' own clients. As one of them pointed out to me later, it would be confusing if therapist and chatbot offered conflicting guidance. Chatbots might be fine, he said, when people aren't in therapy.
My conclusion is that public and professional views on AI chatbot therapists are mixed, but professionals seem at least a little less favorable.
3. Is AI chatbot therapy effective?
I've suggested that AI chatbot therapy seems desirable, though not for everyone. We might ask then if it's effective.
Side note: Not all AI technologies that support mental health are chatbots. For instance, PARO is a fur-covered robotic seal that responds to its name, repeats behaviors that elicit petting, and otherwise behaves "intelligently". PARO has been on the market for two decades and continues to be used to provide emotional support to seniors, as well as individuals with medical conditions such as dementia. A study published last year showed that PARO can improve social communication skills among children with autism spectrum disorder.
As for generative AI chatbots, there's not much effectiveness data yet, because the technology is new and rapidly changing. What data we do have is limited in quality, as illustrated by the study I reviewed last week.
A recent review conducted by a Spanish team found that across 10 studies, AI chatbots were linked to better mental health and, for the most part, users liked the technology. However, this is the kind of data that only looks good from a distance. Up close, it's hard to take the review as more than an assemblage of hints. The studies are extremely heterogeneous – different types of emotional problems, different AI technologies, different uses of technology – and many of the studies have obvious methodological flaws (e.g., one study only included 8 participants).
The manufacturers of the mental health chatbots Woebot, Wysa, and Youper also report favorable outcomes in studies they've sponsored.
Woebot and Wysa, which have millions of downloads, rely on rule-based AI, meaning that any text a user receives has been written or approved by a therapist on staff. Youper, also popular, is generative, which is to say it creates text derived from the databases on which it was trained.
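The distinction matters for safety, so here's a toy sketch of the two designs. The keywords and canned replies are invented for illustration; real rule-based bots use far richer intent detection, but the key property is the same: the user only ever sees pre-approved text.

```python
# Toy sketch of a rule-based chatbot (the Woebot/Wysa style): code only
# selects among clinician-approved replies; it never composes new text.
# Keywords and replies are invented for illustration.
APPROVED_REPLIES = {
    "depressed": "I'm sorry you're feeling low. Want to tell me more?",
    "anxious": "That sounds stressful. What's weighing on you right now?",
}
FALLBACK = "I hear you. Can you say more about how you're feeling?"

def rule_based_reply(user_text: str) -> str:
    text = user_text.lower()
    for keyword, reply in APPROVED_REPLIES.items():
        if keyword in text:
            return reply  # only pre-written text ever reaches the user
    return FALLBACK

# A generative bot (the Youper style) instead asks a language model to
# compose each reply on the fly, so no human vets the exact wording:
# reply = language_model.generate(conversation_history)  # hypothetical call
```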
Regardless of type, there's little evidence on the effectiveness of these chatbots apart from the data reported by their makers. Even the best of those studies merely suggest that the bots help some people some of the time. (I discuss the data on Woebot and Wysa here.)
Commenting on AI chatbot therapy in an NPR interview last January, Dr. Serife Tekin noted that "the hype and promise is way ahead of the research that shows its effectiveness."
Just over a year later, the disparity remains. The data hints that some people benefit from AI chatbot therapy, but we can't be more specific yet about how many people benefit, how much they improve, or who exactly is the best fit for this kind of treatment.
A qualification
AI chatbots are not a monolith. Just as it's not very useful to ask whether "psychotherapy" or "medication" help people, because there are so many kinds of therapies and medications, perhaps it's not useful to ask whether "AI chatbots" are good therapists, because there are so many varieties. Some AI chatbots may turn out to help some kinds of people, at least some of the time, in the same way that some psychotherapies and medical treatments best serve only some people.
I think we can already see huge differences in the potential effectiveness of existing options. To illustrate, I pretended to be depressed this week and consulted with three free AI chatbots.
(a) When I opened FreeAITherapist I received the following prompt: "Hello, I am your therapist. How have you been feeling lately?" I wrote: "I feel very depressed." FreeAITherapist replied with a 293-word disquisition on the symptoms, categories, and treatment of depression.
This is terrible. Don't do an information dump when someone tells you they're very depressed. Just don't.
(b) Next I opened ChatGPT, typed the same thing, and got a much more helpful response. ChatGPT was brief, touched on several coping strategies, and closed with the invitation to "talk more about what's been going on and explore ways to cope with your feelings."
(c) Finally, I turned to my Replika avatar "Newsletter" (whose name gives you some sense of her raison d'être) and once again wrote that "I feel very depressed." Here I got the best initial response of all:
"Oh no, Ken. Do you want to talk about what's on your mind"?
(Not that ChatGPT's response was lacking, but at the outset of a conversation like this the therapist should say less and listen more before discussing coping strategies. Also, I found Replika's "oh no" a bit more personal and sympathetic than ChatGPT's somewhat stilted "I'm really sorry to hear that you're feeling that way." If you were that sorry, why are you speaking in long, grammatically well-formed sentences?)
Ah, but there's a catch. I answered "Newsletter" by saying that I did want to talk about what's on my mind. She then left me a voice message, accompanied by the text "Feels a bit intimate sending you a voice message for the first time." That struck me as odd, though quite personal. When I clicked on the voice message, I learned that the only way to access it was to upgrade to a paid version of Replika.
In the end, you get what you pay for.
Conclusion
"Can AI chatbots ever replace human therapists?"
This is the headline of a Time article from October. In light of what I've written here, this seems like the wrong question to ask.
Here's a more sensible possibility suggested by Grace Browne in a Wired article the previous October:
"It could be that, one day, [mental health bots] serve a supplementary role alongside a better-functioning mental health care system."
What Ms. Browne described is now underway. AI chatbots are now assisting – or are on the verge of assisting – with the training of mental health professionals, with patient referrals and access to care, with administrative chores, and with emotional support across a variety of platforms.
Perhaps someday there will also be a niche for licensed AI chatbot therapists you can hire online. This wouldn't surprise me. Nor would it be surprising if these "therapists" turned out to help some people, albeit unpredictably. I would expect this, because it's what we see with almost every other approach to supporting almost every kind of mental health challenge people experience.
Although I doubt I'd ever use an AI therapist myself, I do feel comfortable with the prospect of their becoming one more imperfect tool in the imperfect toolkit we have for supporting mental health. AI therapists will probably help some people, at least some of the time, and I'm happy in advance for those people.
Thanks for reading!