Will ChatGPT Increase Cheating?
To get a taste of how statistics has transformed society, check out the news this week: Republican voters favor Ron DeSantis over Donald Trump in 2024. Bivalent boosters will soon be the only COVID-19 vaccines. There's a 7% chance of rain in Dallas tomorrow. LeBron James needs 117 points to become the NBA's all-time leading scorer.
Elections, medicines, weather forecasts, and sports have been around for thousands of years. What's special about the last hundred years or so is the use of statistics to analyze polling data, test the effectiveness and safety of vaccines, predict the weather in probabilistic terms, and track athletic performance over time.
ChatGPT and academic dishonesty
ChatGPT also made the news this week – as it has every week following its public release last November 30. As I'll explain shortly, the development of this chatbot represents yet another way that statistics has been transformative.
On Tuesday, OpenAI, the company that makes ChatGPT, rolled out a new free tool that allows users to check whether a passage was written by an AI program or a human. This feature was created in response to alarm over how convincingly ChatGPT simulates human writing. Among other things, educators are worried that ChatGPT will spark an increase in the prevalence of academic cheating.
In this newsletter I will suggest that we don't need to be overly worried. Although there are anecdotal reports of students using ChatGPT inappropriately, statistical data, along with other considerations, tell us that apps like this won't be increasing academic dishonesty by much in the near future.
What is ChatGPT?
Chat Generative Pre-trained Transformer, or ChatGPT, is an app that can converse with you, answer questions, write essays, compose music, create and debug computer programs, and much more.
"Generative" means that it can generate new content, not just regurgitate it. For example, I asked ChatGPT to write a four-line poem, in the style of Robert Frost, on the power of statistics. Within a few seconds, here's what it came up with:
"In numbers and data, lies power untold, / Revealing the truth, in ways unimagined. / With insights to share, that never grow old. / Statistics shape our world, to be better refined."
It's an awful poem, but surely an original one.
"Pre-trained" refers to the fact that ChatGPT was fed a massive dataset (OpenAI doesn't say how large) consisting of books, articles, blogs, social media posts, and so on. ChatGPT isn't connected to the internet though. Instead, it answers questions on the basis of its training dataset, which includes some hundreds of gigabytes of materials published or posted up through 2021.
Because ChatGPT can process billions of words per second, it's the most powerful AI language processor ever created. However, ChatGPT doesn't "understand" what it reads and writes. Rather, it models the structure of language – which words and phrases tend to be more or less strongly related – thanks in part to the application of probability theory. For example, when I asked ChatGPT why some people prefer dogs over cats, it referred to qualities like loyalty and affection, because in its training data these are among the qualities people most often associate with dogs.
Finally, "Transformer" refers to the way ChatGPT is able to weigh the importance of different text elements and use probabilistic methods to determine which responses are more or less likely to be suitable. For example, ChatGPT knows that as pets go, dogs tend to be larger than cats ("As a general rule, dogs tend to be larger than cats" it told me.). But ChatGPT didn't list size as a reason why some people prefer dogs, presumably because in its training dataset, anything it classified as a positive attitude toward dogs must've rarely included references to size.
You can try ChatGPT yourself here, though as of this writing it's temporarily closed to new registrants.
The problem
Since ChatGPT is free at the moment (and will be available soon for $20 per month), educators are concerned that students will rely on it to complete homework, write term papers, answer test questions, etc. In spite of ChatGPT's occasional errors, the content it generates can be startlingly good and has led to some gloomy predictions about the future of both K-12 and higher education (see, for example, an Atlantic article entitled "The end of high school English").
In order to predict whether ChatGPT will increase academic dishonesty, we have to know how much students currently cheat. Here's where things get a little tricky.
How prevalent is academic cheating?
The clearest message from the scientific literature is that most students cheat in school. What does "most" mean? That varies from study to study. The largest studies of academic dishonesty have been led or inspired by Donald McCabe and the International Center for Academic Integrity. For instance, in surveys of over 70,000 high school students, 95% admitted to having engaged in some form of academic cheating, while in surveys of over 70,000 undergraduates and graduate students, nearly two-thirds admitted to having cheated at least once in college. Studies conducted at specific institutions have yielded lower numbers for high school students (e.g., 65.7% at one high school) and higher numbers for college students (e.g., 74.2% of undergraduates at Texas Tech). Other studies have yielded values somewhere in between, or slightly outside of, those cited here. There aren't clear distinctions in the data between those who have cheated once in their lives, once during some specific time period (e.g., in the past year), and more than once during any time period.
Inconsistencies in the prevalence data stem from the use of different samples, and from the fact that cheating is inherently difficult to measure. We'll never know exactly how much academic cheating there is, because it's a covert, illicit behavior. Objective measures underestimate actual incidence, because not everyone gets caught. More reliable estimates come from self-report studies, such as the ones I mentioned above, but they still underestimate prevalence. People who cheat may understandably decline to participate in studies on academic dishonesty. Among those who do participate, some may not admit to cheating, or they may underreport how often they've done it, because they think it makes them look bad, or they're in a state of denial, or they're afraid of being exposed (even though they've been told their responses are recorded anonymously and kept private). There are also gray areas where, say, over-reliance on a source would be considered plagiarism by the teacher but not by the student.
In sum, most students cheat at least once before they graduate, but the actual prevalence is uncertain and may always remain so.
Will ChatGPT cause a substantial increase in cheating?
I don't think so.
There are at least two changes to worry about here:
(1) An increase in the number of students who cheat at least once in their lives.
(2) An increase in the extent of academic dishonesty among students who already cheat.
Regarding the first concern, the data suggest that most students cheat, at least on occasion, and so we may have a sort of ceiling effect. That is, ChatGPT may not be able to substantially increase the number of students who cheat, because so many have done so already (or would do so by the time they graduate, even if they had no access to AI).
Presumably, some students will not cheat under any circumstances, owing to their moral beliefs and/or a fear of getting caught. The availability of ChatGPT won't affect the behavior of these particular students.
In short, if most students will cheat at least once by the time they graduate, and at least some of the ones who don't would never do so under any circumstances, then a new AI app such as ChatGPT couldn't create very many new cheaters.
Regarding the second concern, students already have access to countless resources that support academic dishonesty, including Google, "homework help" websites like Chegg, paper writing services, and individuals who support contract cheating. To the extent that ChatGPT simply replaces some existing method, there won't be a net increase in academic dishonesty. For example, when a student who wants someone else to write his term paper uses ChatGPT instead of Chegg, we can't say that ChatGPT has increased the prevalence of cheating. Rather, one tool has merely supplanted another, and the overall prevalence remains the same.
My point here is that ChatGPT may not increase the extent of academic dishonesty among students who already cheat. Rather, students may simply rely on the new technology instead of other tools, and as time goes on, the incidence of cheating will be more strongly influenced by other variables known to have an impact. Students are more likely to cheat if they see others doing so, if they're disengaged from school, if they're taking classes that emphasize performance rather than learning, and if they're experiencing academic pressure but insufficient support. These are a few of the dozens of variables that studies have found to be influential.
What teachers can do
Most educators are devoted and deeply resourceful. They will find ways to adjust to new technologies, as they always have. In my view, the most productive approach to managing AI in the classroom consists of two strategies.
First, teachers can incorporate ChatGPT and other AI resources into writing assignments rather than prohibiting them. ChatGPT can be a useful source of information, ideas, organizational suggestions, and other supports for writing. Teachers can help students understand these supports, while encouraging reflection on the limitations of AI, such as inaccuracies, lack of sourcing, and a total absence of comprehension and empathy. So long as teachers clarify which uses of AI are vs. are not appropriate, the technologies can be beneficial.
Second, teachers can adjust the content and format of tests, assignments, and other written work in order to discourage inappropriate use of AI. For instance, to the extent that it's logistically feasible, tests can be administered via paper-and-pencil, or on computer systems that have no internet access, and students can be prohibited from using their cell phones while being tested. The content of remotely-completed assignments can be tweaked in ways that inherently limit the support an AI system can provide. For instance, consider one of Robert Frost's most famous short poems:
Nature’s first green is gold,
Her hardest hue to hold.
Her early leaf’s a flower;
But only so an hour.
Then leaf subsides to leaf.
So Eden sank to grief,
So dawn goes down to day.
Nothing gold can stay.
When I asked ChatGPT for a one-paragraph summary of the poem's meaning, it gave me a clear, concise piece of writing that would surely impress a high school English teacher in spite of a touch of vagueness:
"Nothing Gold Can Stay" is a poem by Robert Frost that explores the fleeting nature of beauty and the inevitability of change. The poem reflects on the idea that everything that is beautiful and pure in life is eventually lost and fades away. The poem emphasizes the importance of cherishing and appreciating the beauty that we experience while it lasts, as it is only temporary. Through its imagery and melancholic tone, the poem serves as a reminder of the transience of life and the need to make the most of our time.
But what if the English teacher had tweaked the assignment slightly and asked for a brief summary that incorporated an example from the student's own life? Since ChatGPT doesn't have its own life, I asked it for a one-paragraph summary that incorporated an example from the life of a high school student. The program added the following:
For a high school student, this might manifest as the realization that their time in high school is coming to an end and that they are transitioning into adulthood. For example, a student who has spent their entire life in high school looking forward to graduation day, may feel a sense of sadness when they finally reach that day and realize that they must leave behind the people and experiences that have shaped their lives.
This is good stuff, but it's quite generic. In order to do well on the assignment, the student would need to reframe this passage in personal terms. More detail is needed about what they'd feel sad to leave behind – and about why certain people and experiences had once seemed "golden". Through the process of understanding what ChatGPT wrote, rephrasing it, and adding concrete details, the student would probably end up learning something.
In short, via the simplest possible tweak to the assignment, the student could be nudged from pure plagiarism to a combination of plagiarism and learning. More sophisticated tweaks could diminish the extent of plagiarism even more. For example, prompting students to explore their feelings in a detailed, nuanced way might encourage them to question the premise of the poem, at least as it applies to high school (most people don't consider their high school experiences to be pure gold; we can all remember moments that we're delighted to leave behind and, hopefully, forget).
Some final words
Statistics have helped make AI programs like ChatGPT possible. Statistics are also helping us develop programs that can detect whether AI has been used to generate text. Those of us concerned about detecting plagiarism may find that there will always be a kind of stalemate, in which better AI text generators are matched by better versions of the programs designed to sniff out their use. Meanwhile, because academic dishonesty is already fairly common, and students have access to a variety of tools, AI programs like ChatGPT may displace older tools without substantially increasing the overall prevalence of cheating. At the same time, teachers can tweak assignments and find creative ways of helping students incorporate AI into their work rather than simply trying to prohibit it.
If it turns out I'm wrong, and AI programs do increase cheating on certain kinds of assignments, I suspect that educators will simply evolve higher and/or different standards for student work. For example, consider again ChatGPT's atrocious poem on the power of statistics:
"In numbers and data, lies power untold, / Revealing the truth, in ways unimagined. / With insights to share, that never grow old. / Statistics shape our world, to be better refined."
In a high school English class, I would expect my students to ask whether statistics truly do reveal "the truth", as opposed to a relatively narrow kind of truth. I'd also expect them to reflect on whether statistics reveal truths in "unimagined" ways. (Statistics tend to reveal information in a pretty routine fashion. LeBron needs 117 points to surpass Kareem. We know that because someone counted.)
In a high school science or AP Statistics class, I would expect my students to be even more critical of this poem. Statistics don't reveal the truth but rather highly technical estimates of the truth. The insights that statistics provide often do grow old as new data yield greater accuracy, and new perspectives spur reinterpretations of existing data. (Kudos though to ChatGPT for recognizing that statistics shape our world. That's the point of these newsletters.)
Thanks for reading!