A College Ranking Controversy
Introduction
Although Americans occasionally complain about our obsession with rankings, I'm grateful that someone took the time to rank "the top 10 most bizarre movie casting ideas", "the top 10 things you really don't want biting you", and "the top 10 creepiest forests around the world".
At the risk of sounding dismissive, I suspect that a ranking of the creepiest forests (around the world) lacks both merit and influence. In contrast, some ranking systems accused of being meritless are extremely influential. Case in point: the annual U.S. News & World Report Best Colleges Rankings. Colleges and universities routinely commit to improving their U.S. News ranking, target money toward achieving that goal, and express pride when their efforts pay off. One can hardly blame them, given the impact of rankings on applicant pools – a one-rank improvement has been linked to a 1% increase in the number of applications, for example, and placing among the top 25 schools in the U.S. News list is associated with a 6 to 10% increase in applications.
This year, Columbia tied for 2nd place in the U.S. News rankings for national universities. Great news for the university, but…in February, Dr. Michael Thaddeus, a Columbia math professor, published a rigorous investigation concluding that Columbia's ranking was based in part on stats that are "inaccurate, dubious, or highly misleading." Columbia, unsurprisingly, does not agree.
In March, news outlets such as the New York Times, the New York Daily News, and the Washington Post picked up this story. The journalists refer to it as a "controversy", which seems fair, but I believe it has the potential to become a "scandal", because Columbia is so clearly in the wrong, yet the university continues to defend itself with weak and seemingly mendacious arguments.
This newsletter presents a few of Dr. Thaddeus's findings. I will also delve into broader objections that he and others have raised about the practice of ranking institutions of higher education.
First, a brief overview of how U.S. News determines an institution's rank.
The U.S. News ranking system
My focus here is on national universities. What follows is highly abbreviated; for more details, see here and here.
U.S. News assigns each institution a score on a 100-point scale, then ranks the scores.
20% of an institution's score reflects its reputation among higher education administrators (presidents, provosts, and admissions deans of other institutions), while 80% of the score comes from data reported by the institution itself.
Half of the institutional data (i.e., 40% of the total score) consists of statistics on student outcomes, including graduation and retention rates (22%), the relationship between predicted and actual graduation rates (8%), graduation rates for Pell Grant recipients (i.e., financially needy students; 5%), and student debt (5%).
The other half of the institutional data consists of faculty resources (20%), financial resources per student (10%), student selectivity (7%), and alumni giving (3%). ("Faculty resources" encompasses variables like class sizes, faculty-student ratios, percentages of faculty with a terminal degree, and so on.)
Although U.S. News explains how it defines each of these variables, some details – particularly the way it combines the data to create a single score for each institution – are not shared with the public.
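To make the arithmetic concrete, here's a minimal sketch (in Python) of how a weighted composite score like this might be computed. The weights follow the breakdown above; the component values and variable names are my own inventions, and the actual U.S. News formula – including how raw data are normalized before weighting – is not public.

```python
# A minimal sketch of a weighted composite score, using the weights
# described above. Component values are invented for illustration;
# U.S. News does not publish its actual formula or normalization.
WEIGHTS = {
    "reputation": 0.20,
    "graduation_and_retention": 0.22,
    "grad_rate_performance": 0.08,   # predicted vs. actual graduation rates
    "pell_graduation": 0.05,
    "student_debt": 0.05,
    "faculty_resources": 0.20,
    "financial_resources": 0.10,
    "student_selectivity": 0.07,
    "alumni_giving": 0.03,
}
assert abs(sum(WEIGHTS.values()) - 1.0) < 1e-9  # weights account for 100%

def composite_score(components: dict[str, float]) -> float:
    """Combine component scores (each already normalized to 0-100)."""
    return sum(WEIGHTS[name] * value for name, value in components.items())

# A hypothetical institution scoring 90 on everything except alumni giving:
example = {name: 90.0 for name in WEIGHTS}
example["alumni_giving"] = 40.0
print(round(composite_score(example), 1))  # 88.5 -- a weak 3% component barely matters
```

Note how little a low-weight component moves the total; that asymmetry is worth keeping in mind when we get to variables like alumni giving below.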
The investigation
Dr. Thaddeus was originally curious about how Columbia had improved its ranking so much more than other top-tier universities have managed since the U.S. News rankings made their debut in 1988 (at that time, Columbia was ranked #18). He also noticed discrepancies between some of the data that Columbia reported to U.S. News and what he observed on campus. When he began to look more closely at that data, problems emerged.
Following is an abridged summary of Dr. Thaddeus's findings. (Malcolm Gladwell discusses some of the findings that I don't cover, but neither of us captures the full granularity of Dr. Thaddeus's work.)
Class size
8% of a school's U.S. News ranking is based on undergraduate class size. Columbia reported, among other things, that 82.5% of its undergraduate classes have fewer than 20 students, while only 8.9% have 50 students or more.
Since Columbia doesn't publicize how it arrived at figures like those, Dr. Thaddeus looked at a complete list of all classes offered by the university in Fall 2019 and Fall 2020 (the time periods used by U.S. News) as well as in Fall 2021. He found that between 62.7% and 66.9% of classes enrolling undergrads at Columbia consisted of fewer than 20 students, a range that's much lower than the 82.5% Columbia reported. (Dr. Thaddeus also found that between 10.6% and 12.4% of classes enrolling undergrads consisted of 50 or more students, which is slightly higher than Columbia's claim of 8.9%.)
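Replicating this kind of tally is straightforward once you have a class list in hand. Here's a sketch in Python, assuming a hypothetical CSV with one row per undergraduate-enrolling class and an "enrollment" column; the file name and column name are mine, not Dr. Thaddeus's (he worked from Columbia's online class directory).

```python
import csv

# Tally the share of classes under 20 students and at or above 50,
# from a hypothetical CSV with one row per class. The file layout is
# an assumption; Dr. Thaddeus worked from Columbia's online directory.
def class_size_shares(path: str) -> tuple[float, float]:
    with open(path, newline="") as f:
        sizes = [int(row["enrollment"]) for row in csv.DictReader(f)]
    under_20 = sum(1 for s in sizes if s < 20) / len(sizes)
    fifty_plus = sum(1 for s in sizes if s >= 50) / len(sizes)
    return under_20, fifty_plus

# e.g., class_size_shares("fall_2019_classes.csv")
# Dr. Thaddeus's tallies land around 0.63-0.67 for under_20,
# versus the 0.825 Columbia reported to U.S. News.
```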
If you've spent enough time in higher ed, you can imagine scenarios that muddy the waters a bit (e.g., a grad-only course that one undergrad is given special permission to take), but there's not much room for mud here. Dr. Thaddeus was quite meticulous about considering all possibilities. As he concluded in one of his footnotes: "There does not seem to be any way of slicing and dicing the data to get anywhere near Columbia's reported figures."
In the New York Times and New York Daily News, Columbia officials said that they stand by the data they provided to U.S. News. The only substantive comment they made regarding class size was that the enrollment numbers Dr. Thaddeus obtained from the class directory "aren't certified by the registrar and may deviate from the official count." I find this unconvincing. It's hard to imagine enough deviations of this sort to fully explain the discrepancy between Dr. Thaddeus's calculations and Columbia's reported figures. In any case, since the university has access to registrar-certified stats as well as "official counts", they could've just stated directly that they reported the former to U.S. News, while Dr. Thaddeus tallied the latter. Instead, they indicated vaguely (and thus suspiciously) that there "may" be a discrepancy.
Faculty with terminal degrees
3% of an institution's ranking is based on the percentage of full-time faculty with a Ph.D. or some other terminal degree (i.e., the highest degree in their field).
Columbia reported that 100% of its full-time faculty have a terminal degree. However, of the 958 full-time faculty for whom information was available, Dr. Thaddeus identified 66 whose highest degree is a bachelor's or master's. Although information wasn't readily available for full-time professional school faculty, Dr. Thaddeus noted that even assuming all of them have terminal degrees (which seems unlikely), the highest percentage for the university overall would be 96%. That figure, most probably an overestimate, is still less than the 100% that Columbia reported.
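The upper-bound arithmetic here is simple. A sketch, where the professional-school headcount is a placeholder (I don't have Columbia's actual totals; Dr. Thaddeus worked from the real figures):

```python
# Upper bound on the share of full-time faculty with terminal degrees,
# granting the most generous assumption that every professional-school
# faculty member has one. The professional-school headcount below is a
# placeholder; Dr. Thaddeus used the actual totals.
examined = 958        # faculty whose highest degree could be checked
non_terminal = 66     # of those, highest degree is a bachelor's or master's
professional = 700    # hypothetical professional-school headcount

total = examined + professional
upper_bound = (total - non_terminal) / total
print(f"{upper_bound:.0%}")  # 96% -- still short of the reported 100%
```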
Columbia's response on this point was unconvincing and, in my view, mendacious. For example, according to the New York Times:
"Columbia officials said that Dr. Thaddeus was fixated on the Ph.D., but that in many fields — like writing — that might not be the relevant degree. The 100 percent figure was rounded up, officials said, and they believed they were allowed some leeway in deciding what constituted a terminal degree for particular fields."
Let's break this down:
(a) Dr. Thaddeus was not "fixated" on the Ph.D.; this just happens to be the terminal degree in most fields.
(b) Assuming that Columbia officials actually used the term "relevant", that's not the same as "terminal". You may believe (as many do) that a native speaker of some foreign language can be eminently qualified to teach that language at the college level, even with a master's degree – i.e., you may believe that this person has the "relevant" degree – but it's still not a terminal degree. The person could still, in theory, get a Ph.D. As far as I can tell, Dr. Thaddeus's analyses are grounded in appropriate definitions of "terminal." There may be some squishiness in how to define this term for some fields, but not much.
(c) "Rounding up", with respect to data supplied to U.S. News, means rounding up to the next whole number. 96% does not "round up" to 100%. It's clear from data reported by other universities that they did not round up this way (e.g., you can see percentages between 85% and 89%).
(d) The notion that institutions were "allowed some leeway in deciding what constituted a terminal degree" falsely implies that U.S. News provided flexible criteria for what constitutes a terminal degree, and that Dr. Thaddeus and Columbia disagreed on how to interpret those criteria. Actually, no criteria were given. This statement, combined with the claim that Dr. Thaddeus was fixated on the Ph.D., gives me the impression that university officials were hinting that the professor doesn't quite "get" how degrees work. However, it's quite clear from his investigation that he understands the relevant distinctions. (No surprise there; he's been a professor for 24 years, three of which were spent as a department chair at Columbia.)
Student outcomes
Under the heading of graduation and retention, 17.6% of a school's ranking is based on the overall average six-year graduation rate (i.e., the percentage of undergraduates who graduate within six years of matriculation). Columbia reported a 96% six-year graduation rate overall.
However, consistent with U.S. News requirements, this statistic excludes transfer students. At Columbia this is a particularly large group – Dr. Thaddeus notes that in 2020, for example, over 30% of incoming students at Columbia were transfers, a higher percentage than reported by any other Tier I private university. According to the most recently available data, the six-year graduation rate for transfer students at Columbia was 85%.
Unlike Dr. Thaddeus's other concerns, this one doesn't question the accuracy of Columbia's data. That 96% figure may indeed be correct. The concern here is that the U.S. News ranking system excludes transfer students from the six-year graduation rate calculation. This yields a particularly misleading view of Columbia's performance with respect to undergraduate graduation rates, since the university has proportionally more transfer students than its main competitors do, but graduation rates for its transfer students are lower than for non-transfer students (which is not the case at all universities). In short, there's a problem with how U.S. News operationally defines six-year graduation rates, and that problem helped boost Columbia's ranking.
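To see how much this exclusion can matter, here's some back-of-the-envelope arithmetic in Python, treating the figures above as if they described a single blended cohort (a simplification on my part, since the 30% transfer share and the graduation rates come from different reporting years):

```python
# Blend the reported rates as if they described one cohort (a simplification;
# the transfer share and the graduation rates come from different years).
transfer_share = 0.30     # share of incoming Columbia students who are transfers
rate_transfers = 0.85     # six-year graduation rate for transfer students
rate_reported = 0.96      # the rate Columbia reported (transfers excluded)

blended = transfer_share * rate_transfers + (1 - transfer_share) * rate_reported
print(f"{blended:.1%}")   # 92.7% -- noticeably below the headline 96%
```

Under these assumptions, counting transfers would shave roughly three points off the headline rate – a meaningful difference when top schools are separated by a point or two.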
Other findings
Dr. Thaddeus also pointed to inaccuracies in how Columbia reported percentages of full-time faculty, student-faculty ratios, and spending per student. For details, see his investigation. Suffice it to say, Columbia seems to have been in the wrong in each case. In all, Dr. Thaddeus found that 13% of Columbia's 2nd place ranking is based on inaccurate stats, an additional 10% stems from questionable stats, and 22.6% represents stats that are inflated owing to the exclusion of transfer students.
Why rank colleges and universities?
U.S. News provides a brief rationale for ranking institutions of higher education. What they say in one of their FAQs is that because an undergraduate degree is expensive and important...
"...students and their families should have as much information as possible about the comparative merits of the educational programs at America's colleges and universities. The data U.S. News gathers on colleges – and the rankings of the schools that arise from this data – serve as an objective guide for students and their parents to compare the academic quality of schools."
This is stunningly unpersuasive. A ranking system does not provide "as much information as possible about the comparative merits" of schools. Rather, it assigns each school a number representing that school's position on a list – a list in which each number is meant to represent the entire construct of academic quality.
(Actually, U.S. News does provide useful comparative data on many variables – location, size, tuition, academic programs, etc. It's a great resource in that respect. But rankings are a different creature, and misleading at best.)
What's wrong with the U.S. News ranking system?
1. Academic quality is represented by a single number.
Ranking isn't inherently bad. It would be fine to rank schools on "objective" variables – one ranking for tuition, one for total enrollments, and so on. Those would be fairly accurate and possibly useful rankings, so long as we kept in mind that they only tell part of the story (e.g., schools with the same tuition may differ in their financial aid practices). However, what's being ranked by U.S. News is "academic quality" – this is the term the organization repeatedly uses. Can you summarize academic quality by means of a single number? What is academic quality anyway?
2. The choice and weighting of variables don't represent academic quality well.
U.S. News doesn't define academic quality in any consistent way – which is a good thing, for them, because I can't imagine how any definition of the term could justify the variables they choose to include and exclude, or the weightings they assign to each variable. 5% of a school's ranking is determined by student debt among graduates, for example. Why is this variable included, and why is it weighted at exactly 5%? U.S. News doesn't say.
Even variables that seem relevant don't hold up well under scrutiny. For example, class size accounts for 8% of a school's ranking. This makes sense if you assume that, ceteris paribus, smaller classes provide better academic experiences than larger ones do, but one could also ask: Why 8%? Why not 7% or 9%? And how do you account for the fact that in some departments, smaller classes shut many students out of the courses they need because of enrollment caps? U.S. News doesn't address these kinds of questions.
3. Ease of measurement rather than relevance determines the choice of ranking variables.
The fact that class size is included, but quality of instruction isn't, reminds us that U.S. News measures what's easy to measure, not necessarily what's most relevant. A class of 40 taught by a capable instructor is vastly preferable to a class of 10 taught by a weak one. (In fact, as a former professor of education, I feel like I should stamp my foot here: How can we trust an assessment of academic quality that doesn't include a direct measurement of quality of instruction? The closest U.S. News gets to that are measures of spending on students, student-faculty ratios, etc., but these aren't very close at all.)
Columbia professor Orhan Pamuk has no earned degree higher than a B.A. in journalism. He doesn't have a terminal degree in his field and thus, according to U.S. News, his presence on the Columbia faculty diminishes (by some infinitesimal amount) the academic quality of the institution. However, Mr. Pamuk does have a Nobel Prize in Literature. I mention this to illustrate that although it's relatively easy to count how many faculty have terminal degrees, a more meaningful proxy for academic quality would take into account each professor's eminence in their field. Like quality of instruction, "eminence" is hard to quantify (you wouldn't want to just tally prizes), but it's likely to be more informative than the simple dichotomy between possessing vs. not possessing a terminal degree.
4. The rankings are guilty of false precision.
False precision occurs when the precision with which results are reported implies more accuracy than could've been attained (except by luck). If a researcher concludes from a large study that 47.21% of undergraduates commit at least one act of academic dishonesty during their college careers, this percentage implies more accuracy than it can deliver, because (a) not everyone admits to cheating or gets caught, and (b) the actual percentage would almost certainly vary from sample to sample and year to year.
U.S. News operationally defines most of its ranking variables (sometimes in complex ways), assigns these variables specific weights, and applies a complex formula to combine the variables and calculate a score for each school. Precision can be seen in every step of this process. But there's very little explanation of how this process taps into academic quality.
To illustrate my concern, I'll focus on just one example: alumni giving, which accounts for 3% of a school's ranking. Here's part of how U.S. News defines it:
"This is the nonweighted mean percentage of undergraduate alumni of record who donated money to the college or university. The percentage of alumni giving serves as a proxy for how satisfied students are with the school...
The alumni giving rate is calculated by dividing the number of alumni donors during a given academic year by the number of alumni of record for that same year. The two most recent years of alumni giving rates available are averaged...".
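Per that definition, the calculation itself is trivially precise. A minimal sketch in Python, with invented counts:

```python
# The alumni giving rate as U.S. News defines it above: donors divided by
# alumni of record, averaged over the two most recent years. The counts
# below are invented for illustration.
def giving_rate(donors: int, alumni_of_record: int) -> float:
    return donors / alumni_of_record

year_1 = giving_rate(donors=9_000, alumni_of_record=100_000)   # 9.0%
year_2 = giving_rate(donors=11_000, alumni_of_record=100_000)  # 11.0%
two_year_average = (year_1 + year_2) / 2
print(f"{two_year_average:.1%}")  # 10.0% -- carries 3% of a school's ranking
```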
The precision of this method and its outcomes buys us nothing, because it doesn't tap into academic quality. If top-tier schools tend to enroll more affluent students, and if these schools have stronger alumni offices and networks, then they're likely to get more donations. As for alumni giving being a proxy for student satisfaction, even if this were the case, student satisfaction may not be a proxy for academic quality. (Or perhaps it only works well as a proxy at those top-tier schools where students are more likely to be highly academically motivated.) I'm speculating a lot here, which is a bad sign. If alumni giving were clearly an indicator of academic quality, I'd have less to say.
Is a better approach to ranking possible?
Probably not.
A set of rankings represents what statisticians call an ordinal scale, meaning that the divisions of the scale are arranged from low to high on some dimension. Practically speaking, one could impose many different ranking systems on the same data. With respect to academic quality, you could give each school a grade ranging from A to F. Or you could classify each school as high, medium, or low. U.S. News happens to use whole numbers for its rankings. Thus, it's committed to the assumption that the school ranked #27 is better, academically speaking, than the school ranked #28. This rankles critics (no pun intended), because we all know that academic quality, whatever that is exactly, is a very complex sort of thing. For example, in my opinion it doesn't make much sense to consider academic quality apart from the individual student. A student with a particular set of interests and needs may have a much better academic experience at school #28 than at school #27 – or even school #10.
It's tempting to say that the problem with the U.S. News rankings is their use of whole numbers. Saying that #28 is better than #27 seems ridiculous, but if, for example, you collapsed across whole numbers and classified each school as relatively high, medium, or low in academic quality, you might find it informative to contrast those in the high versus low categories. Princeton (ranked #1) is better than Lesley University (ranked #249), right?
Well, yes and no. If you don't consider individual student needs, then yes, Princeton is academically superior to Lesley. That's easily demonstrated. But a particular student who flourishes at Lesley, acquiring knowledge and intellectual perspective and social maturity there, might gain virtually nothing from Princeton if they found the classes and classmates so challenging that they experienced constant anxiety. I can tell you too, because I live nearby, that the academic merits of Lesley include close proximity to Harvard (which basically surrounds Lesley) and MIT (about a 20-minute walk). That #249 ranking doesn't capture all of the academic and cultural opportunities available to Lesley students. So, again, it's fine to say Princeton is the better school, generally speaking, but it's not better for everyone, and Lesley offers opportunities (to everyone) that Princeton doesn't.
Conclusion
Columbia is a great university, as Dr. Thaddeus notes, but it can be faulted for (a) participating in a farcical game, and then (b) failing to abide by the rules of that game. The U.S. News rankings, and the data that helped Columbia achieve a #2 ranking, are grounded in statistics that are inaccurate, misleading, and/or irrelevant.
Students, families, and academic counselors have many resources to draw upon for comparative information on colleges. I would include U.S. News materials among those resources. But the rankings that U.S. News and others generate are deeply misleading.