A few thoughts on CSRankings.org

(Warning: nerdy inside-baseball academic blog post follows. If you’re looking for exciting crypto blogging, try back in a couple of days.)

If there’s one thing that academic computer scientists love (or love to hate), it’s comparing themselves to other academics. We don’t do what we do for the big money, after all. We do it — in large part — because we’re curious and want to do good science. (Also there’s sometimes free food.) But then there’s a problem: who’s going to tell is if we’re doing good science?

To a scientist, the solution seems obvious. We just need metrics. And boy, do we get them. Modern scientists can visit Google Scholar to get all sorts of information about their citation count, neatly summarized with an “H-index” or an “i10-index”. These metrics aren’t great, but they’re a good way to pass an afternoon filled with self-doubt, if that’s your sort of thing.

But what if we want to do something more? What if we want to compare institutions as well as individual authors? And even better, what if we could break those institutions down into individual subfields? You could do this painfully on Google Scholar, perhaps. Or you could put your faith in the abominable and apparently wholly made-up U.S. News rankings, as many academics (unfortunately) do.

Alternatively, you could actually collect some data about what scientists are publishing, and work with that.

This is the approach of a new site called “Computer Science Rankings”. As best I can tell, CSRankings is largely an individual project, and doesn’t have the cachet (yet) of U.S. News. At the same time, it provides researchers and administrators with something they love: another way to compare themselves, and to compare different institutions. Moreover, it does so with real data (rather than the Ouija board and blindfold that U.S. News uses). I can’t see it failing to catch on.

And that worries me, because the approach of CSRankings seems a bit arbitrary. And I’m worried about what sort of things it might cause us to do.

You see, people in our field take rankings very seriously. I know folks who have moved their families to the other side of the country over a two-point ranking difference in the U.S. News rankings — despite the fact that we all agree those are absurd. And this is before we consider the real impact on salaries, promotions, and awards of rankings (individual and institutional). People optimize their careers and publications to maximize these stats, not because they’re bad people, but because they’re (mostly) rational and that’s what rankings inspire rational people do.

To me this means we should think very carefully about what our rankings actually say.

Which brings me to the meat of my concerns with CSRankings. At a glance, the site is beautifully designed. It allows you to look at dozens of institutions, broken down by CS subfield. Within those subfields it ranks institutions by a simple metric: adjusted publication counts in top conferences by individual authors.

The calculation isn’t complicated. If you wrote a paper by yourself and had it published in one of the designated top conferences in your field, you’d get a single point. If you wrote a paper with a co-author, then you’d each get half a point. If you wrote a paper that doesn’t appear in a top conference, you get zero points. Your institution gets the sum-total of all the points its researchers receive.

If you believe that people are rational actors optimize for rankings, you might start to see the problem.

First off, what CSRankings is telling us is that we should ditch those pesky co-authors. If I could write a paper with one graduate student, but a second student also wants to participate, tough cookies. That’s the difference between getting 1/2 a point and 1/3 of a point. Sure, that additional student might improve the paper dramatically. They might also learn a thing or two. But on the other hand, they’ll hurt your rankings.

(Note: currently on CSRankings, graduate students at the same institution don’t get included in the institutional rankings. So including them on your papers will actually reduce your school’s rank.)

I hope it goes without saying that this could create bad incentives.

Second, in fields that mix systems and theory — like computer security — CSRankings is telling us that theory papers (which typically have fewer authors) should be privileged in the rankings over systems papers. This creates both a distortion in the metrics, and also an incentive (for authors who do both types of work) to stick with the one that produces higher rankings. That seems undesirable. But it could very well happen if we adopt these rankings uncritically.

Finally, there’s this focus on “top conferences”. One of our big problems in computer science is that we spend a lot of our time scrapping over a very limited number of slots in competitive conferences. This can be ok, but it’s unfortunate for researchers whose work doesn’t neatly fit into whatever areas those conference PCs find popular. And CSRankings gives zero credit for publishing anywhere but those top conferences, so you might as well forget about that.

(Of course, there’s a question about what a “top conference” even is. In Computer Security, where I work, CSRankings does not consider NDSS to be a top conference. That’s because only three conferences are permitted for each field. The fact that this number seems arbitrary really doesn’t help inspire a lot of confidence in the approach.)

So what can we do about this?

As much as I’d like to ditch rankings altogether, I realize that this probably isn’t going to happen. Nature abhors a vacuum, and if we don’t figure out a rankings system, someone else will. Hell, we’re already plagued by U.S. News, whose methodology appears to involve a popcorn machine and live tarantulas. Something, anything, has to be better than this.

And to be clear, CSRankings isn’t a bad effort. At a high level it’s really easy to use. Even the issues I mention above seem like things that could be addressed. More conferences could be added, using some kind of metric to scale point contributions. (This wouldn’t fix all the problems, but would at least mitigate the worst incentives.) Statistics could perhaps be updated to adjust for graduate students, and soften the blow of having co-authors. These things are not impossible.

And fixing this carefully seems really important. We got it wrong in trusting U.S. News. What I’d like is this time for computer scientists to actually sit down and think this one out before someone imposes a ranking system on top of us. What behaviors are we trying to incentivize for? Is it smaller author lists? Is it citation counts? Is it publishing only in a specific set of conferences?

I don’t know that anyone would agree uniformly that these should be our goals. So if they’re not, let’s figure out what they really are.