Word Counts over Time

word usage
history of analytic
Author
Affiliation

University of Michigan

Published

February 7, 2025

Abstract

Some graphs showing how word frequencies changed between 1980 and 2019 in twenty leading philosophy journals.

Recently I downloaded data showing how often words were used in articles in twenty prominent philosophy journals. Mostly this came from JSTOR’s Data for Researchers program, though in a few cases I downloaded the papers manually and used pdftools to extract the words. The journals I’m using are listed in Table 1.

Table 1: The journals used in this post

| Journal | Articles | Avg. Length (words) |
|---|---|---|
| American Philosophical Quarterly | 1170 | 7857.2 |
| Analysis | 2225 | 2632.9 |
| Australasian Journal of Philosophy | 1361 | 7864.5 |
| British Journal for the Philosophy of Science | 1056 | 10071.4 |
| Canadian Journal of Philosophy | 1067 | 9993.7 |
| Ethics | 1037 | 10406.9 |
| Journal of Philosophical Logic | 1007 | 10246.4 |
| Journal of Philosophy | 1181 | 8995.6 |
| Linguistics and Philosophy | 711 | 14176.4 |
| Mind | 1068 | 9755.0 |
| Monist | 1255 | 8128.5 |
| Noûs | 1152 | 10611.6 |
| Pacific Philosophical Quarterly | 1011 | 9363.2 |
| Philosophical Quarterly | 1079 | 7725.2 |
| Philosophical Review | 510 | 13816.7 |
| Philosophical Studies | 3407 | 8352.3 |
| Philosophy & Public Affairs | 521 | 11474.5 |
| Philosophy and Phenomenological Research | 2143 | 8384.2 |
| Philosophy of Science | 1955 | 7757.7 |
| Synthese | 3653 | 10010.5 |

This post just runs through a few words whose frequency has notably changed over the years between 1980 and 2019. For each word I’ll display three graphs.

  1. How often that word appears as a percentage of all words in all twenty journals that year. That’s the simplest measure of word frequency, but it can be a bit misleading. Sometimes it looks like a word is being used very widely, but it’s just that it appears several hundred times in one article. So I’ll supplement it with two more graphs.
  2. The second graph measures how often that word is in an article at all. More precisely, it takes every word-article pair in which the word appears at least once in the article and, for each year, works out the percentage of those pairs whose word is the target word. [1]
  3. The third graph measures how often the word appears frequently, meaning ten or more times, in an article. It’s interesting to know not just how many articles mention Kant, but how many articles include more than a passing reference to Kant. Any threshold here is arbitrary, but I’ve picked ten. As with the second graph, I’ve taken for each year every word-article pair in which the word appears ten times or more in the article, then for each word and year calculated the proportion of that year’s pairs which have this word as the first element.

[1] Why not just show the percentage of articles which contain the word? Because articles have been getting much longer, and the more convoluted thing I’m doing controls for that.
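To make the three measures concrete, here is a minimal sketch of how they could be computed. It is not the code behind the graphs. The toy rows, the function name `yearly_measures`, and the (year, article, word, count) layout are all assumptions made for illustration.

```python
from collections import defaultdict

# Hypothetical toy data: (year, article_id, word, count_in_article).
rows = [
    (2004, "a1", "lewis", 12),
    (2004, "a1", "world", 3),
    (2004, "a2", "lewis", 1),
    (2004, "a2", "cause", 10),
    (2005, "a3", "world", 20),
    (2005, "a3", "cause", 2),
]

def yearly_measures(rows, target, threshold=10):
    """Return {year: (share_of_words, share_of_pairs, share_of_frequent_pairs)}."""
    total_words = defaultdict(int)        # all word tokens in the year
    target_words = defaultdict(int)       # tokens of the target word
    pairs = defaultdict(int)              # word-article pairs with count >= 1
    target_pairs = defaultdict(int)       # ... where the word is the target
    freq_pairs = defaultdict(int)         # word-article pairs with count >= threshold
    target_freq_pairs = defaultdict(int)  # ... where the word is the target

    for year, _article, word, count in rows:
        if count < 1:
            continue
        total_words[year] += count
        pairs[year] += 1
        if count >= threshold:
            freq_pairs[year] += 1
        if word == target:
            target_words[year] += count
            target_pairs[year] += 1
            if count >= threshold:
                target_freq_pairs[year] += 1

    result = {}
    for year in total_words:
        frequent = freq_pairs[year]
        result[year] = (
            target_words[year] / total_words[year],
            target_pairs[year] / pairs[year],
            target_freq_pairs[year] / frequent if frequent else 0.0,
        )
    return result

measures = yearly_measures(rows, "lewis")
```

The denominators for the second and third measures are counts of pairs, not counts of articles. A longer article contributes more pairs to the denominator, which is how these measures allow for articles getting longer.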

In practice, the graphs look like this. (The word being tracked is in the title of the graph.)

Frequency data for ‘Lewis’

The high mark on the left graph is in 2004. That year there were 1,368 occurrences of “Lewis” in the journals, out of a total of 6,233,344 words. So the word “Lewis” appears a little over 2 times every 10,000 words.

The high mark on the middle graph is in 2019. That year the word “Lewis” appears in 261 articles. There are 2,130,526 word-article pairs that year such that the word appears at least once in the article. So a bit over 1.2 of those pairs per 10,000 involve “Lewis”.

The high mark on the right graph is back in 2004. That year the word “Lewis” appears ten times or more in 38 articles. There are 92,126 word-article pairs that year such that the word appears at least ten times in the article. So a bit over 4 of those pairs per 10,000 involve “Lewis”.
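The three rates quoted above are easy to check. This short sketch uses only the figures already given in the text; the helper name `per_10k` is mine.

```python
def per_10k(hits, total):
    """Rate of `hits` per 10,000 units of `total`."""
    return 10_000 * hits / total

# Left graph, 2004: occurrences of "Lewis" per 10,000 words.
left = per_10k(1_368, 6_233_344)

# Middle graph, 2019: "Lewis" pairs per 10,000 word-article pairs.
middle = per_10k(261, 2_130_526)

# Right graph, 2004: "Lewis" pairs per 10,000 ten-or-more pairs.
right = per_10k(38, 92_126)

print(f"{left:.2f} {middle:.2f} {right:.2f}")
```

The results come out at roughly 2.2, 1.2 and 4.1, which matches the "little over 2", "bit over 1.2" and "bit over 4" in the text.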

So those are the kinds of things we’ll show with these graphs. The results show us something about changes in philosophy over time. Some of the changes are purely stylistic, as with all the talk in recent years about robust challenges. But some reflect more substantive changes in which philosophers, and which philosophical ideas, are being talked about.

If you’d like to see more of these graphs, they are pretty easy to generate once the setup is done, so let me know and I’ll either post them to social media, or add some more to this post.

I’ll start with some familiar names. The graphs for “Lewis” are a bit misleading, because you might think they are exclusively about David Lewis. And while they are largely about him, there are plenty of references to Karen Lewis, and Peter Lewis, and several other philosophers. So I’ll try to stick to names where it is clearer which philosopher is being referred to. This is hard: “Fodor” usually means Jerry, but sometimes means Janet; “Williamson” usually means Tim, but often means Jon, and sometimes has another meaning. Still, it isn’t worth showing the graphs for “Smith”, “Jones”, “Anderson” or (though this is a bit of a special case) “Harman”. It’s also not worth showing graphs for people whose name is a frequently used word, like “Fine” or “Field”. Because the data I’m using treats the two halves of a word that is broken across lines as separate words, it is also impossible to get an accurate count of how often “Sider” is used; too often it is the second half of a word that was hyphenated across a line. [2]

[2] I don’t think there were 36 articles talking about Ted Sider in 1980, but the word ‘sider’ turns up in the database 36 times for that year.
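Here is a small sketch of how the hyphenation problem produces a spurious ‘sider’. The sample sentence and both function names are made up for illustration, and this is not how the JSTOR data was actually tokenized. The second function shows one common repair that is only possible when the raw page text is available, which it is not for pre-counted word data.

```python
import re

# Hypothetical page text in which "consider" is hyphenated across a line break.
page = "We should con-\nsider the view that Sider defends."

def naive_tokens(text):
    """Split on anything that is not a letter, so 'con-' and 'sider' come apart."""
    return [t.lower() for t in re.findall(r"[A-Za-z]+", text)]

def dehyphenated_tokens(text):
    """Rejoin words split by a hyphen at a line end, then tokenize as before."""
    joined = re.sub(r"(\w)-\n(\w)", r"\1\2", text)
    return naive_tokens(joined)

naive_count = naive_tokens(page).count("sider")         # counts the fragment too
fixed_count = dehyphenated_tokens(page).count("sider")  # counts only the name
```

The repair has a cost of its own. A genuine compound that happens to break at a line end, such as “truth-” followed by “maker”, would be fused into a single token.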

The names that follow are ordered by the average year in which the name appears. The average is weighted by word counts, so it is sometimes thrown off by a word appearing very often. (That’s why, I think, ‘Sider’ appears so early.) It’s a very male list, in part because of how imbalanced philosophy is, and in part because of how many prominent women share names with prominent men. (This is sometimes, but definitely not always, because they are related.)
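One reading of that ordering is a mean year weighted by how many times the word occurs in each year. The sketch below assumes that reading. The function name and the two toy count tables are hypothetical.

```python
def weighted_mean_year(counts_by_year):
    """Mean year, weighting each year by the word's count in that year."""
    total = sum(counts_by_year.values())
    return sum(year * n for year, n in counts_by_year.items()) / total

# Hypothetical counts: steady use versus one early year with a large spike.
steady = {1980: 10, 2000: 10, 2019: 10}
spiky = {1980: 100, 2000: 10, 2019: 10}

steady_mean = weighted_mean_year(steady)
spiky_mean = weighted_mean_year(spiky)
```

With steady use the mean lands near 2000. With the single large count in 1980 it is pulled back to the mid-1980s, which is the kind of distortion a burst of hyphenation fragments could cause.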

Figure 1: Frequency data for ‘Popper’
Figure 2: Frequency data for ‘Kripke’
Figure 3: Frequency data for ‘Putnam’
Figure 4: Frequency data for ‘Chisholm’
Figure 5: Frequency data for ‘Jackson’
Figure 6: Frequency data for ‘Schiffer’
Figure 7: Frequency data for ‘Fodor’
Figure 8: Frequency data for ‘Marx’
Figure 9: Frequency data for ‘Gauthier’
Figure 10: Frequency data for ‘Hume’
Figure 11: Frequency data for ‘Wittgenstein’
Figure 12: Frequency data for ‘Dummett’
Figure 13: Frequency data for ‘Husserl’
Figure 14: Frequency data for ‘Frege’
Figure 15: Frequency data for ‘Nozick’
Figure 16: Frequency data for ‘Wright’
Figure 17: Frequency data for ‘Rawls’
Figure 18: Frequency data for ‘McDowell’
Figure 19: Frequency data for ‘Brandom’
Figure 20: Frequency data for ‘Stalnaker’
Figure 21: Frequency data for ‘DeRose’
Figure 22: Frequency data for ‘Langton’
Figure 23: Frequency data for ‘Chalmers’
Figure 24: Frequency data for ‘Haslanger’
Figure 25: Frequency data for ‘Thomson’
Figure 26: Frequency data for ‘Millikan’
Figure 27: Frequency data for ‘Nussbaum’
Figure 28: Frequency data for ‘Korsgaard’
Figure 29: Frequency data for ‘Pritchard’
Figure 30: Frequency data for ‘Schaffer’
Figure 31: Frequency data for ‘Scanlon’
Figure 32: Frequency data for ‘Weatherson’
Figure 33: Frequency data for ‘Hawthorne’
Figure 34: Frequency data for ‘MacFarlane’
Figure 35: Frequency data for ‘Fricker’

Next I’ll run through some words associated with philosophical topics. Again, there is sometimes a danger of multiple meanings. “Conception” is used both in papers about conceptual analysis and in papers about abortion. “Internalism” is used with many different meanings. As with the names, these are ordered by the average year in which the word appears.

Figure 36: Frequency data for ‘capitalist’
Figure 37: Frequency data for ‘utilitarian’
Figure 38: Frequency data for ‘connectionist’
Figure 39: Frequency data for ‘statement’
Figure 40: Frequency data for ‘person’
Figure 41: Frequency data for ‘reference’
Figure 42: Frequency data for ‘description’
Figure 43: Frequency data for ‘conception’
Figure 44: Frequency data for ‘physical’
Figure 45: Frequency data for ‘analysis’
Figure 46: Frequency data for ‘individual’
Figure 47: Frequency data for ‘cause’
Figure 48: Frequency data for ‘matter’
Figure 49: Frequency data for ‘basic’
Figure 50: Frequency data for ‘utterances’
Figure 51: Frequency data for ‘priori’
Figure 52: Frequency data for ‘vague’
Figure 53: Frequency data for ‘utterance’
Figure 54: Frequency data for ‘grounds’
Figure 55: Frequency data for ‘supervenience’
Figure 56: Frequency data for ‘causal’
Figure 57: Frequency data for ‘intrinsic’
Figure 58: Frequency data for ‘knows’
Figure 59: Frequency data for ‘knowledge’
Figure 60: Frequency data for ‘vagueness’
Figure 61: Frequency data for ‘ground’
Figure 62: Frequency data for ‘internalist’
Figure 63: Frequency data for ‘intuition’
Figure 64: Frequency data for ‘statue’
Figure 65: Frequency data for ‘internalism’
Figure 66: Frequency data for ‘evidence’
Figure 67: Frequency data for ‘externalist’
Figure 68: Frequency data for ‘reasons’
Figure 69: Frequency data for ‘externalism’
Figure 70: Frequency data for ‘epistemic’
Figure 71: Frequency data for ‘disagreement’
Figure 72: Frequency data for ‘normative’
Figure 73: Frequency data for ‘zombie’
Figure 74: Frequency data for ‘doxastic’
Figure 75: Frequency data for ‘contextualist’
Figure 76: Frequency data for ‘contextualism’
Figure 77: Frequency data for ‘grounding’
Figure 78: Frequency data for ‘credence’
Figure 79: Frequency data for ‘credences’
Figure 80: Frequency data for ‘slur’

Finally, I’ll go through some words that I don’t think really tell us much about the content of philosophy papers, but do tell us a lot about their style. I hadn’t really noticed how stuffy some of the language in 1980s philosophy journals was, or how combative the more recent language has become. (These are ordered more loosely, so that related terms can sit together.)

Figure 81: Frequency data for ‘men’
Figure 82: Frequency data for ‘his’
Figure 83: Frequency data for ‘himself’
Figure 84: Frequency data for ‘shall’
Figure 85: Frequency data for ‘which’
Figure 86: Frequency data for ‘upon’
Figure 87: Frequency data for ‘would’
Figure 88: Frequency data for ‘quite’
Figure 89: Frequency data for ‘really’
Figure 90: Frequency data for ‘argue’
Figure 91: Frequency data for ‘account’
Figure 92: Frequency data for ‘accounts’
Figure 93: Frequency data for ‘robust’
Figure 94: Frequency data for ‘challenge’
Figure 95: Frequency data for ‘problematic’
Figure 96: Frequency data for ‘response’
Figure 97: Frequency data for ‘target’
Figure 98: Frequency data for ‘worry’
Figure 99: Frequency data for ‘she’