This blog was published recently showing that the use of R continues to grow in academia. One of the graphs (Figure 1) showed citations (using Google Scholar) of different statistical packages in academic papers.
At face value, this graph implies a very rapid decline in SPSS use since 2005. I sent a tongue-in-cheek tweet about this graph, and this perhaps got interpreted as meaning that I thought SPSS use was on the decline. So, I thought I’d write this blog. The thing about this graph is that it deals with citations in academic papers. The majority of people do not cite the package they use to analyse their data, so this might just reflect a decline in people stating that they used SPSS in papers. Also, it might be that users of software such as R are becoming more inclined to cite the package to encourage others to use it (stats package preference does, for some people, mimic the kind of religious fervor that causes untold war and misery. Most packages have their pros and cons and some people should get a grip). Also, looking at my annotations on Figure 1, you can see that the decline in SPSS is in no way matched by an upsurge in the use of R/Stata/Systat. This gap implies some mysterious ghost package that everyone is suddenly using but is not included on this graph. Or perhaps people are just ditching SPSS for qualitative analysis or doing it by hand.
If you really want to look at the decline/increase of package use then there are other metrics you could use. This article details lots of them. For example, you could look at how much people talk about packages online (Figure 2).
Based on this, R seems very popular and SPSS less so. However, you can’t really compare R and SPSS here because R is more difficult to use than SPSS (I doubt that this is simply my opinion; I reckon you could demonstrate empirically that the average user prefers the SPSS GUI to R’s command interface if you could be bothered). People are, therefore, more likely to seek help on discussion groups for R than they are for SPSS. It’s perhaps not an index of popularity so much as usability.
There are various other interesting metrics discussed in the aforementioned article. Perhaps the closest we can get to an answer about package popularity (but not decline in use) is survey data on what tools people use for data mining. Figure 3 shows that people most frequently report R, SPSS and SAS. Of course, this is a snapshot and doesn’t tell us about usage change. However, it shows that SPSS is still up there. I’m not sure what types of people were surveyed for this figure, but I suspect it was professional statisticians/business analysts rather than academics (who would probably not describe their main purpose as data mining). This would also explain the popularity of R, which is very popular amongst people who crunch numbers for a living.
To look at the decline or not of SPSS in academia, what we really need is data about campus licenses over the past few years. There were mumblings about universities switching from SPSS after IBM took over and botched the campus agreement, but I’m not sure how real those rumours were. In any case, the teething problems from the IBM takeover seem to be over (at least most people have stopped moaning about them). Of course, we can’t get data on campus licenses because it’s sensitive data that IBM would be silly to put in the public domain. I strongly suspect campus agreements have not declined though. If they have, IBM will be doing all that they can (and they are an enormously successful company) to restore them because campus agreements are a huge part of SPSS’s business.
Citation
@online{field2012,
author = {Field, Andy},
title = {SPSS Is Not Dead},
date = {2012-07-20},
url = {https://profandyfield.com/posts/2012_07_20_spss/},
langid = {en}
}