There’s a lot of buzz around bayesian data analysis (BDA) in psychology blogs, social media, and journal articles. For instance, in 2015 the APS Observer ran three columns dedicated to BDA in consecutive issues of the journal (Gallistel, 2015a, b, & c), and browsing the latest issues of Psychonomic Bulletin and Review gives an impression of increased interest in the topic.
Bayesian data analysis is more than bayes factors
However, it appears that there is an imbalance in what many beginning bayesian data analysts think about BDA. From casual observation and discussions, I’ve noticed a tendency for people to equate bayesian methods with computing bayes factors; that is, testing (usually null) hypotheses using bayesian model comparison.
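For readers who haven't met one, here is a minimal sketch of what computing a bayes factor involves, using a deliberately simple case that is not from any of the papers cited here: k successes in n trials, comparing a point null (theta = 0.5) against an alternative with a uniform prior on theta. The function name, the uniform prior, and the example data are all my own assumptions for illustration.

```python
from math import comb


def bayes_factor_10(k: int, n: int, theta0: float = 0.5) -> float:
    """BF10 for k successes in n trials.

    H0: theta is fixed at theta0 (point null).
    H1: theta ~ Uniform(0, 1) (an assumed prior, chosen for simplicity).
    """
    # Marginal likelihood under H0: the binomial probability at theta0.
    m0 = comb(n, k) * theta0**k * (1 - theta0) ** (n - k)
    # Marginal likelihood under H1: the binomial probability averaged over
    # the uniform prior, which integrates to exactly 1 / (n + 1).
    m1 = 1.0 / (n + 1)
    return m1 / m0


# Hypothetical data: 15 successes in 20 trials.
bf = bayes_factor_10(k=15, n=20)
print(f"BF10 = {bf:.2f}")
```

The result is a single number comparing how well two models predicted the data, and it depends directly on the prior chosen for the alternative. That is model comparison, which is useful, but it says nothing yet about how large the effect is.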
I don’t have good data on people’s impressions of what BDA is, but here’s another anecdote. At a recent conference on Bayesian statistics, Mark Andrews summarized his experiences teaching BDA workshops for social scientists. The talk was quite interesting, and what particularly piqued my curiosity was his comment that for many, if not most, workshop participants, bayesian data analysis meant hypothesis testing with bayes factors (at 20 minutes in the linked video). (As an aside, he also noted that Stan has now superseded JAGS and BUGS as the preferred choice for a probabilistic modeling language. Go Stan!)
This imbalance or conflation of bayes factors and bayesian data analysis (if it is real and not merely bias in my observations!) is quite disappointing because BDA is a vast field of awesome methods, and bayes factors (BF) are only one thing that you can do with it [1]. In fact, many textbooks on BDA mention BFs only in footnotes (Gelman et al., 2013; McElreath, 2016). I’ve also written about BDA on my blog about a half-dozen times, but only once about bayes factors (Vuorre, 2016).
Further, it is really only this one method that people bicker over on social media: the bayes vs. frequentism argument usually turns into a p-values vs. bayes factors argument. The tedium of this argument (there really aren’t good reasons to prefer p-values) may even give the impression that BDA is tedious and limited to model comparison and hypothesis testing problems. It isn’t! Bayes has benefits over frequentism that reach far beyond the p-vs-bf issue.
Here’s a practical example: It is well known that estimating generalized linear mixed models is kind of difficult. In fact, maximum likelihood methods routinely fail, especially when data are sparse or parameters are plentiful (you’ve heard of multilevel models not converging, right? That’s the issue). However, bayesian methods (via MCMC, for example) usually have no problem estimating these models in situations where maximum likelihood fails! This benefit of bayes over frequentism (only the first thing to come to mind) doesn’t usually appear in the tedious p-vs-bf arguments, although one could argue that its practical implications are greater.
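A full multilevel example would take more space than this post has, so here is a much smaller analog of the same problem, offered as a sketch and not as a mixed model: a proportion estimated from sparse data with zero observed successes. The maximum likelihood estimate lands on the boundary of the parameter space and its Wald standard error collapses to zero, while the posterior under a uniform prior (my assumption, as are the function names and the data) stays well behaved. Both results have closed forms, so no MCMC is needed for this toy case.

```python
from math import sqrt


def wald_interval(k: int, n: int, z: float = 1.96) -> tuple[float, float]:
    """Maximum likelihood estimate k/n with a Wald (normal approximation) interval."""
    p_hat = k / n
    se = sqrt(p_hat * (1 - p_hat) / n)  # collapses to 0 when k == 0 or k == n
    return (p_hat - z * se, p_hat + z * se)


def posterior_summary_zero_successes(n: int, level: float = 0.95) -> tuple[float, float]:
    """Posterior mean and upper credible bound for theta after 0 successes in n trials.

    Assumes a Uniform(0, 1) prior, so the posterior is Beta(1, n + 1),
    whose CDF is 1 - (1 - x) ** (n + 1) and can be inverted by hand.
    """
    mean = 1 / (n + 2)
    upper = 1 - (1 - level) ** (1 / (n + 1))
    return mean, upper


# Hypothetical sparse data: 0 successes in 10 trials.
wald = wald_interval(k=0, n=10)
post_mean, post_upper = posterior_summary_zero_successes(n=10)
print("ML estimate with Wald interval:", wald)
print(f"Posterior mean: {post_mean:.3f}, 95% upper bound: {post_upper:.3f}")
```

The degenerate Wald interval claims certainty that the data cannot support, whereas the posterior still expresses sensible uncertainty. Variance parameters in multilevel models can get stuck on a boundary in a loosely similar way, which is one reason bayesian estimation tends to be more robust there.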
Reasons for thinking that BDA is BF
I suspect that one factor contributing to the apparent conflation of BDA and BFs is that there are vocal groups of psychological scientists doing interesting and important work promoting the use of bayes factors for hypothesis testing, and bayesian methods more generally (e.g., Dienes, 2015; Ly, Verhagen, & Wagenmakers, 2015; Morey, Romeijn, & Rouder, 2016; Rouder, Morey, Verhagen, Province, & Wagenmakers, 2016). A quick reading of some of these texts may give the (false) impression that bayes factors are a larger portion of BDA than they actually are. I think the same goes for the APS Observer columns mentioned above. To be completely clear, these papers neither claim nor imply that BDA is BF. I’m merely pointing out that in the psychological literature, (important!) papers about bayesian methods that focus on BF vastly outnumber papers that discuss other features of BDA.
But the real root cause of this conflation is probably the fetish-like desire for hypothesis tests in psychological science.
A modest proposal
If now is the time to move toward a new statistical paradigm in psychology, we could take the opportunity to emphasize not only bayesian hypothesis testing, but also the importance of modeling, estimation, and bayesian methods more generally. As Dr. Andrews noted, BDA could instead be thought of as flexible probabilistic modeling.
[1] To be sure, I think a bayes factor can be a truly great tool when the competing models are well specified. I haven’t yet implemented a bayes factor method for my bayesian multilevel mediation package (Vuorre & Bolger, 2017), but might include one in the future. One difficulty here is specifying the competing models in a meaningful way.