In a recent article and then blog post I put forward a challenge to highly influential (e.g. here and here) hierarchical Bayesian models of psychosis in computational psychiatry. Phil Corlett—one of the most prominent champions of such models—has offered a compelling response to this challenge. What follows is partly an elaboration and defence of the claims made within those earlier pieces and partly a response to Phil’s response.
My challenge targeted the two core aspects of hierarchical Bayesian models: their hierarchical component and their Bayesian component. I don’t think that proponents of predictive coding, Phil included, have adequately dealt with the first of these challenges, but here I will restrict my attention to the second issue.
I will be developing the ideas in this blog piece at much greater length in work I am doing in collaboration with Stephen Gadsby (who can be found here). Here I will just briefly set out how the terrain looks to me and explain why I am not convinced by Phil’s response to this part of my paper.
Is Belief Fixation Bayesian?
“In the normally functioning brain information from different sources is combined in a statistically optimal manner” (Karl Friston and Chris Frith).
“It has been said that man is a rational animal. All my life I have been searching for evidence which could support this” (Bertrand Russell).
You can’t throw a rock these days without hitting the Reverend Thomas Bayes. This ubiquity arises from two sources that seem—superficially, at least—to be in tension with one another.
The first source of Bayes mania comes from what might be called “Bayes advocacy.” Bayes advocacy is the endless stream of popular books and arguments prescribing Bayesian inference as the ideally rational procedure for inference under uncertainty. In this context, Bayesian updating is how we would form and evaluate beliefs if only we weren’t the stupid biased tribal primates that we happen to be. Conformity to Bayes’ theorem describes how science should be done, how predictions should be generated, how we should change our minds in response to new evidence, with the “should” here a stark reminder of the fact that we generally do not conform to this ideal.
The second source of Bayes mania comes from the “Bayesian revolution” in recent cognitive science. In this context, Bayesian inference describes how we do in fact process information. In recent decades, people have drawn on Bayesian statistics and decision theory to explain—well, just about everything: learning, perception, sensorimotor control, planning, memory, language processing, and on and on. This revolution has been embraced by an explosion of recent work in computational psychiatry offering Bayesian models of addiction, depression, anxiety, autism spectrum disorder, borderline personality disorder—and, of course, psychosis, which constitutes the target of my article.
What is going on here? One response is that these Bayesian models in cognitive science typically describe sub-personal information-processing mechanisms that are in fact plausible candidates for approximately optimal inference and decision-making procedures. Our perceptual systems, for example, really do seem to be spectacularly good at overcoming the noise and uncertainty in proximal sensory inputs to reveal the structured world responsible for generating them.
As I point out in my paper, however, this story will not work for the case of belief fixation. Human beliefs seem to be cooked in a stew of bias, suboptimality, motivated reasoning, self-deception, denial, social signalling, and so on, all of which seem (again, superficially at least) to systematically bias them away from optimal inference. This creates at least a prima facie problem for Bayesian models of delusion: if belief fixation is not Bayesian, we should not seek to explain delusions in terms of dysfunctions in a process of Bayesian inference.
Before I turn to Phil’s response to this argument, it is worth clarifying what the argument is and clearing away some potential confusions.
First, I am not saying that delusional beliefs themselves pose a problem for the optimality assumptions in Bayesian models. According to the models I criticise, such beliefs arise from dysfunctions in a process of Bayesian inference, so it should hardly be surprising that they seem to deviate from inferential norms. My argument is that there is little reason to think that belief fixation in general—in the “normally functioning brain”—is Bayesian. If that’s right, we obviously shouldn’t try to explain delusional beliefs in terms of dysfunctions in Bayesian inference.
Second, I was pretty careful to frame my paper so that the argument is not that there is direct evidence against a Bayesian view of belief fixation, but rather that there is little persuasive evidence or reason to endorse such a view (see below). I put forward several arguments in defence of this claim:
- Most of the evidence for the Bayesian brain exists in the domain of sensorimotor processing, broadly construed, not belief fixation or higher cognition.
- There are systematic features of human belief fixation that one would not expect if it were approximately Bayes’ optimal. I highlight three in the paper: confirmation bias, motivated reasoning, and the backfire effect.
- Optimality assumptions in cognitive science in general are dubious. Evolution builds kluges, not optimisers.
- Beliefs—and especially the kinds of beliefs that we spend time broadcasting to others—have social functions as well as inferential functions. That is, many of our beliefs are designed for social consumption. Bayesian theories as currently formulated do not engage this important feature of belief fixation.
In any case, Phil Corlett does not think that there is a problem here. In fact, in my experience talking to neuroscientists and psychiatrists who are committed to the Bayesian brain hypothesis, the very idea that there is a challenge here at all is scoffed at, as though the worry rests on a simple misunderstanding of Bayes’ theorem. Indeed, an anonymous reviewer for my paper issued the following response to my argument:
“The notion of non-Bayesian belief fixation is itself a category error. Indeed, mathematically, the complete class theorem precludes any decision from being non-Bayesian.”
That is, not only is human cognition in fact Bayesian; it is a conceptual truth that it is Bayesian!
I will return to this claim shortly.
Phil himself offers two responses to my challenge.
The first is that the hierarchical Bayesian models he defends involve *approximate* Bayesian inference: “none of these models demand optimally Bayesian inference. As Dan says, they involve ‘(approximate) Bayesian inference,’ which ‘explicitly allow[s] for deviations from optimality.’”
The second response is to point towards Bayesian models of the various psychological phenomena—motivated reasoning, bias, and so on—that I claimed constitute problems for Bayesian belief fixation.
Before I explain why I am not convinced by either of these responses, let me unpack them.
First, then, exact Bayesian inference is typically computationally intractable. As such, the brain, insofar as it is a Bayesian inference machine, has to rely on approximation algorithms. The most common such algorithms involve either sampling or variational methods. Predictive coding involves the latter. Importantly, these methods will systematically deviate from exact Bayesian inference in reliable ways. (That is why they are *approximate*, not exact, methods.) As such, pointing to deviations from exact Bayesian inference is irrelevant: nobody ever claimed that the brain does exact Bayesian inference anyway.
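To make “approximate” concrete, here is a minimal sketch of one simple sampling scheme (likelihood weighting) for a single yes/no hypothesis. It is an illustration only: predictive coding relies on variational methods rather than this scheme, and the function names and numbers are assumptions of mine, not taken from any of the models under discussion. The point is just that the approximation tracks the exact posterior when resources (here, samples) are plentiful and can drift from it when they are scarce.

```python
import random


def exact_posterior(prior, lik_h, lik_not_h):
    """Exact P(H | E) from Bayes' theorem for a binary hypothesis H."""
    numerator = prior * lik_h
    return numerator / (numerator + (1.0 - prior) * lik_not_h)


def sampled_posterior(prior, lik_h, lik_not_h, n_samples, seed=0):
    """Approximate P(H | E) by likelihood weighting.

    Hypotheses are drawn from the prior, and each draw is weighted by how
    well it predicts the evidence. Both likelihoods are assumed to be > 0,
    so the total weight can never be zero.
    """
    rng = random.Random(seed)  # seeded so the sketch is reproducible
    weight_h = 0.0
    weight_total = 0.0
    for _ in range(n_samples):
        h_is_true = rng.random() < prior
        weight = lik_h if h_is_true else lik_not_h
        weight_total += weight
        if h_is_true:
            weight_h += weight
    return weight_h / weight_total


# Hypothetical numbers: P(H) = 0.3, P(E | H) = 0.8, P(E | not-H) = 0.2
exact = exact_posterior(0.3, 0.8, 0.2)
rough = sampled_posterior(0.3, 0.8, 0.2, n_samples=5)
fine = sampled_posterior(0.3, 0.8, 0.2, n_samples=100_000)
```

With only a handful of samples the estimate can land noticeably away from the exact value; with many it converges on it. Whether deviations of this resource-limited kind resemble the deviations seen in human belief is exactly what is at issue.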
I am not convinced by this response for the following reason: approximate Bayesian inference is still approximate *Bayesian inference*. The whole point of such algorithms is to approximate the optimal inferential profile found in exact Bayesian inference. Although there is some interesting work showing how the specific character of certain approximation algorithms results in some of the deviations from optimality that we observe in human cognition, merely pointing to the approximate character of the inference procedures is not sufficient to address the challenge. In what specific way does the variational method used in predictive coding predict phenomena such as self-deception, denial, social signalling, motivated reasoning, and so on?
This brings us to Phil’s second response: namely, that there are in fact Bayesian models of such psychological phenomena. I will not wade into the specifics of such models here, except to say that I find most such attempts very unconvincing. For example, many Bayesians seem to conflate being reluctant to revise beliefs that we have strong confidence in with phenomena such as confirmation bias and motivated reasoning. These things bear only a superficial resemblance to each other, however. Specifically, most Bayesian models neglect the fact that we are often highly emotionally—and tribally—invested in our beliefs. When a passionate argument breaks out, it doesn’t typically arise from a clash of priors; it arises from a clash of identities.
Here, however, I want to address a deeper point, which is this: one can model anything—any psychological, behavioural, or inferential phenomenon—in terms of Bayesian inference. This is the point noted by the anonymous referee quoted above. As that reviewer put it,
“No matter how odd or untypical a decision, choice or inference, there is some set of prior beliefs that renders it Bayes optimal.”
The correct lesson to draw from this, however, is not the one that the reviewer apparently draws, namely that every decision, choice or inference is therefore Bayesian. Instead, the take-away is this: the mere fact that one can provide a Bayesian model of some phenomenon X provides no evidence at all in favour of that model. This is the worry that the philosopher Clark Glymour expresses when he writes, “I know of no Bayesian psychological prediction more precise than ‘We can model it.’”
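The reviewer’s observation is easy to see in the simplest case of two competing hypotheses. The sketch below is my own illustration, with hypothetical function names and numbers. By Bayes’ theorem in odds form, posterior odds equal prior odds multiplied by the likelihood ratio, so for any belief strictly between 0 and 1 one can always solve backwards for a prior that makes it the “correct” Bayesian conclusion.

```python
def bayes_update(prior, lik_h, lik_not_h):
    """Exact P(H | E) for a binary hypothesis H."""
    numerator = prior * lik_h
    return numerator / (numerator + (1.0 - prior) * lik_not_h)


def prior_that_rationalises(posterior, lik_h, lik_not_h):
    """Return the prior P(H) under which Bayes' theorem yields `posterior`.

    posterior -- the belief P(H | E) to be "explained" (strictly between 0 and 1)
    lik_h     -- P(E | H), assumed > 0
    lik_not_h -- P(E | not-H), assumed > 0
    """
    if not 0.0 < posterior < 1.0:
        raise ValueError("posterior must lie strictly between 0 and 1")
    if lik_h <= 0.0 or lik_not_h <= 0.0:
        raise ValueError("likelihoods must be positive")
    posterior_odds = posterior / (1.0 - posterior)
    likelihood_ratio = lik_h / lik_not_h
    prior_odds = posterior_odds / likelihood_ratio  # odds form of Bayes' theorem
    return prior_odds / (1.0 + prior_odds)


# Hypothetical case: the evidence is ten times likelier if H is false,
# yet the agent ends up 90% confident that H is true.
needed_prior = prior_that_rationalises(0.9, lik_h=0.05, lik_not_h=0.5)
recovered = bayes_update(needed_prior, lik_h=0.05, lik_not_h=0.5)
```

Here `needed_prior` comes out at 90/91, roughly 0.989, and feeding it back through `bayes_update` recovers the 0.9 posterior. The fit is guaranteed by construction, which is why the existence of such a fit tells us nothing on its own.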
Given this, the relevant question is this: when are Bayesian models appropriate and useful—when are they genuinely enlightening—and when do they simply provide post hoc just-so stories?
It is here where I think the issues of optimality and rationality become relevant. Because Bayes’ theorem describes the optimal method for belief updating under conditions of uncertainty, evidence of optimality in a cognitive domain provides prima facie evidence for a Bayesian model of that domain. In fact, this is exactly how proponents of the Bayesian brain hypothesis argue. The quote from Friston and Frith above, for example, reads like this in full:
“In the normally functioning brain information from different sources is combined in a statistically optimal manner. The mechanism for achieving this is well captured in a Bayesian framework.”
Likewise, Knill and Pouget’s classic paper on the Bayesian brain hypothesis supports the hypothesis by reference to “a growing body of evidence that human perceptual computations are ‘Bayes’ optimal’.” The evidence that “human observers behave as optimal Bayesian observers” is claimed to have “fundamental implications for neuroscience.” They claim that “the most persuasive evidence for the Bayesian coding hypothesis comes from work on sensory cue integration,” where humans have been shown to optimally weigh information from different modalities in a form predicted by Bayesian models. This is also the evidence appealed to by Friston and Frith in defence of their claim quoted above.
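For readers unfamiliar with the cue-integration results, the prediction at stake can be sketched in a few lines. This is the standard textbook case of two independent Gaussian cues with a flat prior, not code from any of the papers mentioned; the function name and the example numbers are assumptions chosen for illustration. The Bayes-optimal estimate weights each cue by its reliability (inverse variance), and the combined estimate is more reliable than either cue alone.

```python
def combine_cues(mu_a, var_a, mu_b, var_b):
    """Reliability-weighted combination of two independent Gaussian cues.

    Each cue is summarised by its estimate (mu) and its noise variance (var).
    Returns the combined estimate and its variance, assuming a flat prior.
    """
    if var_a <= 0.0 or var_b <= 0.0:
        raise ValueError("variances must be positive")
    rel_a = 1.0 / var_a  # reliability = inverse variance
    rel_b = 1.0 / var_b
    weight_a = rel_a / (rel_a + rel_b)
    combined_mean = weight_a * mu_a + (1.0 - weight_a) * mu_b
    combined_var = 1.0 / (rel_a + rel_b)
    return combined_mean, combined_var


# Hypothetical example: a sharp visual cue (10.0, variance 1.0)
# and a noisier haptic cue (12.0, variance 4.0).
mean, var = combine_cues(10.0, 1.0, 12.0, 4.0)
```

In this example the combined estimate is 10.4, pulled towards the more reliable visual cue, with variance 0.8, lower than that of either cue. The empirical claim in the cue-integration literature is that human judgements shift and sharpen in roughly this way as cue noise is manipulated.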
You can’t have it both ways, however. If evidence of approximately optimal inference is taken as evidence for the Bayesian brain, then evidence of systematic apparent deviations from optimality should at least be taken as prima facie evidence against it. (If observing E raises the probability of H, observing not-E must lower it.)
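The parenthetical claim is a consequence of the law of total probability. Writing ¬E for the failure to observe E, and assuming only that E is genuinely uncertain beforehand (0 < P(E) < 1):

```latex
\begin{align*}
P(H) &= P(H \mid E)\,P(E) + P(H \mid \neg E)\,\bigl(1 - P(E)\bigr) \\[4pt]
\bigl(P(H \mid \neg E) - P(H)\bigr)\bigl(1 - P(E)\bigr)
     &= -\,P(E)\,\bigl(P(H \mid E) - P(H)\bigr) \\[4pt]
P(H \mid \neg E) - P(H)
     &= -\,\frac{P(E)}{1 - P(E)}\,\bigl(P(H \mid E) - P(H)\bigr)
\end{align*}
```

The second line follows by subtracting P(H)(1 − P(E)) from both sides of the first and rearranging. Since P(E)/(1 − P(E)) is positive, the two differences always have opposite signs: if P(H | E) > P(H), then P(H | ¬E) < P(H).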
This is the sense in which phenomena like confirmation bias, motivated reasoning, the backfire effect, self-deception, denial, cognitive biases, and so on, are relevant to Bayesian theories of human belief fixation. Even if one can produce Bayesian models of such phenomena, they are highly unexpected on the assumption that belief fixation is Bayesian.
That’s how things look to me, at least.
One final thing: I increasingly think that many of the points above are in a sense orthogonal to the hierarchical Bayesian models of conditions such as psychosis championed by Phil and others. Specifically, I think that many of the attractive features of such models for which (as Phil points out) there is some compelling evidence—namely, differential reliance on sensory evidence and prior expectations, differential responsiveness to violated expectations, and so on—can be preserved even when one takes them out of the context of a commitment to predictive processing as a Grand Unified Theory of Everything.
Anyway, that’s a topic for another day.