In this concluding post I’ll briefly summarise what I have argued so far and then outline three pressing research questions moving forward.
My aim in the thesis as a whole has been to outline and defend a theory of some important aspects of mental representation based on the idea of idealised models in the brain that facilitate a kind of flexible predictive capacity.
I organised this theory around three insights that I extracted from the work of Kenneth Craik in Chapter 3.
The first is that our capacity to represent the world in perception, imagination, and (some aspects of) thought should be understood in terms of the representational paradigm of models. Models are representations that share an abstract structure with their targets. They are idealised structural surrogates for target domains. I argued that these models should be understood in terms of the concept of deep causal generative models from computational neuroscience and machine learning: that is, models that capture the causal-statistical structure of those domains responsible for generating the data to which the brain is exposed.
The second is that the core function of these models—the chief thing that they enable us to do that explains their adaptive and constructive significance—is prediction: predicting what sensory information we’re likely to receive, what’s likely to happen next, the likely outcomes of our possible interventions on the world, and so on. I related this idea to theories such as predictive processing in cognitive neuroscience, according to which cognitive processes are organised around a general imperative to minimize prediction error, as well as a more general conception of the neocortex as a general-purpose learning device operating according to generative model-based predictive processing.
Finally, I argued that this capacity for predictive modelling should ultimately be situated in a broadly cybernetic understanding of brain function, in which we model the world from the perspective of our contingent physiological needs as particular kinds of creatures. That is, the flexible predictive capacity facilitated by deep generative models in animals like us is subservient to a deeper imperative to regulate our internal environments—to keep us within that window of viability conducive to optimal functioning. As such, our mental models are selective and idealised, tracking those aspects of the world’s causal and statistical structure as it relates and matters to us.
I am acutely aware of just how schematic this “theory” is, how many open questions remain, and how much it owes to the brilliant work of others. If I have an excuse, it’s that this is just a PhD thesis, and I passionately subscribe to the view that the best kind of doctoral thesis is a finished doctoral thesis, not a perfect but imaginary ground-breaking magnum opus of the sort that I hoped I would write when I first started a PhD.
In any case, I’ll conclude by raising three important research questions to address in future work. These are not supposed to be exhaustive. They are just questions that interest me.
- How Many Models?
First, then, I’ve argued that our capacity to represent the world in perception, imagination, and some aspects of thought is grounded in deep causal generative models. This raises the question: how many models does each of us command?
One can imagine a continuum of answers here, ranging from the view that Steven Horst calls “cognitive pluralism,” according to which we navigate the world with a large number of distinct, largely compartmentalized models of different domains, to the other extreme, according to which we each construct and exploit just one extremely complex but nevertheless unified model of reality.
Proponents of predictive processing typically opt for the latter of these extremes. Andy Clark, for example, declares that “there is always a single hierarchical generative model in play,” and Micah Allen and Karl Friston contend that “the entire neuronal architecture within the brain becomes one forward or generative model with a deep hierarchical architecture.”
Of course, reference to a “single” model here could just be a reference to all the information an animal commands about its world, no matter how this information is organised. Nevertheless, proponents of predictive processing seem to have something more substantive in mind.
First, it is widely held in the literature that it is meaningful to talk of the representational hierarchy. At the lowest levels of this hierarchy are predictions of spatiotemporally fine-grained phenomena across the different modalities. As you go up the hierarchy, you reach representations that are increasingly abstract, multimodal, and amodal, capturing regularities at larger spatiotemporal scales.
I have argued elsewhere that the appeal to a single hierarchy of this kind is difficult to make sense of:
- What’s at the top of the hierarchy?
- By what principle is the hierarchy organised? If it’s by the contents of representations at different levels, as it seems it must be if it’s a hierarchical generative model where higher levels estimate features of the world responsible for generating regularities registered at the level below, it runs into the problem that we can form highly abstract “high-level” thoughts about the same phenomena represented at the lowest levels. Indeed, human thought is domain general: we can think and reason about anything.
Second, another way of making sense of the commitment to a single generative model is that the information we command about the world is in some important sense unified. For example, although we might represent different domains, these representations are nevertheless not “informationally encapsulated,” insofar as all the information that we command is in principle relevant to all the other information that we command.
Again, though, I’m not sure that’s right.
It seems we navigate the world with multiple models of different content domains, an idea that goes back in many ways to Marvin Minsky. Consider, for example, the influential idea that commonsense reasoning is organised around intuitive theories of domains such as physics and psychology, which, as some authors recently put it, “cleave cognition at its conceptual joints,” such that each domain is organised by a different “set of entities and abstract principles relating the entities.”
Further, it is not obvious that the world at the spatiotemporal scale that we represent it has a particularly unified structure. In the special sciences, for example, we do not just build one unified model of reality. We triangulate reality with multiple models, each of which is partial, some of which are incommensurable and some of which actively contradict each other. It wouldn’t surprise me if mental modelling were similar.
Ultimately, I’m not sure what to say about any of these issues. They rest on complex questions concerning modularity, cognitive architecture, and how to individuate models. (One of the nice things about concluding chapters is that you’re allowed to confess to conspicuous limits of your knowledge, or at least I hope you are.) Nevertheless, I think that these are important questions for future research.
- Explanatory Scope
I have tried to stress throughout this thesis that the theory of mental representation I have defended does not capture all aspects of mental representation. One can think that the mind is a predictive modelling engine without thinking it is only a predictive modelling engine.
I am also aware that without some more precise account of the limitations of this theory, it can seem like something of a moving target. Whenever someone points to some phenomenon that it can’t accommodate, I can always just point out that it wasn’t intended to explain everything.
Nevertheless, I don’t really have a precise account of its limitations. Instead, here are three issues that I think pose problems for the account.
First, I haven’t talked much about “higher cognition,” except to gesture towards the applicability of causal generative models across different domains and the capacity for such models to be run “offline.” Nevertheless, it is not clear that this is adequate to explain some of the more rarefied aspects of human thought, reasoning, deliberation, planning, and so on.
To take just one example, I have argued elsewhere that probabilistic graphical models (see post 6) are unable to explain the representational capacities exhibited in human thought. Roughly, any representation in which variables play specific roles within a network structure and world states are represented by a vector of variable values has quite specific expressive limitations. For example, such networks can be used to build any number of specific models of the relations between particular diseases and symptoms, but none of those models can capture the general principle that the arrow of causality always runs from diseases to symptoms.
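As a rough illustration of this limitation, here is a minimal Python sketch. Everything in it is invented for the example: the variables (`flu`, `fever`, `cough`), the probabilities, and the helper names. The network can answer questions about these three variables, but the general principle that causes run from diseases to symptoms has to be stated as a separate check over the graph, since no variable in the network expresses it.

```python
# Hypothetical toy network: flu -> fever, flu -> cough.
# All numbers are made up purely for illustration.
P_FLU = 0.1
P_SYMPTOM_GIVEN_FLU = {
    "fever": {True: 0.8, False: 0.05},
    "cough": {True: 0.7, False: 0.10},
}


def joint(flu, fever, cough):
    """Probability of one full world state, i.e. one vector of variable values."""
    p = P_FLU if flu else 1 - P_FLU
    for name, value in (("fever", fever), ("cough", cough)):
        p_true = P_SYMPTOM_GIVEN_FLU[name][flu]
        p *= p_true if value else 1 - p_true
    return p


def posterior_flu(fever, cough):
    """P(flu | fever, cough), computed by enumerating both values of flu."""
    with_flu = joint(True, fever, cough)
    without_flu = joint(False, fever, cough)
    return with_flu / (with_flu + without_flu)


# The general principle is a fact ABOUT networks like this one, so it has to
# be written outside the network, as a check over its edges and node types.
EDGES = [("flu", "fever"), ("flu", "cough")]
KIND = {"flu": "disease", "fever": "symptom", "cough": "symptom"}


def respects_direction(edges, kind):
    """True if every edge runs from a disease node to a symptom node."""
    return all(kind[a] == "disease" and kind[b] == "symptom" for a, b in edges)
```

The point of the sketch is only that `posterior_flu` lives inside the model, whereas `respects_direction` is a constraint on the space of possible models.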
Further, such networks lack the radical compositionality at the core of human thought. If I say, “The Pope is in a pink dress playing the guitar whilst thinking about donuts,” you instantly understand it, despite the fact that nobody has ever uttered or understood this sentence before. The explanation? Thought is compositional. (It relies on combining a finite set of representational primitives to yield an open-ended number of complex representations.) It is not easy to make sense of this radical compositionality in the context of graphical models, however.
To get around this, some theorists have argued that the mind’s generative models are better understood in terms of probabilistic programs, which are more expressive than graphical models. I bet they’re right, but it’s a complex issue for future work.
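To give a flavour of the extra expressive power, here is a toy sketch in plain Python. It is not written in a real probabilistic programming language, and the function name, parameters, and numbers are all assumptions made for illustration. The idea is that a program can generate a whole family of networks, for any lists of diseases and symptoms, while building in the general rule that edges only ever run from a disease to a symptom.

```python
import random


def sample_network(diseases, symptoms, edge_prob=0.3, rng=None):
    """Sample one disease-to-symptom network (hypothetical toy generator).

    Returns a dict mapping (disease, symptom) edges to a made-up causal
    strength. The loop structure guarantees the general principle: an edge
    can only ever be created from a disease to a symptom.
    """
    rng = rng or random.Random()
    edges = {}
    for disease in diseases:
        for symptom in symptoms:
            if rng.random() < edge_prob:
                edges[(disease, symptom)] = rng.uniform(0.1, 0.9)
    return edges


# Usage: the same program covers any domain size, not one fixed set of variables.
network = sample_network(
    ["flu", "cold"], ["fever", "cough", "rash"], rng=random.Random(0)
)
```

Each call yields a different specific network, but the abstract constraint is stated once, in the program itself, rather than having to be rediscovered in every individual model.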
Second, and relatedly, we can direct our thought towards abstract phenomena such as numbers, theorems, functions, and so on, which don’t play any role within the world’s causal-statistical structure. This obviously poses a problem for the view that mental representation should be understood in terms of causal generative models.
Finally, there are aspects of memory that I haven’t addressed here. For example, we can remember relatively discrete pieces of information (e.g. someone’s phone number) and recall specific experiences (e.g. where you were when you heard the news that Trump was now president). It isn’t straightforward to integrate such phenomena into the account of mental representation that I have defended, however.
Again, this list is not exhaustive, and I don’t want to say that these issues present insoluble problems for a generative model-based theory of mental representation. They are just phenomena that warrant greater scrutiny in future work.
- Public Representational Systems
My aim in this thesis has been to describe important aspects of mental representation that we share with other mammals (and perhaps some non-mammalian species with homologues of the neocortex). As such, my focus has been on pre- or sublinguistic mental representation: the capacities for mental representation that we share with many animals that lack a flexible combinatorial symbolic language of the sort that humans command.
Nevertheless, it could hardly be denied that language plays an extremely important—and plausibly unique—role within human cognition, including the representational capacities found in human thought. Given this, how does the causal generative modelling outlined in this thesis interact with the role of public representational technologies such as natural language in human cognition?
Given that it’s the end of the thesis—a place that seems fitting for wild over-the-top speculation—it is tempting (to me at least) to pursue something like the following view.
Perhaps the generative modelling described in this thesis captures the basic kind of mental representation that we share with other mammals. Nevertheless, through some important set of neural and social changes—the relative enlargement of our neocortex (especially frontal cortex), large-scale dynamic social coordination, and the much greater capacity and inclination for cultural learning and thus cumulative cultural evolution—we alone have acquired the ability to flexibly exploit the highly combinatorial symbols of language and the enormous body of culturally acquired information that any such language embodies.
As such, one might view the emergence of language as a magic bullet in this context, providing a way of addressing many of the challenges raised above.
For example, perhaps the domain generality of conceptual thought is less a reflection of the fundamental kind of mental representation inside our brains than an effect of the idiosyncratic contribution of natural language in human cognition with its arbitrary symbols and acquired cultural knowledge.
Likewise, perhaps the radical compositionality of human thought is an artefact of the grammatical structure of natural language, not the basic kind of mental representation realised in the neocortex.
Finally, perhaps the specific kind of thinking identified by folk psychology is really just a matter of silently talking to ourselves (“in the head”) in a learned natural language rather than a mental language of the sort proposed by Fodor.
In other words, perhaps natural language is a remarkable cognitive technology that both augments and transforms human cognition in radical ways but also obscures the basic kind of mental representation that we share with other mammals.
Perhaps. These proposals are as interesting as they are speculative, and might raise more questions than they answer.
Nevertheless, exploring how the deep generative models inside our brains are augmented, transformed, and extended in the unique human context of combinatorial symbol systems, large-scale dynamic social practices of argument and evaluation, and cumulative cultural evolution is, I think, one of the most exciting projects for future research.
That’s that, then. Thank you to anybody who has actually read these things. I am very grateful. Now for the viva…