In the previous post I outlined research that offers some tentative support to a conception of the neocortex as a predictive modelling engine, constructing hierarchical generative models of those features of the world responsible for generating the sensory data to which it is exposed and then exploiting these models for prediction-based processing. In this post I am going to step back and explore how such ideas both vindicate and deepen the three core insights that I extracted from the work of Kenneth Craik in Chapter (post) 3. These insights, recall, were:
- that mental representation should be understood in terms of models in the brain that share an abstract structure with those aspects of the world that we represent in perception, imagination, and thought.
- that the core function of such models—the important thing that they enable us to do—is prediction.
- that this capacity for predictive modelling should ultimately be situated in a cybernetic understanding of brain function, in which we model the world from the perspective of our contingent physiological needs.
I will consider these in turn.
Generative Models as Structural Representations
For Craik, mental representations are models that capitalize on structural similarity to their targets. The idea that generative models within predictive processing in some sense “recapitulate” the causal-statistical structure of the world is common in both the scientific and philosophical literature. What does it mean?
Recall that a generative model can be understood in the most basic sense as a structure that can generate a range of phenomena in a way that is intended to model the process by which those phenomena are actually generated.
The concepts of “structure” and “phenomena” here are quite general. For example, an orrery (i.e. a mechanical model of the solar system) can be understood as a kind of generative model of relative planetary positions and motions.
Likewise, the graphics programmes underlying computer-aided design software and interactive video games are also generative models. They can be used to dynamically render (i.e. generate) a two-dimensional image from a specification of relevant scene parameters (e.g. the shapes, positions, surfaces, colours, motions, etc. of objects).
(In interactive video games, the player’s actions are also a crucial feature of this generative process—something that will be important in the next post).
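To make this concrete, here is a toy sketch (entirely my own illustration, not any real graphics engine): a `render` function that maps hypothetical scene parameters to a one-dimensional “image,” i.e. a structure that generates data from a specification of its worldly causes.

```python
# Toy illustration of a generative model: a "renderer" that maps scene
# parameters (the causes) to an image (the data). The parameters and
# scene are hypothetical, chosen only to make the structure visible.

def render(position, width, brightness, n_pixels=10):
    """Generate a 1-D 'image': bright where the object is, dark elsewhere."""
    return [brightness if position <= x < position + width else 0.0
            for x in range(n_pixels)]

image = render(position=3, width=2, brightness=1.0)
# Different scene parameters generate different images, just as different
# worldly causes generate different patterns of sensory input.
```

The inverse task—inferring which scene parameters produced a given image—is the reconstruction problem that, on this picture, perception solves.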
One way of understanding vision in terms of generative modelling is that our visual system implements the functional equivalent of a graphics programme in its “top-down” (and lateral) connections. Its task is to reconstruct the incoming image based on a representation of those features of the world responsible for generating it. If predictive coding is correct, the role of the incoming sensory inputs is just to provide feedback on this top-down generative process.
In what sense do generative models in this sense “share an abstract structure with” their targets?
Here is a really simplified way of understanding this idea in the case of vision. The visual inputs we receive—that is, the patterns of light stimulating our retinal cells—are the effects of a generative process in the world. That is, these highly structured, non-random patterns of light are caused by the arrangement of features of the distal world: for example, the shapes, sizes, textures, and positions of objects, the position and intensity of the light source, etc.
This generative process has elements and relations. The elements are those features of the world-sensorium relationship that need to be specified in order to describe this process (e.g. the shapes, sizes, textures, positions, etc., of objects). The relations are the causal and statistical relations among them. Causation relates both features of the world to each other and those features of the world to patterns of proximal sensory input. Features of the world and the patterns of sensory input we receive are also correlated with (i.e. statistically related to) each other in important ways. (As noted in the previous post, there is an important difference between causal and statistical relations in this sense. I think that our brains clearly model both).
A generative model should have structural elements (e.g. variables) that stand in for the features of this generative process and it should represent the relations between these features in a way that mirrors the actual relations between those features in the world. For example, if feature B is a cause of feature A in the world, this relation should be replicated among the relevant variables in the model. (This is in fact way too simplistic because most causal/statistical relationships exist as parts of highly complex multivariate networks, not simple two-place relationships).
One of the advantages of graphical models described in the previous post is that they provide an intuitive visual illustration of this pattern of causal and statistical relationships among variables standing in for features of the world.
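One way to see the “mirroring” idea is with a hypothetical miniature causal model (my own toy example, not drawn from the literature): the model’s variables and their dependencies are meant to share structure with the worldly process, so constraints that hold in the world hold in the model too.

```python
import random

# Hypothetical miniature causal model: rain -> wet ground -> glistening
# appearance (the proximal 'sensory' datum). The variables stand in for
# worldly features; the dependency structure mirrors the worldly process.

def sample_world():
    rain = random.random() < 0.3                # root cause
    wet = rain or (random.random() < 0.1)       # effect of rain (or a sprinkler)
    glisten = wet and (random.random() < 0.9)   # sensory datum caused by wetness
    return rain, wet, glisten
```

Because the model replicates the causal ordering, relations that hold in the world (e.g. that a glistening appearance requires wet ground) fall out of the model’s structure rather than needing to be stipulated separately.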
This is the minimal sense in which generative models should be thought of as structural representations, i.e. representations that share an abstract structure with their targets. Of course, there are many qualifications and complications here that I deal with at greater length in the thesis, but the core idea is relatively straightforward: generative models are accurate to the extent that they recapitulate, i.e. mirror, the structure of the actual process responsible for generating the data to which they are exposed.
For more on this topic, see the great work by:
The Predictive Mind
The second core insight that I extracted from the work of Craik is that the core function of the models inside our brains is prediction—specifically, a kind of flexible predictive capacity that (according to Craik) imbues intelligence with its “constructive and adaptive significance.”
Theories such as predictive processing both vindicate and deepen this insight, I think. There are many aspects to this.
LEARNING: Craik had no idea how organisms might acquire the models that underlie prediction. If predictive processing and related generative model-based theories of cortical information processing are correct, prediction—specifically, prediction error minimization—underlies the installation of the very bodies of information that underlie prediction.
PERCEPTION: Craik also didn’t think of prediction as being central to perception. Again, though, a generative model-based theory of perception suggests that we perceive the world thanks to our capacity to predict sensory inputs on the basis of models of the world’s causal-statistical structure. It is this predictive capacity that underpins the exploitation of prediction errors to update our perception of the world.
In initial formulations of predictive coding (e.g. Rao and Ballard’s), the “predictions” target a static input (and are thus probably better thought of as “reconstructions”). Actual perceptual processing is not like this, however. As I will argue in the next post, perception is heavily reliant on forward-looking predictions (i.e. predictions about what will happen next), a fact that is also nicely explained in terms of generative modelling.
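The basic inferential loop can be sketched in a few lines. This is a minimal caricature loosely in the spirit of Rao and Ballard’s scheme, not their actual model: a hidden cause is inferred by iteratively reducing the error between top-down predictions and the input, with all parameter values chosen purely for illustration.

```python
# Minimal predictive-coding-style inference: adjust a hypothesised cause
# until the top-down prediction matches the sensory input. The linear
# generative model (prediction = weight * cause) is an illustrative assumption.

def infer_cause(sensory_input, weight=2.0, lr=0.1, steps=100):
    cause = 0.0                              # initial hypothesis
    for _ in range(steps):
        prediction = weight * cause          # top-down prediction
        error = sensory_input - prediction   # bottom-up prediction error
        cause += lr * weight * error         # revise hypothesis to reduce error
    return cause

# With input 4.0 and generative weight 2.0, inference converges near cause = 2.0.
estimate = infer_cause(4.0)
```

The point of the sketch is just that perception, on this view, is the settling of hypotheses that make the prediction error small, not a passive registration of input.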
ACTION: there is an obvious sense in which prediction underlies intentional action. We act based on predictions of the likely outcomes of our actions. Nevertheless, there are purely model-free reinforcement learning algorithms that enable an agent to adapt its behaviour to its environment as a function of past punishments and rewards. Craik—correctly—realised that such model-free reinforcement is grossly inadequate to account for intelligence. He also noted the importance of prediction to more basic forms of action control, however—namely, in enabling the nervous system to overcome signalling delays that would otherwise undermine sensorimotor processing.
This is now an orthodox idea in sensorimotor psychology: namely, the idea that effective action requires a forward (i.e. generative) model for anticipating the likely sensory consequences of the agent’s behaviours.
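A forward model of this kind can be caricatured very simply (my own toy sketch, with made-up linear dynamics): given an outgoing motor command, predict its sensory consequence now, rather than waiting for feedback that arrives only after a delay.

```python
# Toy forward model for sensorimotor control. The linear dynamics
# (new position = old position + commanded velocity * dt) are an
# illustrative assumption, not a claim about real limb dynamics.

def forward_model(position, velocity_command, dt=0.1):
    """Predict the limb's next position from the outgoing motor command."""
    return position + velocity_command * dt

# The predicted position can guide the next command immediately, while
# actual sensory feedback is still in transit.
predicted = forward_model(position=0.0, velocity_command=5.0)
```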
The idea that predictive modelling of some kind underlies our basic interactions—“everyday coping,” “being in the world,” “basic cognition,” whatever—thus seems plausible, and I will return to this in more depth in the next chapter, when I explain why I think “radical” embodied cognitive science is mistaken.
Nevertheless, one might ask how relevant generative model-based predictive processing is to so-called “higher cognition,” i.e. our capacities to think, reason, plan, etc.
I have argued elsewhere that theories such as predictive coding are in fact unable to explain distinctively human thought. I will return to this issue in the final concluding chapter (post).
In any case, though, I think that it would be a mistake to conclude that generative model-based predictive processing cannot capture any aspects of higher cognition. There are two ways in which a generative model-based understanding of mental representation can be generalised beyond perception, learning, and action control.
First, I noted in the previous post that the concept of a causal generative model applies whenever an agent has access to some data that results from a systematic causal process. Under those conditions, generative model-based predictive processing becomes useful. Crucially, however, the use of causal models applies far beyond the domains of simple perception and sensorimotor control.
For example, an influential view in cognitive psychology is that commonsense reasoning is organised around intuitive theories of domains such as psychology and physics. In other words, our capacity to interact with both our physical worlds and other agents is dependent on idealised causal models of these domains.
One of my favourite examples of this work argues that our brains contain the functional equivalent of a physics engine of the sort used in the design of interactive video games. Physics engines are causal generative models that reconstruct the physically relevant properties of a scene—the mass, shape, etc., of objects, and the forces being exerted on them—in a form that enables one to run quick predictive simulations concerning what would happen under a large variety of possible interventions on the physical world. I will return to this in the next post.
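Here is a toy version of the idea (my own illustration, far simpler than any real physics engine): a forward simulation of a scene that can be run under different candidate interventions—here, pushes of different strengths—to compare their outcomes before acting.

```python
# A toy 'physics engine': simulate an object pushed off a ledge of a given
# height and report how far it lands. Units and values are illustrative.

def simulate_fall(height, push_velocity, g=9.8, dt=0.01):
    """Return horizontal distance travelled when the object hits the ground."""
    x, y, t = 0.0, height, 0.0
    while y > 0:
        t += dt
        x = push_velocity * t                # horizontal motion from the push
        y = height - 0.5 * g * t * t         # vertical free fall
    return x

# Run the same model under different interventions (pushes) and compare.
gentle, firm = simulate_fall(1.0, 0.5), simulate_fall(1.0, 2.0)
```

The crucial feature is counterfactual depth: one model supports predictions about indefinitely many possible interventions, which is exactly what a lookup table of past observations cannot provide.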
It is constitutive of a generative model that it can be used for generating the range of phenomena in its domain in the absence of externally generated input. This suggests the attractive idea that such models could be used for more radical forms of offline model-based simulation for the purpose of things like reasoning and planning.
The core idea here is that learning a generative model via online interaction with a domain enables one to then exploit that model purely through mental imagery or simulation for various kinds of offline problem-solving.
One of the coolest examples I know of this comes from work in machine learning by Ha and Schmidhuber. They designed an artificial neural network-based agent that could learn to play video games via generative modelling and reinforcement learning. The generative model was used to extract a compact description of the game dynamics in an unsupervised way, which could then be given to a separate controller model designed to maximize reward. (This means that action policies are learned from a compact representation of the game rather than directly from pixel values (i.e. the raw data), which makes learning far more tractable).
Here is what’s cool. Because their agent learns a generative model, once it has learned how to play games, it can be used to simulate such games entirely inside of what the authors call its “dream world,” i.e. hypothetical game scenarios generated without externally provided data. (When I was 13, I was obsessed with a game called Runescape. I was so obsessed that I even used to dream about playing the game. Pathetic, I know. I have a friend who was up until recently extremely obsessed with World of Warcraft. He tells me he would constantly imagine playing the game when he wasn’t playing it, even during sex. !!!).
Here is what’s really cool. Ha and Schmidhuber show that the agent can learn action policies entirely inside of its own dream world, and then transfer these policies back to the actual game environment—where such policies are successful.
For more details on this, see their paper. From my perspective, what is interesting is not the specifics of the application but the more general idea that a predictive model acquired via unsupervised learning in interacting with a domain can be exploited for highly adaptive and purely “offline” problem solving, the fruits of which can then be applied in the real world.
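The overall logic can be caricatured in a few lines. This is a schematic sketch of the “dream world” idea, emphatically not Ha and Schmidhuber’s actual implementation: candidate policies are evaluated entirely inside a learned model of the environment, and the best one is then deployed in the real environment. (In the best case sketched here, the learned model matches the real dynamics exactly.)

```python
# Schematic 'dream world' policy search. All dynamics, policies, and rewards
# are invented for illustration.

def real_step(state, action):
    return state + action        # the real environment's dynamics

def model_step(state, action):
    return state + action        # the agent's learned model of those dynamics

def rollout(step_fn, policy, steps=10):
    """Total reward for a policy: reward is high when the state stays near 0."""
    state, reward = 5.0, 0.0
    for _ in range(steps):
        state = step_fn(state, policy(state))
        reward -= abs(state)
    return reward

# 'Dream' phase: evaluate candidate policies purely inside the model.
candidates = [(lambda s, k=k: k * s) for k in [0.5, -0.5, -1.0, 0.2]]
best = max(candidates, key=lambda p: rollout(model_step, p))

# Deploy phase: the policy found offline also succeeds in the real environment.
dream_score = rollout(model_step, best)
real_score = rollout(real_step, best)
```

The transfer in the last two lines is the interesting part: nothing about the policy was learned from the real environment directly, yet it works there because the model recapitulates the environment’s structure.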
The Brain as a Regulator
The third idea that I extracted from Craik’s work is that we should understand brain function in a cybernetic context. As I noted in the third post, this is the idea that I’m least sure of, but I’ll now take a stab at why it might be a useful perspective to take on mental representation.
When we try to understand the mind, it is easy to get trapped in our myopic, high-level folk psychological understanding of ourselves—an understanding in which we are intentional agents that perceive, think, reason, imagine, dream, and so on.
One of the most important questions in cognitive science and the philosophy of mind is how this high-level folk psychological understanding of ourselves relates to research from “lower-level,” more “fundamental” sciences such as neuroscience and biochemistry.
When people ask this question, they typically have in mind broadly “mechanistic” considerations from such lower-level sciences, i.e. the structure of neural networks, neurotransmitters, etc.
If we step back even further, however, we can also adopt a highly theoretical perspective from which we notice that we—and other organisms—are very strange kinds of physical systems. We are strange in that we are self-organizing: we continually exchange matter and energy with our environments to maintain our structural integrity in the face of a general tendency towards disorder in the universe as described by the second law of thermodynamics.
This is just to say that biological systems are homeostatic systems. The term “homeostasis” comes from the Greek words for “similar” and “standing still.” It refers to the fact that biological systems manage to maintain the stability of their internal states. (E.g. think about maintaining a stable body temperature and fluid balance). Of the enormous number of possible states that, say, a rabbit could be in, it manages to occupy an extremely small subset. Achieving this feat of self-organisation in a hostile world is remarkable—something that demands explanation.
What constitutes homeostasis and optimal functioning is different for different biological systems. Nevertheless, the goal of maintaining structural integrity and the constancy of internal states is a unifying task for all biological systems.
(For the best overview of these ideas that I know, see this paper by Andrew Corcoran and Jakob Hohwy).
Cybernetics and Regulation
For cyberneticists such as Craik and Ashby, this process of maintaining homeostasis in the face of external disturbances is what underlies adaptive behaviour. It thus suggests a general job description for the brain: to regulate the organism’s internal milieu.
What does any of this have to do with mental models or psychology more generally? Well, one of the famous ideas in this area is Conant and Ashby’s so-called “good regulator theorem” (building on earlier work by Ashby). This theorem puts forward the idea that regulatory systems must exploit models of those parts of the world they are responsible for regulating. The upshot:
“The theorem has the interesting corollary that the living brain, so far as it is to be successful and efficient as a regulator for survival, must proceed, in learning, by the formation of a model (or models) of its environment.”
I won’t relay the technical details of this theorem here. To get an intuitive feel for it, though, consider that even a maximally simple regulator such as a thermostat exploits a structure-preserving mapping between the height of the mercury in its thermometer and the ambient air temperature, such that variations in the latter are mirrored by variations in the former.
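A minimal sketch of the point (my own toy example, with invented values): the regulator’s internal reading stands in for the ambient temperature, and regulation succeeds only insofar as that mapping is accurate.

```python
# Toy thermostat: bang-bang regulation around a setpoint. The internal
# 'reading' is the regulator's model-like stand-in for the world's state.
# Heating and cooling rates are purely illustrative.

def regulate(temperature, setpoint=20.0, steps=50):
    for _ in range(steps):
        reading = temperature        # internal variable mirrors ambient temperature
        if reading < setpoint:
            temperature += 0.5       # heater on
        else:
            temperature -= 0.1       # heater off; passive cooling
    return temperature

final = regulate(temperature=10.0)
# The regulated variable ends up hovering near the setpoint.
```

If the reading ceased to track the temperature—if the mapping stopped being structure-preserving—regulation would fail, which is the intuitive core of the good regulator theorem.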
As I understand it, it is this basic connection between homeostatic regulation and modelling that also lies at the core of Friston’s free energy principle. What is distinctive about Friston’s work, as far as I can tell, is that he glosses homeostasis in broadly information-theoretic terms—namely, as the minimization of surprisal (roughly, the improbability of an outcome relative to a probability distribution). This then enables him to cast the task of maintaining homeostasis in terms of variational Bayes.
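For readers unfamiliar with the term, surprisal has a precise and very simple definition: the negative log-probability of an outcome under a probability distribution. A two-line example (values invented for illustration):

```python
import math

# Surprisal: -log p(outcome). Improbable outcomes are 'surprising'; an
# organism that keeps its long-run surprisal low stays within the narrow
# band of states its model expects.

def surprisal(p):
    return -math.log(p)

# An expected state (p = 0.9) carries far less surprisal than a rare one (p = 0.01).
expected, rare = surprisal(0.9), surprisal(0.01)
```

On Friston’s gloss, the states compatible with an organism’s continued existence are precisely its high-probability (low-surprisal) states, which is what lets homeostasis be redescribed in these information-theoretic terms.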
Nevertheless, I don’t really understand the free energy principle, and the aspects of the principle that I do understand I think are problematic on both conceptual and empirical grounds. (I will hopefully have time to write on this someday once the PhD is complete).
Further, there is an important sense in which both the good regulator theorem and the free energy principle are limited from my perspective. They are limited because they focus on regulation in general—including everything from thermostats to single-celled organisms to primates like us—and the sense of “modelling” they describe is at best a kind of implicit modelling very different from the explicitly articulated generative models that I have outlined here.
Why mention them, then?
For the following reason: the link between regulation and modelling they highlight suggests that the kinds of complex predictive modelling in organisms like us might ultimately just be a more complex elaboration of the core world-involving strategy required for successful homeostatic regulation. That is, it suggests that the difference between the highly complex, explicitly articulated predictive modelling that I have outlined here and the simple “implicit” modelling done by simpler regulatory systems reflects the complexity of the regulatory task, not a difference in the task itself.
The basic idea, then, would be that even our highly sophisticated psychological capacities ultimately serve a core pragmatic function that we share with all other life-forms.
Is this plausible? I don’t know, really. I make a slightly better case for it in the thesis, but still not a particularly good one. (I also make some connections to concepts like allostasis and autopoiesis).
Why is it interesting? In the context of this thesis, at least, the idea serves several functions:
- It relates to Craik’s views (ok, not really a good reason, but it facilitates a nice thesis structure);
- It relates to the most ambitious formulations of predictive processing;
- It specifies a fundamentally pragmatic function for the mental modelling that I have outlined here, and suggests that we model only those features of the world relevant to our survival—an idea that bears on issues in embodied cognition, which I will return to in Chapter (post) 9.
Before concluding, though, here is one obvious objection that I can imagine to what I have said so far:
- Homeostatic regulation is something that biological systems must do. But it is not all that they do. We are doing much more than simply trying to maintain the stability of our internal states. (One way of understanding this point: we evolved because of our capacity to reproduce, not just survive, which requires active mechanisms that go far beyond mere homeostatic regulation).
I think that this objection is basically right and I haven’t really seen a good response to it in the literature. Oh well.
Because this post is now far too long, here’s a very brief summary:
- Accurate generative models share an abstract structure with the process responsible for generating the data to which they are exposed.
- Such models underlie prediction, which in turn underlies unsupervised learning, perception, sensorimotor control, causal modelling in general, and “offline” simulation in the service of “higher cognition.”
- One can situate this capacity for predictive modelling in the context of the fundamental imperative to self-organize and maintain homeostasis.