How to Build a General Intelligence: Reverse Engineering
Authors: Rawlinson and Kowadlo
This is part 2 of our series on how to build an artificial general intelligence (AGI). This article is about what we can learn from reverse-engineering mammalian brains. Part 1 is here.
The next few articles will try to interpret some well-established neuroscience in the context of general intelligence. We’ll ignore anything we believe is unrelated to general intelligence, and we’ll simplify things in ways that will hopefully help us to think about how general intelligence happens in the brain.
It doesn’t matter if we are missing some details, provided the overall picture helps us understand the nature of general intelligence. In fact, excluding irrelevant detail will help, as long as we keep all the important bits!
These articles are not peer reviewed. Do assume everything here is speculation, even when linked to a source reference (our interpretation may be skewed). There isn’t space to repeatedly add this caveat throughout these articles.
1. Physical Architecture
First we’ll review and interpret the gross architecture of the brain, focusing on the Thalamo-Cortical system, which we believe is primarily responsible for general intelligence.
The Thalamo-Cortical system comprises a central hub (the Thalamus and Basal Ganglia), surrounded by a thin outer surface (the Cortex). The surface consists of a large number of functionally-equivalent units, called Columns. The Cortex is wrinkled so that it’s possible to pack a large surface area into a small space.
Why are the units called Columns? The name reflects the physical structure of their connectivity patterns. Cells within each Column are highly interconnected, but connections to cells in other Columns are fewer and less varied. Each Column spans the full thickness of the surface, which is approximately 6 distinct layers of cells. Since these layers are stacked on top of each other, and the stacks are only loosely connected to one another, the surface appears to be made of Columns.
Confusingly, there are both Macro and Micro-Columns, and these terms are used inconsistently. In these articles we will simply say ‘Column’ when referring to a Macro-Column as defined in a previous post.
In the previous article we described the ideal general intelligence as a structure made of many identical units that have each learned to play a small part in a distributed system. These theoretical units are analogous to Columns in real brains.
Columns can be imagined to be independent units that interact by exchanging data. However, data travelling between Columns often takes an indirect path, via the central hub.
The hub filters messages passed between Columns. In this way, the filter acts as a central executive that manages the distributed system made up of many Columns.
We believe this is a fundamental aspect of the architecture of general intelligence.
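As a purely illustrative sketch of this idea (not a model of real Thalamus or Basal Ganglia function), the hub can be pictured as a router that relays a message between Columns only when a gating rule allows it. The names `Column`, `Hub`, `gate` and `route`, and the threshold rule used below, are all our own assumptions for the sake of the example.

```python
from dataclasses import dataclass, field
from typing import Callable, Dict, List, Tuple


@dataclass
class Column:
    """A toy Column that simply records the messages it receives."""
    name: str
    inbox: List[Tuple[str, float]] = field(default_factory=list)

    def receive(self, source: str, payload: float) -> None:
        self.inbox.append((source, payload))


@dataclass
class Hub:
    """Hypothetical central hub: relays a message only if the gate allows it."""
    columns: Dict[str, Column]
    gate: Callable[[str, str, float], bool]

    def route(self, source: str, target: str, payload: float) -> bool:
        # The hub, not the sending Column, decides whether the message arrives.
        if self.gate(source, target, payload):
            self.columns[target].receive(source, payload)
            return True
        return False


columns = {name: Column(name) for name in ("A", "B", "C")}

# Assumed gating rule: only sufficiently strong signals are passed on.
hub = Hub(columns, gate=lambda src, dst, x: x >= 0.5)

delivered = [hub.route("A", "B", 0.9), hub.route("A", "C", 0.2)]
print(delivered)           # which messages passed the filter
print(columns["B"].inbox)  # what Column B actually received
```

Swapping in a different `gate` function changes which Columns can influence each other without touching the Columns themselves, which is the sense in which the filter behaves like an executive.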
Other brain components, such as the Cerebellum, are essential for effective motor control but may not be essential for general intelligence. They are not within the scope of this article.
2. Logical Architecture
The Cortex has both a physical structure (a layered surface, partitioned into columns) and a logical structure. The logical structure is a hierarchy – a tree-like structure that describes which columns are connected to each other (see figure 2).
Connections between columns are reciprocal: “Higher” columns receive input from “Lower” columns, and return data to the same columns. This scheme is advantageous: Higher columns have (indirect) input from a wider range of sources; lower columns use the same resources to model more specific input in greater detail. This occurs naturally because each column tries to simplify the data it outputs to higher columns, allowing columns of fixed complexity to manage greater scope in higher levels, as data is incrementally transformed and abstracted.
Only Columns in the lowest levels receive external input and control external outputs (albeit, often indirectly via subcortical structures).
Note that there are not necessarily fewer columns in each hierarchy level; there may be, but this is not essential. However, abstraction increases and scope broadens as we move to higher hierarchy levels.
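The relationship between hierarchy level and scope can be sketched in a few lines. This is a toy under stated assumptions: the class `HierarchyColumn`, its `scope` and `output` methods, and the use of a simple average as the "simplification" are all hypothetical, and only the feed-forward direction is shown (the reciprocal feedback path is omitted).

```python
from dataclasses import dataclass, field
from typing import List, Set


@dataclass
class HierarchyColumn:
    """Toy column: reads external sensors (lowest level) or lower columns."""
    name: str
    children: List["HierarchyColumn"] = field(default_factory=list)
    sensors: List[int] = field(default_factory=list)  # external input indices

    def scope(self) -> Set[int]:
        """External inputs that reach this column, directly or indirectly."""
        reached = set(self.sensors)
        for child in self.children:
            reached |= child.scope()
        return reached

    def output(self, data: List[float]) -> float:
        """Assumed simplification: summarise all input as a single number.

        Assumes every column has at least one sensor or child.
        """
        values = [data[i] for i in self.sensors]
        values += [child.output(data) for child in self.children]
        return sum(values) / len(values)


low1 = HierarchyColumn("low1", sensors=[0, 1])
low2 = HierarchyColumn("low2", sensors=[2, 3])
top = HierarchyColumn("top", children=[low1, low2])

data = [1.0, 3.0, 5.0, 7.0]
print(low1.scope(), top.scope())  # the higher column has the wider scope
print(top.output(data))           # still a single value, however wide the scope
```

Every column emits the same amount of data regardless of level, so the higher column covers more of the input only because the columns below it have already compressed their share.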
We can jump between the physical and logical architectures of the Cortex. Moving over the surface implies moving within the hierarchy. It also implies that, as we move between areas, we will observe responses to different subsets of input data. Moving to higher hierarchy levels implies an increase in abstraction. We can observe this effect in human brains, for example by following the flow of information from the processing of raw sensor data to more abstract brain areas that deal with language and understanding (see figure 3).
|Figure 3: Flow of information across the physical human Cortex also represents movement higher or lower in the logical hierarchy (increasing or decreasing abstraction). In fact, we can observe this phenomenon in human studies. Different parts of the hierarchy are specialised to conceptual roles such as understanding what, why and where things are happening. Image source.|
One final point about the logical architecture. The hierarchical structure of the Cortex is mirrored in the central hub, particularly in the Thalamus and Basal Ganglia, where we see the topology of the cortical Columns preserved through indirect pathways via central hub structures (figure 4).
|Figure 4: Data flows between different Columns within the Cortex either directly, or via our conceptual “central hub”. Our hub includes the Thalamus and Basal Ganglia structures such as the Striatum. Throughout this journey the topology of the Cortex is preserved. Image source.|
3. Layers and Cells
Each Column has approximately 6 distinct “layers”. As with every biological rule, there are exceptions, but this suffices for the level of detail we require here. The layers are visual artefacts resulting from variations in cell type, morphology, connectivity patterns and therefore function between the layers (figure 5).
|Figure 5: Various stainings showing variation in cell type and morphology between the layers of the Cortex. Image source.|
The Cortex has only 5 functional layers. Structurally, it has 6 gross layers, but one layer is just wiring, with no computation. In addition, the functional distinction between layers 2 and 3 is uncertain, so we will group them together. This gives us just 4 unique functional layers to explain.
We will use the notation C1 … C6 to refer to the gross layers:
- C1 – just wiring, no computation; not functional
- C2/3 (indistinct)
- C4
- C5
- C6 (known as the “multiform” layer due to the variety of cell types)
The cortex is made of a veritable menagerie of oddly shaped cells (i.e. Neurons) that are often confined to specific layers (see figure 6). Neurons have a body (soma), dendrites and axons. Dendrites provide input to the cell, and reach out to find that data. Axons transmit the output of the cell to places where it can be intercepted by other cells. Both dendrites and axons have branches.
|Figure 6: Some of the cell types found in different cortical layers. Image source.|
An important feature of the Cortex is the presence of specialised Neurons with pyramid-shaped somas (bodies) (figure 7). These Pyramidal cells are predominantly found in C2/3, C5 and C6, where they are both large and common.
Pyramidal cells do not behave like classical artificial neurons. We agree with Hawkins’ characterisation of them. Pyramidal cells have two or three dendrite types: Apical (proximal or distal) dendrites and Basal dendrites. Distal Apical dendrites seem to behave like a classical integrate-and-fire neuron in their own right, requiring a complete pattern of input to “fire” a signal to the body of the cell. In consequence, each cell can respond to a number of different input patterns, depending on which apical dendrites become active from their input.
Hawkins suggests that the Basal dendrites provide a sequential or temporal context in which the Pyramidal cell can become active. Output from the cell along its axon branches only occurs if the cell observes particular instantaneous input patterns in a particular historical context of previous Pyramidal cell activity.
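To make this description concrete, here is a minimal sketch of such a cell, following the roles described above: Apical dendrites match the current input and Basal dendrites match previous cell activity. It is a toy under our own assumptions, not Hawkins’ implementation; the classes `Dendrite` and `PyramidalCell`, the integer encoding of inputs, and the `threshold` parameter are all hypothetical.

```python
from dataclasses import dataclass
from typing import FrozenSet, List, Set


@dataclass
class Dendrite:
    """A dendrite that becomes active only when enough of its pattern is present."""
    pattern: FrozenSet[int]
    threshold: int  # assumed: number of matching inputs needed to activate

    def active(self, inputs: Set[int]) -> bool:
        return len(self.pattern & inputs) >= self.threshold


@dataclass
class PyramidalCell:
    apical: List[Dendrite]  # each recognises one instantaneous input pattern
    basal: List[Dendrite]   # each recognises one context of previous activity

    def fires(self, current_input: Set[int], previous_cells: Set[int]) -> bool:
        # Any one apical dendrite is enough to recognise the current input...
        sees_pattern = any(d.active(current_input) for d in self.apical)
        # ...but the cell only produces output in a recognised historical context.
        in_context = any(d.active(previous_cells) for d in self.basal)
        return sees_pattern and in_context


cell = PyramidalCell(
    apical=[Dendrite(frozenset({1, 2, 3}), 3), Dendrite(frozenset({7, 8}), 2)],
    basal=[Dendrite(frozenset({10, 11}), 2)],
)

print(cell.fires({1, 2, 3}, {10, 11}))  # first pattern, right context
print(cell.fires({7, 8, 9}, {10, 11}))  # second pattern, right context
print(cell.fires({1, 2, 3}, {12}))      # right pattern, wrong context
print(cell.fires({1, 2}, {10, 11}))     # incomplete pattern
```

Because each dendrite holds its own pattern, a single cell responds to several distinct inputs, while the context check restricts its output to particular sequences.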
Within one layer of a Column, Pyramidal cells exhibit a self-organising property that results in sparse activation. Only a few Pyramidal cells respond to each input stimulus. The Pyramidal cells are powerful pattern and sequence classifiers that also perform a dimensionality-reduction function; when active, the activity of a single Pyramidal cell represents a pattern of input over a period of time.
The training mechanism for sparsity and self-organisation is local inhibition. Most of the Neurons in the Cortex that are not Pyramidal cells are so-called “Interneurons”, which we believe play a key role in training the Pyramidal cells by implementing a competitive learning process. For example, Interneurons could inhibit the Pyramidal cells around an active Pyramidal cell, ensuring that the local population of Pyramidal cells responds uniquely to different input.
Unlike Pyramidal cells, which receive input from outside the Column and transmit output outside the Column, Interneurons generally only work within a Column. Since we consider that Interneurons play a supporting role to Pyramidal cells, we won’t have much more to say about them.
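One common way to approximate local inhibition in software is a winner-take-all rule combined with competitive learning. The sketch below is a generic textbook technique that we are swapping in for illustration, not a claim about how Interneurons actually work. The function names `k_winners` and `competitive_step`, and the parameters `k` and `rate`, are assumptions of the example.

```python
from typing import List


def k_winners(activations: List[float], k: int) -> List[int]:
    """Indices of the k most active cells; all others count as inhibited."""
    ranked = sorted(range(len(activations)),
                    key=lambda i: activations[i], reverse=True)
    return sorted(ranked[:k])


def competitive_step(weights: List[List[float]], x: List[float],
                     k: int = 1, rate: float = 0.1) -> List[int]:
    """One learning step: only the winning cells move towards the input."""
    activations = [sum(w_i * x_i for w_i, x_i in zip(w, x)) for w in weights]
    winners = k_winners(activations, k)
    for j in winners:
        weights[j] = [w_i + rate * (x_i - w_i)
                      for w_i, x_i in zip(weights[j], x)]
    return winners


# Three cells with two inputs each (arbitrary starting weights).
weights = [[1.0, 0.0], [0.0, 1.0], [0.5, 0.5]]

first = competitive_step(weights, [0.9, 0.1])   # resembles cell 0's weights
second = competitive_step(weights, [0.0, 1.0])  # resembles cell 1's weights
print(first, second)
print(weights)
```

Since losing cells never update, each cell gradually specialises in the inputs it already wins, so different inputs end up represented by different, sparse sets of active cells.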
|Figure 7: A Pyramidal cell as found in the Cortex. Note the Apical and Basal dendrites, hypothesised to recognise simultaneous and historical input patterns respectively. The complete Pyramidal cell is then a powerful classifier that, when active, represents a particular set of input in a specific historical context. Image source.|
Also published on Medium.