HOW THE BRAIN WORKS

CONTENTS:

INTRODUCTION

Introducing the idea that human mentation can be understood in terms of the fundamental operations of a small number of brain regions.

A NEURAL DESCRIPTION OF BRAIN OPERATION

A description of the cortex and how it accomplishes its feature abstraction function, including a categorization of mental functions as pattern formation, matching and induction, operating at different time scales.

LEARNING, CONTRAST, ATTENTION

Hebb-type higher-order correlational learning as the basic cortical process. Neurobiological constraints on the cortical learning algorithm. The hippocampus and episodic memory. An aside on qualia.

FEEDFORWARD AND FEEDBACK

Feedback projections instantiate attention/contrast enhancement. The inferior pulvinar and the shifter hypothesis.

ATTENTION AND THE BASAL GANGLIA

A somewhat speculative theory of how the basal ganglia instantiate distractive and selective attention. Emotion and motivation and how they determine selection of behavior. Details of basal ganglia functioning.

PUTTING IT ALL TOGETHER: COGNITION IN ACTION

A description of what is going on in someone's head while they perform a simple task. Comments on hemispheric lateralization, recall, 7 as the 'magic number', and the 'mind's eye'.



Introduction

This essay is written from the point of view that all mental functions and phenomena are brain processes. The mind is the brain, and as such it can be explained without appeal to any separate 'dimension' of mind.

The brain is a machine inside a machine. Its purpose is to propagate its genes forward in time. To do this it must survive in and interact with the external world. The key to understanding what role different parts of the brain play in this process is realizing that structures that are anatomically and physiologically homogeneous perform the same neural operation throughout. Once we deduce exactly what each structure's defining operation is, mental functioning can be interpreted in those terms. This provides a framework for neurally defining otherwise abstract concepts such as 'attention' or 'inference', and can show how a set of previously unrelated mental functions is instantiated by the same brain structure performing the same basic operation.

For example the cortex, pulvinar and basal ganglia all perform their characteristic operation on whatever input they receive. The basic function of the cortex is abstraction, in order to build a model of the world. The basic function of the pulvinar and basal ganglia is to construct a dynamic mapping between active cortical areas (feature spaces). They use this mapping to implement attention, imposing the 'focus of attention' on cortex through a process of contrast enhancement, basically center/surround excitation/inhibition. Lateral inhibition is perhaps the most fundamental operation of brain circuitry, integral to the operation of all three structures.

I am arguing that these principles are sufficient to explain cognition and action, that no further fundamental breakthroughs need to occur in order for us to formulate a correct and useful model of brain functioning. Thus, _the_ function of cortex is to construct a model of the world. It does not 'compute' anything beyond higher and higher order correlations over successively abstracted features. A feature is any space/time invariance in the Universe detectable by us and behaviorally relevant. Features are basically patterns in the world. Information. We may understand exactly how cortex functions without ever 'decoding' these features (like hidden units in an artificial neural network). All we need to know is that the activation of the neurons representing a particular feature indicates that the feature is present in the subject's mind (either present as a real world stimulus or recalled from memory). The significance of the features is not determined by the cortex, each cortical area just abstracts higher level concepts from the lower level features of the previous (input) area. A concept is simply a collection of features. The defining relationships between the features are directly represented by synaptic connections between the neurons representing the features. These synaptic connections represent real relationships between the elements of the real world system being modeled (as proposed by Crick?).

I will finish the introduction with a discussion of 'top-down/bottom-up' processing as it relates to cortical model building. The cortical model is constructed in the following way: Data is taken in at the bottom (primary sensory cortices) and invariances are abstracted at successively higher levels. The particular invariances abstracted at the lowest levels do not change over the lifetime of the individual (eg in vision, spatial edges, velocity and direction of motion of small image components), so can be specified genetically or by prenatal 'bootstrapping'. Thus the model at low levels is constructed (and activated) in a bottom-up fashion, constrained (mostly) by data from the world. At higher levels the invariances abstracted are called objects, concepts, motor sequences and thoughts. Abstracted across large enough distances in space and time, these invariances become predictions, plans and inferences. The nature of these invariances (which lower order features they are composed of) determines the behavior of the individual. Thus, there must be a selection process based on behavioral relevance. In the brain this selection process is instantiated by the basal ganglia. The basal ganglia select the high level plan/prediction that satisfies (or will satisfy) some behavioral goal, and impose this decision on the (frontal) cortex. The decision is broadcast to lower levels through cortico-cortical feedback connections as a prediction/sensory state. This top-down feedback largely determines the activity of higher level cortex, and to a lesser degree the activity of lower levels. You literally see and think what you expect to happen, rather than what actually happens, unless reality strongly disagrees with your prediction (you are distracted).

Thus we can see that top down and bottom up determine/influence each other in the loop that is an individual's interaction with the world: Sensory input data activates stored invariances, of higher and higher order as the information moves 'up' cortex. The highest order invariances activated are those that previously produced successful behavior. They are submitted to the basal ganglia which selects a subset (depending on behavioral context) to be implemented as this particular instant's perception/action. The influence of top down selection dissipates as it is fed back 'down' cortex and merges with bottom up data being fed forward 'up' cortex.

Here is a crude summary:

                'Association' cortex             Sensory cortex

                Top down <---------------------> Bottom up
                Behavioral selectivity           Real world data
                Generality/Abstraction           Precision
                Subjective                       Objective

Attention:      Basal ganglia, medial pulvinar   Inferior pulvinar
(for learning)         Lateral inhibition/Contrast



A neuronal description of brain operation

This starts with an account of what is commonly called 'data processing' from the 'bottom' to the 'top' of cortex.

The cortex is a modular structure, with a basic circuit that is replicated many times. We already know what this basic circuit is, albeit at a very low level of detail. This circuit is seen, with some minor variations, in all mammals from rats to humans, and in all cortical areas from primary sensory cortex to primary motor cortex via frontal 'association' cortex. The same classes of component cells are, as far as we presently know, also seen across species and modalities. This incredible ubiquity means that the basic function of cortex is independent of the demands of any particular animal, and independent of the nature of the data that it is processing: The basic cortical circuit that 'processes information' in the sensory cortex of a cat when its leg is touched is the same circuit that 'processes information' in the frontal cortex of a human thinking about a calculus problem. This means that the function that the cortical circuit performs must be very general indeed: Nature has found a brain circuit that can accept information about anything, abstract the relevant regularities and invariances and incorporate them into its structure, then use the resulting structure to very quickly (perhaps purely feedforward) provide relevant/correct output for any input. This is why we do not hesitate when doing things that we have done many times before.

At the lowest level of abstraction/analysis, our information input rate is limited by the spatial and temporal sampling rate of our sensory transducers. We sample at high rates to gather maximum information. This generates a large amount of data, only a small fraction of which is relevant in any one situation. The relevant information must be abstracted. This process is performed by the cortex.

Consider visual information entering the primary visual cortex (V1) from the lateral geniculate nucleus (LGN). Space/time patterns of spikes enter layer 4 of the cortex. Useful information is abstracted across time and space - layer 4 cells fire more slowly and their receptive fields (RFs) are bigger than those of their LGN inputs. Layer 4 projects to layer 2/3, where more complex features are abstracted - layer 2/3 cells detect higher order correlations in the firings of layer 4 cells. Because the sensory input contains relatively more information in the spatial domain, temporal resolution is sacrificed to enhance spatial resolution: Cells signalling similar features synchronize their oscillatory firing with cells further away in space (as recorded by Gray and Singer). This occurs with optimal, high-contrast stimuli - stimuli that are conveying a lot of precise information about the external world. Extra information about spatial correlations in the input can thus be sent on to the next level. This implies that spike timing in V1 is not important for intervals smaller than ~10 ms. In primary auditory cortex (A1) precise spike timing is presumably more important, thus gamma-frequency synchronized oscillations are less likely to occur here.

Feedforward input to a cortical area comes from an area lower down in the space of complexity. Each spike in each axon signals the presence of a specific feature detected in the layer below. The patterns (in space and time) in these spikes are the correlations to be abstracted by the cortical area in question. The neurons in one area compete for features. Each neuron has thousands of synapses on it, though only a few tens are needed to cause it to fire. The few lucky synapses are selected by a process of correlational, Hebb-type learning: If the neuron fires it is because it has received a combination of inputs (representing some feature pattern) from a lower area. The synapses from these inputs to the target neuron are strengthened - the neuron now represents that pattern of inputs. If it fires again it is a sign that that pattern is present again.
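
To make the competitive, Hebb-type selection of synapses concrete, here is a minimal sketch in Python (the layer sizes, learning rate and winner-take-all shortcut are illustrative assumptions, not a model of any particular cortical area):

import numpy as np

rng = np.random.default_rng(0)

n_inputs, n_neurons = 200, 20                    # afferent features and neurons in one cortical area
W = rng.random((n_neurons, n_inputs)) * 0.01     # many weak synapses per neuron (scaled down here)
lr = 0.05                                        # illustrative learning rate

def present(pattern):
    """One presentation of a feedforward pattern (0/1 vector of active lower-level features)."""
    drive = W @ pattern                          # summed synaptic drive to each neuron
    winner = int(np.argmax(drive))               # competition (lateral inhibition) leaves one neuron firing
    W[winner] += lr * pattern                    # Hebb: strengthen the synapses that made the winner fire
    W[winner] /= W[winner].sum()                 # crude normalization: a few 'lucky' synapses come to dominate
    return winner

# Repeated presentation of the same feature combination makes one neuron come to represent it:
pattern = (rng.random(n_inputs) < 0.05).astype(float)
print([present(pattern) for _ in range(10)])     # settles onto a single winner for this pattern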

There are also connections between neurons within an area. These connections also aid in model building. Through the same process of Hebb-type learning, neurons within an area strengthen their connections with neurons that fire with them - neurons that represent similar features. Thus all neurons in a module respond to similar features, but each responds to a slightly different feature than its neighbors. The outputs of these neurons are in turn competed for by neurons in higher areas of complexity space. Long-range horizontal cortical connections extend the distance in feature space across which features can be correlated.

Neurons with similar selectivity cluster together to compete for the set of features being delivered to their approximate position, producing maps, areas, columns and blobs. Competition between neighboring neurons is implemented by lateral inhibition, and occurs when the stimulus has high contrast (high signal/noise ratio) or equivalently when the area is 'spotlighted' (depolarized) by attention (see later). When the stimulus is low contrast the neurons fire at low rates and little lateral inhibition occurs. The input is averaged across more neurons and little learning occurs.
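
A toy illustration of this mode switch, assuming a crude subtractive form of lateral inhibition and a fixed threshold between 'low' and 'high' contrast (all gains and thresholds are invented for the example):

import numpy as np

def area_response(match, contrast):
    """Responses of a row of neighboring neurons to one stimulus.

    Low contrast: weak drive, little lateral inhibition, so responses are pooled
    (averaged) across neighbors. High contrast (or attentional depolarization):
    strong drive recruits lateral inhibition and the best-matched neuron wins."""
    drive = contrast * match                                      # feedforward drive scales with contrast
    if contrast < 0.5:                                            # averaging mode
        return np.convolve(drive, np.ones(3) / 3, mode="same")    # input smeared across neighbors
    inhibition = 0.8 * drive.mean()                               # competitive mode: strong lateral inhibition
    return np.maximum(drive - inhibition, 0.0)                    # only the best-matched neurons stay active

match = np.array([0.2, 0.5, 1.0, 0.5, 0.2])    # how well each neuron's preferred feature matches the stimulus
print(area_response(match, contrast=0.2))      # broad, weak, averaged responses; little learning
print(area_response(match, contrast=1.0))      # sharp, sparse responses; strong competition and learning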

In this way the dynamics of the cortical circuit construct high-dimensional feature spaces which are mapped onto the 2 spatial dimensions of cortex. The most important feature dimensions (the ones that contain most of the variance of the data) dictate the basic organization (eg orientation maps in V1), with lesser features integrated into the map in an increasingly distorted fashion (a less contiguous mapping). The resulting feature maps are a central component of cortical functioning.

In higher cortical areas, as the signals move higher up in complexity space, more information is extracted and firing rates decrease. At this point, with activity abstracted across sufficiently large distances in (space and) time, time stops being merely its own representation and is used as a third representational dimension. Instead of oscillating to signal spatial correlations, the time differences in neuronal firing induced by the input patterns begin to become important. Higher areas begin to model the relationships between high level concepts using temporal intervals. Hence Abeles finds strict time relationships between the firing of specific neurons in higher areas. In addition, because the space/time distances being spanned by very high levels are large, their temporal patterns of activity become predictions. They become plans for the future: A prediction/plan is simply a spatio-temporal sequence of concepts - if the first concept is activated (by a stimulus in the real world) then all the following concepts are activated in order, to produce a prediction of what is going to happen next. The plan is actually implemented by layered feedforward projections to the motor cortex. At each stage, the number of dimensions of the information is reduced until in primary motor cortex the firing of neurons once again represents the physical space/time dimensions. Again, since the resolution of the spatial information required is higher than the temporal, synchronized oscillations are used to signal spatial correlations at the expense of timing.
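
One way to picture a high-level spatio-temporal pattern acting as a prediction/plan is as a stored sequence of concepts that, once its first element is activated, plays out the rest in order. A minimal sketch (the concept names, delays and dictionary storage are purely illustrative):

# A learned high-level spatio-temporal pattern stored as concept -> (next concept, delay).
# Activating the first concept replays the whole sequence as a timed prediction of what comes next.
learned_sequence = {
    "doorbell rings": ("walk to door", 3.0),   # delays in arbitrary time units
    "walk to door":   ("open door", 2.0),
    "open door":      ("greet visitor", 0.5),
}

def predict(first_concept, sequence=learned_sequence):
    """Replay the stored sequence from its first element."""
    t, concept = 0.0, first_concept
    timeline = [(t, concept)]
    while concept in sequence:
        concept, delay = sequence[concept]
        t += delay
        timeline.append((t, concept))
    return timeline

for t, concept in predict("doorbell rings"):
    print(f"t = {t:3.1f}   expect: {concept}")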

The representations in higher motor areas are complex motor sequences known as motor set. They are simply predictions/plans in the motor modality. In prefrontal cortex these representations become plans for the individual as a whole. Goldman-Rakic has characterized the activation of prefrontal neurons (in a motor or oculomotor task with spatial and/or temporal delays) as occurring only from 'a representation or concept'. In her view, prefrontal cortex does memory-guided performance - responses are driven by internal representations as opposed to 'associative processes', sensory guidance, or reflexes. I interpret this as saying that the representations in prefrontal cortex are more abstracted (across space and time) than in any other cortical area. The highly abstracted representations are ultimately derived from sensory representations or reflexes; prefrontal cortex implements the same process as other cortical areas, but because the input is more abstracted the representations are called plans and predictions.

In higher cortical areas the feature spaces being mapped have a more objective structure, and the representations will be sparser within that space. This is a consequence of the fact that the dynamics of higher cortical modules directly reflect the dynamics of the real world subsystem being modeled - more complex (abstracted) systems have more rigidly defined components and relationships between them. This structure aids in recognition (cf the viewpoint consistency constraint in vision) and in inference of unknown conceptual structure: Higher cortical areas such as prefrontal cortex evolved to make plans and predictions in the 4 physical space/time dimensions, but this can happen in any constrained feature space (eg the conceptual space of 'thought'). The rigid but sparse structure of higher order feature spaces allows the induction of new patterns by activating a group of nearby patterns (nearby in physical space on the cortex, therefore nearby in that feature space). Because of the rigid/sparse structure and the fact that the dynamics of the existing patterns map the dynamics of the real world system being modeled, the new induced pattern has a good chance of corresponding to a real concept in the world. This is the basis of creativity. Creativity is a process of discovery in the information space of the Universe.

The following tables summarize the equivalence of a number of mental functions, explaining them as the processes of pattern formation, matching and induction, operating at different time scales.

Level       'Processing'               Learning                Rate   Predicting

sensory cx  recog. spatial pattern     (data) new pattern      slow   hallucination
assoc. cx   recog./regen. s/t pattern  'fit' new s/t pattern   fast   induction


            Input                     Result
V1/M1       data (world)              perceptions/reflexes    detail
premotor    sensory guidance          motor sequences
prefrontal  internal representation   plans                   context


                Highest representation
motor cx        planning, motor set/sequence        Action
sensory cx      complex objects                     Perception
prefrontal cx   prediction of future/inference      Thought


In sensory cortices (eg visual cortex) primarily spatial firing patterns are stored as synaptic weight changes and later examples are recognised in the process of perception. Prediction of new patterns (not previously encountered or currently present) does not occur in visual cortex (except pathologically as hallucinations) because at this low level pattern structures are constrained by data.

At higher cortical levels precise temporal coding is used to produce complex spatio-temporal patterns. Thus association cortex recognises and regenerates new s/t patterns during perception and recall. New input patterns 'fit' with existing patterns because at this level the cortical model begins to have some objective structure, some constraints. This is the process of understanding. As described previously, learned high order spatio-temporal patterns are predictions: When the pattern is activated it goes through its sequence, predicting which features will occur (and when they will occur). This applies in all modalities, though it is most intuitive when considering the motor modality. Prediction of novel patterns (induction) can happen in higher cortical areas that map highly constrained, sparse feature spaces.


Learning, contrast, attention

Learning is repeated experience, modulated by attention and weighted by emotional relevance. The structure of the feature maps constructed in each cortical area is determined by the relative frequency of activation of each feature. In lower areas the features are so basic (eg oriented/moving edges) that they all have the same relative frequency of occurrence, thus primary sensory feature maps do not change very quickly. The features and concepts represented in higher areas change more rapidly, and are selected based on behavioral relevance. Thus the feature maps of higher cortical areas shift more quickly than those of lower areas. This has been demonstrated in a number of experiments - repeated sensory stimulation causes a change in the cortical feature map; more neurons are assigned to represent the stimulus, its representation on the cortical surface expands. This change occurs in adults, and occurs faster in higher cortical areas. Thus, we can conceive of the cortex as a continually shifting model of the world, a map of reality constantly morphing over the surface of the brain on a scale of mm over the lifetime of an individual and cm over the lifetime of a species.
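
A toy illustration of this kind of map expansion, with each neuron reduced to a single 'preferred orientation' and repeated stimulation dragging nearby preferences toward the stimulus (the sizes, rates and neighborhood are invented):

import numpy as np

rng = np.random.default_rng(3)
preferred = rng.uniform(0.0, 180.0, size=200)    # preferred orientation of 200 neurons in one map

def stimulate(preferred, stimulus, n_reps=500, lr=0.02, neighborhood=15.0):
    """Repeated stimulation drags the preferences of nearby-tuned neurons toward
    the stimulus, so more neurons end up representing it (map expansion)."""
    preferred = preferred.copy()
    for _ in range(n_reps):
        close = np.abs(preferred - stimulus) < neighborhood         # the neurons the stimulus activates
        preferred[close] += lr * (stimulus - preferred[close])      # Hebbian drift of their tuning
    return preferred

stimulus = 90.0
before = np.sum(np.abs(preferred - stimulus) < 5.0)
after = np.sum(np.abs(stimulate(preferred, stimulus) - stimulus) < 5.0)
print(f"neurons tuned within 5 deg of the repeated stimulus: before {before}, after {after}")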

Hebb-type correlational learning is the basic cortical process: The more two neurons fire together the stronger the synapses between them become. Thus repeated presentations of the same stimulus will lead to the formation of a representation of that stimulus. Slight inter-trial differences will be incorporated into the representation to produce stereotypes - memory efficient representations of a generic stimulus. Attention acts to speed up the rate of memory formation by 'priming' particular cortical columns (basically a process of depolarization), increasing their firing rate to the expected stimulus. Higher firing rate = faster learning.
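
A sketch of how repeated, slightly varying presentations could average into a stereotype, with attention modeled simply as a multiplier on the effective learning rate (higher firing rate = faster learning); all numbers are invented:

import numpy as np

rng = np.random.default_rng(1)
generic_stimulus = rng.random(50)            # the 'true' generic stimulus in the world
                                             # (each element is the strength of one feature)

def train(n_trials, attention):
    """Repeated noisy presentations build a stereotype; attention (priming/depolarization,
    hence higher firing rate) simply scales the effective rate of synaptic change."""
    representation = np.zeros(50)            # synaptic weights of the representing neurons
    lr = 0.05 * attention
    for _ in range(n_trials):
        example = generic_stimulus + 0.2 * rng.standard_normal(50)   # slight inter-trial differences
        representation += lr * (example - representation)            # Hebbian drift toward the stereotype
    return np.abs(representation - generic_stimulus).mean()          # distance from the generic stimulus

print("error after 10 unattended trials:", round(train(10, attention=1.0), 3))
print("error after 10 attended trials:  ", round(train(10, attention=3.0), 3))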

Higher cortical areas abstract correlations over large space-time distances. This means they form connections representing relationships between lower order features, connections that are not present in the lower areas. I propose that these correlations are 'sent back down' the cortex - this is a matter of creating connections between the existing less complex features in the level below to reflect the relationships abstracted at the higher level. This is a process of redundancy reduction - the information is implicitly present in the lower areas, it just needs to be explicitly represented as connections between the lower order features.

I believe this occurs to some degree during waking but mostly during sleep. The cortex operates in a way analogous to a neural network called a 'Helmholtz machine'. This network has multiple layers of units, each sending feedforward projections to the layer above and in turn receiving feedback connections from that layer. In training the network receives input patterns and changes the synaptic weights of the feedback connections. In 'sleep' the input connections are turned off and random activation of the feedback connections is used to change the weights of the feedforward synapses. The sleep process can be viewed as a form of redundancy reduction. I propose cortical networks operate according to the same principle - the 'learning' during random feedback activation in sleep acts to reduce information redundancy such that higher level relationships between features/concepts (essentially connections between neurons) are reduced to relationships between the component features at lower levels. In this way the information defining high level concepts actually moves back down cortex towards lower level sensory areas. Since the lower levels change more slowly, it takes longer to move information to them, but once there it lasts for a long time (consolidation of long term memory). Thus in an adult, concepts/features at many levels of complexity are stored at many cortical levels. This explains why LTP can be quickly induced in lower cortices - sometimes we discover that one of our (literally) deepest and longest held assumptions is wrong.
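
A heavily stripped-down sketch of the wake-sleep idea for a single pair of layers with binary stochastic units. It follows the general scheme of the Helmholtz machine, but the sizes, learning rate and omission of biases are simplifications for illustration:

import numpy as np

rng = np.random.default_rng(2)
sigmoid = lambda x: 1.0 / (1.0 + np.exp(-x))
sample = lambda p: (rng.random(p.shape) < p).astype(float)

n_low, n_high, lr = 8, 3, 0.1
R = np.zeros((n_high, n_low))     # recognition (feedforward) weights, lower -> higher layer
G = np.zeros((n_low, n_high))     # generative (feedback) weights, higher -> lower layer

data = sample(np.full((20, n_low), 0.3))      # toy 'sensory' input patterns

for epoch in range(50):
    for v in data:                            # --- wake: input on, train the feedback weights ---
        h = sample(sigmoid(R @ v))            # feedforward pass drives the higher layer
        G += lr * np.outer(v - sigmoid(G @ h), h)    # feedback learns to reconstruct the input
    for _ in range(len(data)):                # --- sleep: input off, train the feedforward weights ---
        h = sample(np.full(n_high, 0.5))      # random activation of the higher layer
        v = sample(sigmoid(G @ h))            # feedback connections generate a 'fantasy'
        R += lr * np.outer(h - sigmoid(R @ v), v)    # feedforward learns to recover the fantasy's cause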

Recent experiments in behaving rats show that the spatio-temporal firing patterns (as seen between two cells) learned by hippocampal neurons during task performance repeatedly reoccur during REM sleep. I propose that this is an example of the process described above.

The cortical architecture is significantly different from that of the Helmholtz machine. Perhaps the biggest difference is that at each level, instead of a single layer of units that receives both top down and bottom up input, each cortical level is made up of columns of cells. Each column shows characteristic changes from top to bottom: The feedback input enters at the top (mostly layer 1). Layer 2 cells receive most of their input from this feedback projection. The upper layers of the column contain cells that are connected at a density of about 10%, synaptic weights are generally small and inhibition (at least the slow GABAb inhibition) is relatively strong. Cells in the upper layers fire trains of spikes that adapt (decrease in firing frequency). As we move down the column to layer 5, the density of connectivity decreases to about 1%, synaptic strengths become larger, inhibition decreases and the cells fire in discrete bursts rather than continuous spike trains. The feedforward input to the column goes mostly to the upper layers, with small amounts to the lower layers. Overall, the cortical column seems to be performing some type of annealing function (cell output functions change from sigmoids to steps), merging the feedback with the feedforward input in the upper layers to create discrete outputs in the sparsely (but strongly) connected output layer (layer 5).

Another important point is that many (~30%) cortical projection neurons (pyramidal cells) do not project out of their local area (into the white matter). Thus these cells are part of the local representation but do not signal any features since they do not project to the other levels. Perhaps they 'stabilize' the patterns in some way.

The highest cortical level in many mammals is the hippocampus, since prefrontal cortex is only extensively developed in higher mammals such as primates. The hippocampus receives input from all of the sensory cortices via the sensory association cortices - a highly abstracted 'state of reality'. A general idea of what is going on. The number of cortical layers (and hence the resolution of representation) decreases from 6 to 5 to 3 as information is abstracted from primary sensory areas to the hippocampus, where a single layer of cells suffices to provide the synapses necessary to instantiate the highest level conceptual representation that lower animals have. The hippocampus is presently seen as a 'long-term memory consolidator', but I consider it as the site of the highest level 'thoughts' that lower animals have. 'Place cells', cells in the hippocampus that fire when the animal is in a particular place in its environment, are literally that: Although we can't read the code of synaptic weights that determine the higher level features the individual cells are coding for, we can read the lower level representation of the population as a whole. The cells are saying 'I am here'. In general, as postulated by Francis Crick, the hippocampus is responsible for the creation of episodic memories: Abstracted representations of the state of the whole organism are generated in the hippocampus. These representations are constantly changing over hours or even minutes, as the state of the organism in its environment changes. The strongest, most repeated representations are fed back into lower cortical areas as episodic memories, as described above. In this way continually repeated experiences and situations are stored as long term memories. When a strong emotional activation occurs, the current representation in the hippocampus is rapidly strengthened (through strong links to the limbic system). In this way particularly meaningful situations are stored as episodic memories. People without a hippocampus (eg patient HM) lose the ability to form these highly abstracted representations, thus they lose the ability to form new episodic long term memories.

The firing of neurons in the hippocampus constitutes the 'thoughts' of lower animals - their higher mental state. Consciousness. In humans there are many more areas that contribute to consciousness, and different areas provide different aspects of consciousness. The impression we have of a unitary conscious mind is an illusion. There is no one place in your brain where everything comes together to produce 'the mind'. It is not possible to destroy consciousness by removing any one area of cortex. Thus, for example, the posterior parietal cortex is our conscious representation of our immediate surroundings - our spatial map of the environment. Activity of a set of neurons in this part of a subject's brain signifies the existence of an entity in the real world with a particular positional relationship to the subject. Obviously this representation is heavily used by the motor system to plan movements. As discussed above, agranular prefrontal cortex contains representations that are plans of action for the individual. Activity in this cortex is our conscious representation of the current plan that we are following. These two cortical areas (posterior parietal (PP) and agranular prefrontal (pre.f.)) project reciprocally and topographically to each other and to over 15 other high level cortical areas, including higher visual, premotor, parahippocampal, and limbic cortices. The resulting interconnected loop is literally the loop of consciousness. If any one element is removed then that aspect of consciousness is removed while the others are preserved: Lesions to the right PP cause loss of perception of personal space on the left side. Patients can even deny that their left arm belongs to them. They don't perceive or interact with anything to the left of their midline. People with prefrontal lesions lose the ability to plan over significant distances in space/time. And so on.

An aside on qualia.

What about qualia? What is subjective experience? A subjective experience is the meaning of a concept or feature to the subject. I am claiming that subjective experiences are the firing of neurons as experienced by the subject. The meaning of a particular neuronal firing is specified by the invariance that caused it to fire (eg the color red) and the subsequent activations that this neuronal firing causes 'higher up'. The meaning of the firing of any neuron in the brain is defined by its inputs and outputs - what it is responding to and what it causes to happen. There is no intrinsic meaning.

So consciousness is the firing of neurons. If a particular cortical region is strongly activated the subject is more conscious of those events. Which neurons? Every spike in the brain doesn't contribute to consciousness, so which ones do? The answer is that consciousness, even though its objective instantiation is neuronal spiking, is not a binary phenomenon. We are more or less conscious of the firings of particular neurons depending on the meaning of the firings to us, as individuals. Any learned neuronal spatio-temporal firing pattern is determined partly by bottom-up data and partly by top-down selection. The degree to which we are conscious of particular neuronal firings is determined by how much the representation was formed by top-down selection, as opposed to bottom-up data. Firing patterns completely determined by data (eg in V1) are not meaningful to the individual; they have not been selected. However, firing patterns in higher cortical areas are directly selected (by attention) - they have the most meaning to the individual. The individual is most conscious of their activity.


Feedforward and feedback

Projections 'up' cortex are feedforward. They send information from the world to higher levels to be further abstracted. The projections arise from layers 3 and/or 5 of the lower area and terminate in the middle layers of the higher area. Feedback projections go in the opposite direction, arising from lower layers and terminating in outer layers (1+6). What about projections between the highest levels? Here the assignment of feedforward vs. feedback must be made based on the role of each area in relation to its inputs and outputs. The projections from PP and pre.f. cortices interdigitate as separate columns in the cingulate and parahippocampal cortices. Perhaps these are feedforward projections carrying two alternative representations of the global state of the organism (cf ocular dominance columns in V1). In contrast PP and pre.f. cortices send completely overlapping projections to the opercular and STS cortices. Perhaps these are feedback connections. But this issue is just starting to be addressed and is far from clear.

What is clear is the role of feedback projections from higher cortical areas through high level sensory cortex to primary sensory cortex. These connections instantiate attention/contrast enhancement, carrying a prediction signal from higher areas and imposing a bias on perception in the lower areas. The difference comes from the greater temporal abstraction of the representations in higher areas - the activity in lower sensory cortices just represents what is happening now. In general each cortical layer projects back to the layer it receives input from. If a high level neuron is active, it activates the area of feature space that usually activates it - it is making a prediction about what features are/will be present in the real world. At least in sensory cortex, feedback connections are generally diffuse - they enter cortical layer 1 and synapse over a wide area. This is because the higher level neuron only activates the general area of feature space that it receives its inputs from - the prediction is not completely precise. This feedback information is received by widespread apical dendrites of neurons in cortical layers 2/3 and 5. Apical dendritic synapses are far from the cell body. The apical inputs are summed up and 'fired' down the apical dendrite by active conductances. Thus the prediction is imprecise in time as well as in space. In contrast, synapses within and into layers are made precisely, close to the cell body. The cell can tell which particular features caused it to fire. Thus feedforward information (from the real world) is more precise than feedback (the brain's internal prediction). For any active region of cortical physical/feature space, precise input from the real world is superimposed on a background of prediction. The repeated presentation of the same input pattern will change the synaptic structure to reflect its component features. Note that any area receiving input without concurrent feedback will have a hard time activating cells and being learned. The input will have to be strong and repeated. This means that the brain has a hard time accepting things it does not believe in/predict.

As described above, the representations in the primary sensory cortices are formed largely independent of behavioral selection; instead they are determined by the statistics of the real world input. Thus, at this stage the feedback that is attention in higher areas must be described in terms of properties of the input. The neurally equivalent process to attention in lower cortex is contrast enhancement, increasing the firing of maximally active cells and decreasing the firing of surrounding cells. If a V1 cell is firing at a high rate, it is being stimulated by its best stimulus. It is also providing the most information about that stimulus, because cells only fire maximally to high contrast stimuli. A high contrast stimulus is one in which the signal/noise ratio is high: A black edge next to a white edge is high contrast while a dark grey edge next to a light grey edge is low contrast. Increasing stimulus contrast increases cell firing rate in a feature-independent manner - it just multiplies the cell's tuning curve. From a consideration of signal detection theory, we can see that this acts to lower the threshold of discriminability along the dimensions that the neuron is signalling. Higher contrast leads to higher precision - better discriminability among features and concepts. Thus, in order to maximise the rate of information signalled about the stimulus, cortical circuitry increases the contrast that this neuron receives - because the neuron's firing is caused by the presence of its optimal features, attention at this level reduces to further enhancing its firing relative to other neurons at the same level. This is done by lateral inhibition. Lateral inhibition operates within a cortical column, but it is the connections between cortical levels that I will focus on now. Feedback connections depolarize the neurons in one area of feature space and hyperpolarize neighboring regions, shrinking receptive fields, increasing contrast. This process continues all the way to the LGN, where it operates with the least precision (over slow time scales and large areas) - the feedback of layer 6 cells in V1 excites (with a long time constant - metabotropic receptors) lines of LGN cells (aligned with the layer 6 RFs?) and inhibits surrounding LGN cells via more diffuse projections to the inhibitory reticular nucleus of the thalamus. The diameters of the layer 6 axons (and therefore their conduction velocities) vary widely so that the contrast enhancement is temporally blurred, to match its broad spatial extent. This fits with the only known operation of LGN RFs - contrast enhancement. At low cortical levels, then, because neurons close in feature space are close in physical space and because their firing is dependent on the most informative features in the input (specified genetically, not behaviorally selected), feedback connections instantiate attentional processes. Attention and contrast enhancement are the same neural process, basically lateral inhibition.
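
To make the signal detection point concrete: if contrast simply multiplies a neuron's tuning curve while response noise stays roughly constant, the separation (d') between the responses to two nearby features grows with contrast. A toy calculation, assuming Gaussian tuning and additive noise (assumptions of the sketch, not claims about real V1 cells):

import numpy as np

def mean_rate(orientation, contrast, preferred=0.0, width=20.0, max_rate=50.0):
    """Gaussian orientation tuning; contrast acts as a purely multiplicative gain."""
    return contrast * max_rate * np.exp(-0.5 * ((orientation - preferred) / width) ** 2)

noise_sd = 5.0    # trial-to-trial rate variability (spikes/s), assumed roughly contrast-independent

def d_prime(stim_a, stim_b, contrast):
    """Discriminability of two nearby features from this one neuron's firing rate."""
    return abs(mean_rate(stim_a, contrast) - mean_rate(stim_b, contrast)) / noise_sd

for contrast in (0.2, 0.5, 1.0):
    print(f"contrast {contrast:.1f}: d' for 0 vs 10 degrees = {d_prime(0.0, 10.0, contrast):.2f}")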

This process is coordinated by the inferior pulvinar, part of the thalamus. It reciprocally projects to most visual cortical areas. Van Essen's group has recently proposed a model based on the shifter circuit hypothesis that assigns a similar role to the pulvinar. Theirs is a model of attention in which higher cortical areas 'zoom in' on features of interest, creating a higher resolution 'window' of the target. The pulvinar acts to dynamically change visual cortical synaptic strengths to 'route' information from a selected area of the visual image up to IT. One prediction of their model is that the feature space (therefore 2D map) of IT should change on the time scale of attention (seconds). Instead I propose that the pulvinar simply coordinates the contrast enhancement of a region in feature space in which the neurons are already firing strongly. Since the pulvinar projections are excitatory, one obvious mechanism is lateral inhibition as produced by layer 6 cells in the LGN (discussed above). In essence, I am claiming that the pulvinar assists in changing the dynamics of visual cortex, whereas they claim that the pulvinar changes the synaptic weight structure. Some top-down attentional effects may filter down to visual cortex (the lowest level affected is controversial) through cortical feedback connections and also through a parietal projection to the pulvinar, but there is a definite data-constrained bottom-up information stream that merges into a behaviorally constrained top-down information stream.

The Van Essen model predicts that lesions of the pulvinar should affect visual pattern recognition abilities (besides attention), which has been shown not to be the case in a number of studies. Instead my theory predicts that pulvinar lesions should significantly slow the rate of learning of new patterns, and impair performance on 'higher' tasks that depend on the use of (complex) visual patterns, because the pulvinar maximizes visual information flow to all higher areas.


Attention and the basal ganglia

If the contrast-enhancing network of lower cortices coordinated by the inferior pulvinar corresponds to the posterior attentional system of Posner, what corresponds to the anterior system? The anterior attentional system selectively implements behavior patterns based on the current context. Simply put, if a behavior performed by the cortex benefits (or will benefit) the animal (stimulates the reward center), this system selects that behavior and its consequent use strengthens the cortical synapses that result in the behavior.

The ultimate determination of behavioral relevance is made by the basal ganglia (BG), since they integrate total cortical output and then feed back to impose the correct 'plan of action/perception' on the frontal cortex. The circuitry of the BG is structured to preserve the topography of its frontal cortical inputs and reactivate (via the topographically organized thalamus) the same area that it receives input from - a (switchable) positive feedback loop. The striatum of the BG appears to be a lateral inhibitory network designed for filtering out all but one input pattern, a function that is compromised in Huntington's chorea, a disease characterized by the nonselective expression of numerous motor actions and thoughts. A projection from the dopaminergic midbrain perhaps provides a lower brain 'go' signal. This is compromised in Parkinson's disease, characterized by an inability to initiate (and terminate?) thoughts and movements. Mark Laubach (an active researcher in the BG system) views the "activation of a spiny cell [in the striatum of the BG] as indicating a coincident activation of a collection of cortical cells distributed in different locations in cortex, and activations at different levels of the striatum as representing unique info occurring in relation to a common "event" (e.g., a behavior, thought)". He thinks that the BG 'set the occasion for behavior', determining the behavioral context in which any one action is implemented - "Neurons in BG only respond to things motor or sensory _in the context of some task_ (in the context of some structure)". I interpret this as selecting an appropriate state from the many that are sent from cortex and imposing this state on top-level cortex (as a motor plan or new thought process, with its attendant perceptual state). This is the process of selective attention. How is this done?

It is done using the same principles and circuitry that implement distractive attention, which I shall describe first. When attention is fixed, when we concentrate, top-down predictions are sent via cortical feedback connections to 'prime' (essentially depolarize) lower areas of cortex, as described above. This makes it harder, but not impossible, to see something we don't expect. If activity 'breaks through' in a different, unpredicted cortical area, it is detected and attention is reassigned to the distracting stimulus. This happens by a process of mismatch detection in the striatum. The striatum of the BG is divided into functional compartments called patch and matrix. Quoting Laubach again: "Different receptor systems in the two compartments; maybe different regulation of a common neural circumstance (in the sense of a coordinated pattern of firing)". Functionally related cortical areas project to the same/overlapping regions in the striatum. I propose that the BG learns the mapping of activity of different cortical areas (although not what is being mapped) - eg if one particular area in cortex is active, which other functionally related areas will be active. Also, the 'higher' cortical areas project to the patches and the lower cortical areas to the matrix. Thus, if a higher cortical area is activated, it sends feedback to a particular part of the lower area that it receives input from, predicting/assigning activity there as discussed above, and a copy of this activity to a patch in the striatum. If, however, strong activity occurs in a different part of the lower area due to an unexpected stimulus, a mismatch occurs in the striatum and is detected perhaps by the cholinergic cells that seem to be the only neuronal elements that span across patch/matrix boundaries:

                           happening
Cortex            V2 cx --------------> V3 cx
                    |                     |
                    | happening           | should be happening
                    |                     |
BG                  |      mismatch       |
Striatum          matrix <------------> patch

As long as no mismatch is detected, the striatal input coming from nonfrontal cortex does not affect the output of the BG. The detection of a mismatch results in the activation of a pattern more ventral in the striatum (perhaps mediated by the excitatory interneurons in the striatum), a stereotyped pattern that produces a modality-specific 'orienting response' to the distractor when it activates appropriate parts of the frontal cortex. There is no input to the BG from primary visual cortex. I think this is because the visual topography of V1 is so detailed (small RFs) that a sufficiently accurate top-down prediction of future activity is impossible, hence no effective mismatch operation on the output of V1 can be done. Higher feature spaces are more independent of the physical location of stimuli and so do not have this problem.
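
A schematic sketch of the proposed mismatch operation, with cortical 'regions of activity' reduced to labels; the patch/matrix bookkeeping below is only an illustration of the idea, not a circuit model:

def striatal_mismatch(patch_input, matrix_input):
    """Compare the higher area's prediction (copied to a patch) with what the lower
    area actually reports (copied to the surrounding matrix). Agreement leaves the
    BG output unchanged; disagreement triggers an orienting response."""
    if patch_input == matrix_input:
        return None                                   # prediction confirmed: keep the current plan
    return f"orient to unexpected activity: {matrix_input}"

prediction_from_higher_area = "left visual field, slow movement"    # what should be happening
actual_lower_area_activity = "right visual field, fast movement"    # what is happening

print(striatal_mismatch(prediction_from_higher_area, prediction_from_higher_area))  # None: no distraction
print(striatal_mismatch(prediction_from_higher_area, actual_lower_area_activity))   # orienting response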

Emotion and motivation

Now I will describe how selective attention, what we call free will, is implemented using the same circuitry. First I need to explain where the selection signal comes from. The selection signal is affect - goodness or badness of various behaviors. Emotional response. The circuitry of emotion has been well described by Damasio: Emotions are generated by the activation of lower brain structures such as the hypothalamus, aminergic brain stem nuclei and the amygdala. For instance, the amygdala sends projections to all the brainstem centers that directly cause fear behavior (sweating, increased heart rate etc) as well as to the anterior cingulate cortex (cing.). The activity patterns of these neurons are learned in the cing., just as any input pattern to cortex. However, rather than being interpreted as a visual or auditory stimulus, this activity, perhaps because it is correlated with feedback signals to other cortical areas that map the sensory representations of sweating and increased heart rate etc (the cing. receives input from these cortical areas too), is interpreted as the feeling of fear. The firing of these neurons is your subjective experience of fear. This and the other basic emotions and feelings (anger, hunger etc), called primary emotions by Damasio, are all correlated with behavior and perceptual state in the cing., which is a node in the 'loop of consciousness' described above - it receives the highest level (most abstracted) information from other cortical areas. The anterior cingulate cortex projects to higher order limbic cortex (orbitofrontal, prelimbic). This prelimbic cortex also receives input from other prefrontal cortices, cortices representing the plans and overall behavior patterns of the whole organism. The resultant correlation of the emotional input with the top-level plans in the prelimbic cortex produces the feature space that we call personality. The features and concepts in this feature space are called secondary emotions by Damasio. Higher order correlations combine many emotions and behaviors in a way that is not always constrained to satisfy the original drives.

How does the system operate? In two ways.

1) Short term: Because we are physiological beings, we have basic motivations that demand satisfaction (eating, drinking, sleeping, sex etc). Each behavior can either satisfy or not satisfy one of these drives. When ongoing behavior/perceptual state enters the cing. as input, the correlated emotional response is elicited through direct projections to the appropriate lower brain structures. If the response is 'good' (net positive affect) then the lower brain structures send a signal to the substantia nigra pars compacta (SNc) via the raphe nuclei. The dopaminergic input from the SNc to the striatum determines whether the current behavior is let through the lateral inhibitory network of the striatum or whether a different pattern must be selected.

2) Long term: Emotions are produced in response to behaviors that impact currently active drives. Due to the abstracted representations correlated with complex behaviors found in the prelimbic cortex, emotions can also be elicited from a prediction of the effect of a particular behavior on active drives: Each cortical plan/prediction of behavior (from prefrontal cortex) enters prelimbic cortex. Prelimbic cortex does not project to premotor areas - there is no implementation of personal goals this way. Instead, the possible plans that enter prelimbic cortex from neighboring prefrontal cortex are instantiated as spatio/temporal patterns that produce output from prelimbic networks - the emotional reaction to each plan, or perhaps a modification of the plan based on previous emotional responses to the same/similar plans. The output of the prefrontal cortex is what is going to happen, the output of the prelimbic cortex is what should happen (as defined by the goals of the organism). Both areas project to overlapping regions in the ventral striatum. In the ventral striatum patches are innervated by prelimbic cortex but avoided by other prefrontal cortices. Instead these cortices (planning/premotor) project to the matrix. Thus, as outlined above in the discussion of distractive attention, the ventral striatum detects mismatches between prelimbic and other prefrontal output. Mismatches between what will happen and what should happen. If a mismatch occurs then the current behavior/plan is inhibited and a new one is selected. This is done by inhibiting the currently active striatal cells and activating others, perhaps nearby, such that a different region of prefrontal feature space will be depolarized to activate a new plan.

 
                will happen           secondary emotions             primary emotions
CX      Pref cx --------------> Orbf <----------------------> cingulate cx
           |                      |                                 |
           | will happen          | should happen                   |
           |                      |                                 |
BG         |      mismatch        |                                 |
Str     matrix <-------------> patch              hypothal., aminergic, amygd.

Perhaps these two alternative pathways correspond to the two ways we have of making decisions - emotionally reacting to events or 'rationally' (as Damasio has noted this pathway is not independent of emotion and motivation) following a plan:

			       want            new plan
Higher reasoning: prelimbic cx -----> striatum -------> pref cx
		        ^				 |
		        |		will happen	 |
		        ----------------------------------

New plans are produced by depolarization of a prefrontal cortical area to activate a (possibly related) plan that might accomplish the goal.

Emotional response: lower brain E+M ---> Raphe ---> DA ----> striatum
		       ^					|
		       |					|
              cingulate cx <-- pref cx <-- allow same or select different plan

New behaviors are produced by the action of dopaminergic neurons on the striatal cells responding to cortical inputs - inhibition of currently active striatal cells.

Details of basal ganglia functioning.

There is a lot more to the circuitry of the BG than the striatum and substantia nigra. Here is a brief overview of how some of the rest of the circuitry fits in: The prefrontal cortical area that is most strongly active (current plan/behavior) activates inhibitory striatal neurons that inhibit their neighbors. The active neurons inhibit inhibitory cells in the globus pallidus (internal segment (GPi), also substantia nigra pars reticulata (SNr) neurons). This inhibition causes disinhibition of neurons in the ventral thalamus. Due to the topographic nature of all these projections, the thalamic neurons further activate the already activated prefrontal area (this process is subject to selection, see above). An additional projection from the striatum goes to the external segment of the globus pallidus (GPe), known as the indirect pathway. The inhibitory neurons of the GPe inhibit the subthalamic nucleus (STN). The STN sends excitatory projections to both the GPe and GPi, and also receives excitatory input from the motor cortex. This arrangement, as discussed by Berns and Sejnowski, creates a situation in which a single pattern of activity 'gets through', inhibiting specific GPi neurons that then disinhibit specific ventral thalamic cells. Subsequent inhibition of GPe neurons through the indirect pathway causes disinhibition of STN cells, which leads to depolarization of GP neurons due to activity of the STN cells (which are also depolarized by motor cortical cells implementing the current plan). This depolarization of GP neurons 'shuts the gate' and prevents disinhibition of any more thalamic cells. Damage to these thalamic cells does not prevent movement, rather it impairs the conscious control of the subject over the choice of movement. From the above, it would be expected that this damage would render the subject insensitive to behavioral relevance when selecting movements.
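
The gating logic just described can be caricatured in a few lines, treating each nucleus as a vector of channel activities and each projection as a signed influence. The numbers are invented and the single-winner striatum is a shortcut; this is an illustration of the idea, not the Berns and Sejnowski model itself:

import numpy as np

cortical_plans = np.array([0.3, 0.9, 0.5])   # activity of three competing frontal plans (channels)
gpi_baseline = 1.0                           # GPi/SNr cells are tonically active and inhibit thalamus

# Striatum: lateral inhibition lets only the strongest cortical pattern through.
striatum = np.where(cortical_plans == cortical_plans.max(), cortical_plans, 0.0)

# Direct pathway: active striatal cells focally inhibit their own GPi channel.
gpi = gpi_baseline - striatum

# Indirect pathway: striatum -| GPe -| STN -> GPi (plus cortical drive to STN).
# Its net effect here is a diffuse excitation of GPi that 'shuts the gate' on the other channels.
stn_drive = 0.4 + 0.3 * striatum.sum()
gpi = np.clip(gpi + stn_drive, 0.0, None)

# Thalamus is disinhibited only where GPi activity has dropped below baseline,
# and it re-excites that same cortical channel (the switchable positive feedback loop).
thalamic_feedback = np.clip(gpi_baseline - gpi, 0.0, None)
print("selected channel:", int(np.argmax(thalamic_feedback)), "| thalamic feedback:", thalamic_feedback)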


Putting it all together: Cognition in action

A man walks into a room, goes toward a table in the center of the room, picks up a book from the table and begins to read. He reads something that excites him, so much so that he drops the book and leaves the room.

What is happening in his brain during this episode?

As the man enters the room, a specific sequence of neuronal firings is repeating itself in his prefrontal cortex. This spatio-temporal firing pattern is the plan he is currently following; read the book on the table, because he has been told it contains information that he can use to his advantage. His BG is acting as a maintaining feedback loop; the firing from the active area in prefrontal cortex excites a specific region in the matrix of the striatum which matches corresponding activity in the associated striatal patch, and inhibits surrounding areas. Because the activity represents a plan that is predicting a positive outcome, it passes through the rest of the BG and is fed back to its originating prefrontal area through the thalamus. His mind is focused on getting and reading the book.

He has been in the room many times before, so as he moves towards the table, top-down activity from frontal cortex predicts the visual environment by depolarizing specific areas of visual cortex that will be activated by bottom-up input. Similarly, specific areas of somatosensory and auditory cortex are top-down 'primed' so that when they are activated by the bottom-up input resulting from the feel of his body and clothes as he walks, and the sounds of same, no mismatch will be generated in his BG to distract him away from his task. In response to the same top-down activation his motor cortex is activating patterns that produce the walking behavior, directed towards the table. His attention and consequently his eyes are focused on the table, thus he is 'highly' conscious only of the table and what is on it. The contents of highest consciousness are the most active neurons in the 'loop of consciousness' described above - the pattern in prefrontal cortex corresponding to the current plan, the pattern in anterior cingulate cortex corresponding to the emotional reaction to the present situation, the pattern in higher visual cortex (A19) corresponding to the visual image of the table, the pattern in the right posterior parietal cortex corresponding to the spatial organization of the room, particularly the table and what is arranged on it. I say the right hemisphere because this hemisphere is more concerned with subjective, global properties, while the left hemisphere is concerned with objective details. Thus, the left PP contains the same high-level spatial representations as the right, they are just not integrated into a whole. The visual association cortex at the occipital/temporal border of the right hemisphere responds to global features of images, the corresponding cortex of the left hemisphere responds to details in the image. The left hemisphere contains areas specialized for the understanding and output of language, probably because these processes involve detailed symbolic mappings, while the right hemisphere is responsible for the affective component (feelings) in language.

On reaching the table, the culmination of the current plan is the sequence of motor acts of picking up the book, opening it and beginning to read. Perhaps a new pref. pattern is now activated, corresponding to 'reading'. During reading, attention, as well as the eyes, is focused on words on the page. Attention can be considered as the process of getting as much of the most important data where it needs to go as quickly as possible. It works by priming (depolarizing) particular cortical areas. This depolarization acts in the same way as increasing the contrast of the bottom-up input - it decreases the firing latency of cortical cells, particularly the ones that are most activated by the input, the ones that match the features of the input best. Contrast enhancement/attentional priming also shifts cortical circuitry from an averaging mode, where cells responsive to similar features excite each other to overcome low signal/noise ratios in the input, to a competitive mode where the most active neuronal firing pattern inhibits neighboring cells that represent similar features. This process decreases the time needed to resolve the particular features present in the input, a type of winner-take-all operation opposite in sign to the operation occurring in the striatum of the BG but utilising the same neuronal principle; basically lateral inhibition.
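
The claim that priming acts like a contrast increase can be illustrated with a leaky-integrator caricature of a cortical cell: a steady extra depolarization (priming) shortens the time to threshold just as a stronger input does. All constants are arbitrary:

def time_to_first_spike(input_drive, priming=0.0, threshold=1.0, leak=0.05, dt=1.0, t_max=200.0):
    """Leaky integrator: attentional priming is modeled as a steady extra depolarization."""
    v, t = 0.0, 0.0
    while t < t_max:
        v += dt * (input_drive + priming - leak * v)   # integrate input against a leak
        t += dt
        if v >= threshold:
            return t                                    # latency to threshold (arbitrary time units)
    return None                                         # never reaches threshold

print("weak input, unprimed:          ", time_to_first_spike(0.08))                  # slow
print("weak input, primed (attended): ", time_to_first_spike(0.08, priming=0.04))    # faster
print("high-contrast input, unprimed: ", time_to_first_spike(0.16))                  # fast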

The process of reading is the multi-level process of pattern matching described above. Elementary features (letters and words) activate specific neuronal elements in visual cortex, tuned to respond to precisely these stimuli. The correlated activity of different groups of these neurons in turn activate cells at higher levels, cells that represent the concepts that the groups of letters and words stand for. The process of understanding is neurally the same process as perception - matching input patterns to already existing patterns in each cortical area. At higher levels, this process involves the frequent use of recall, because at higher levels much of the information is implicit, not explicitly contained in the text (in contrast the text explicitly contains all the information needed for pattern recognition at the lower levels - the shapes of each letter and word). Recall can be considered as more precise top-down attentional activation. Perhaps feedback connections at higher cortical levels make more precise connections onto their lower level targets. We know that, at least at lower levels, feedback connections are not as precise as the feedforward inputs simply because recalled (eg visual) images are less precise than real visual images. At higher levels, each concept activated by a set of words in the text can cause the activation of many lower level features and combinations of features in lower level cortices, in a number of different modalities. The extent to which this recall process operates is constrained by top-down activity, ultimately from prefrontal cortex: If no match can be found at the conceptual level to the patterns coming up from below (the person doesn't understand that bit of the text), the activity in prefrontal cortex changes (perhaps directly as a result of feedforward input, perhaps due to a mismatch detection in the BG between top-down predictive activity and bottom-up input patterns that cannot be recognized (understood)). The new active pref. pattern causes more extensive recall to occur; many more related concepts and features are activated in an attempt to activate a set of patterns that will match the input patterns - the person thinks about the text, trying to understand it.

Each concept activated in a particular cortical area is a spatio-temporal firing sequence in a set of neurons in that area. The attentional process facilitates the activity of this set of neurons and depresses the activity of immediately neighboring cells. If the subject thinks about more than one concept at the same level, more than one local zone of facilitated (depolarized) activity must be maintained within the same cortical area. There is a limit to the number of concepts that can be maintained simultaneously at the same (conceptual) level. That limit is 7±2. People can only keep about 7 items/concepts 'in mind' (STM) at once. I propose that this limit is the maximum number of discrete depolarized zones that can be maintained within a single cortical area. The limit can only be increased by 'chunking' the items - grouping them together as higher order concepts or sequences. Obviously this chunking procedure corresponds to forming a new concept from lower-order features - moving the representation up to a higher cortical level. This higher level will have the same restriction; only 7 higher-order chunked concepts can be kept in mind at once.
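The following toy example only illustrates the bookkeeping of the proposed limit: nine digits exceed a capacity of about seven, but re-represented as six higher-order chunks they fit. The capacity constant and the particular chunks are arbitrary choices, not claims about cortex.

    # A toy illustration of the proposed capacity limit and of chunking.
    CAPACITY = 7   # roughly '7 plus or minus 2' discrete depolarized zones

    def fits_in_mind(items):
        return len(items) <= CAPACITY

    digits = list("149162536")      # nine digits: too many to hold at once
    print(fits_in_mind(digits))     # False

    # Chunking: re-represent the digits as six higher-order items
    # (the squares 1, 4, 9, 16, 25, 36), moving the load up a level.
    chunks = ["1", "4", "9", "16", "25", "36"]
    print(fits_in_mind(chunks))     # True: six chunks fit within the limit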

As the man continues to read, a host of previously existing concepts is activated by the concepts contained in the text. Some of the concepts in the text are new. If they relate in some way to the preexisting concepts in his brain - if they have features in common - they can be incorporated into his existing conceptual schema as new spatio-temporal firing patterns. If they do not relate in any way to existing concepts, they cannot be understood. In general, the best way to learn is to experience new things that are different, but not too different, from what you know already. Repeated practice with progressively increasing difficulty (novelty) produces the fastest learning in all modalities. Beyond the learning of new patterns, it is possible to induce new patterns, as described above. This induction is a process of pattern discovery in the relevant (objectively existing) information space. New pattern induction occurs when a number of similar (and therefore close in feature space) patterns are simultaneously activated. In the example considered here, some concepts in the text activate preexisting concepts while some are instantiated as new patterns of activity in close proximity - the man has learned something new. Suddenly, all these active patterns cause the induction of a new pattern somewhere in his higher association cortex. This pattern represents a thought the man has never had before and a concept that was not contained in the text he is reading - the man has a flash of insight. The firing of the new pattern, perhaps together with the activation of a number of other patterns, causes a mismatch with the top-down predictive activity, which is detected in the BG. The current 'reading' plan active in the prefrontal cortex is suppressed as it passes through the striatum, and control passes to a new plan, one involving the activation of as many concepts related to the newly induced concept as possible - the man is thinking through the implications of his recent insight. Many sequences of related concepts are activated in a series of chains or nets. Soon a number of important concepts are activated, concepts that represent money, or power, or sex, and so on - the man sees how to exploit his idea for personal gain.
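One way to picture induction, under entirely made-up assumptions about how patterns are encoded: treat each co-active concept as a binary feature vector, and let a new unit strengthen its connections, Hebbian-style, to whatever the co-active patterns share. The feature vectors and the 0.8 threshold below are arbitrary and serve only to show the shape of the idea.

    # A rough sketch of induction: a new unit comes to represent what several
    # co-active, similar patterns have in common.
    import numpy as np

    co_active = np.array([          # three similar concepts, active together
        [1, 1, 1, 0, 0, 1],
        [1, 1, 0, 1, 0, 1],
        [1, 1, 0, 0, 1, 1],
    ], dtype=float)

    # Hebbian-style: connections grow toward features that are consistently
    # active across the co-active patterns.
    shared = co_active.mean(axis=0)
    induced_pattern = (shared > 0.8).astype(int)
    print(induced_pattern)          # [1 1 0 0 0 1] - the structure common to
                                    # all three inputs, identical to none of them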

The use of the word 'see' in this context is common. Just as we see in physical 4-dimensional space/time with our visual cortex, activating feature and object representations as we scan the scene, so the cortex can see with its 'mind's eye' in the higher-order information (feature) spaces that are mapped out in higher level association cortices. Existing concepts can be activated by abstracted sensory input or by the process of recall. Because existing concepts are patterns with connections to many other patterns (representing the defining relationships of the concepts), this process is fast and sure, just like seeing - we can call up memories (especially well-rehearsed ones) at will, activating them in sequences to replay 'movies' in our heads of particular situations. If we try to think of situations that we haven't experienced - trying to formulate plans or predictions for the future, for example - the process is more often described as 'feeling': having an intuition or gut feeling about a situation or hypothesis. This is because the relevant area of feature space is not thoroughly 'mapped out' as memory representations. When we are recalling and thinking we see in the light of past experience and reason. When we are planning we feel into the darkness of an unknown future.

The activation of behaviorally significant concepts such as money and sex causes the firing of well-established patterns in limbic cortex. These patterns cause direct activation of subcortical centers. One such center contains the basal forebrain cholinergic neurons, which project diffusely throughout the cortex. They release acetylcholine, which blocks potassium channels in cortical neurons. This results in depolarization and a loss of adaptation (adaptation is a decrease in firing frequency over time). This effect alone causes an increase in learning: learning is an increase in synaptic strength due to repeated activation, so if each cell maintains its firing over a longer period (because adaptation is lost), increases in synaptic strength are facilitated. In this way, when behaviorally significant concepts and features are activated, learning is 'turned up' in cortex to make sure that the situation is memorized, whether the outcome is good or bad. In addition, limbic cortex projections to subcortical 'emotion centers' such as the amygdala and hypothalamus produce activity in these areas. As described above, these subcortical areas project via the raphe nuclei into the BG, where they directly influence the selection of behavior.
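The argument that losing adaptation alone boosts learning can be made concrete with a cartoon calculation. The firing-rate model, the adaptation constant and the Hebbian rule below are all arbitrary stand-ins; the point is only that sustained firing accumulates more synaptic change than firing that adapts away.

    # A cartoon of the proposed effect of acetylcholine on learning: with
    # adaptation blocked, firing is sustained, so a Hebbian synapse
    # accumulates more total change over the same period.
    import numpy as np

    def firing(steps=50, adaptation=0.05):
        """Firing rate of a cell under constant drive, with a simple
        accumulating adaptation term pushing the rate back down."""
        rate, adapt, rates = 1.0, 0.0, []
        for _ in range(steps):
            adapt += adaptation * rate      # adaptation builds up with firing
            rate = max(1.0 - adapt, 0.0)
            rates.append(rate)
        return np.array(rates)

    def hebbian_change(post_rates, pre_rate=1.0, lr=0.01):
        """Total strengthening: learning rate * sum of pre * post over time."""
        return lr * np.sum(pre_rate * post_rates)

    print(hebbian_change(firing(adaptation=0.05)))  # adapting cell: ~0.18
    print(hebbian_change(firing(adaptation=0.0)))   # adaptation blocked: 0.50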

In this way - perhaps also through the activation of patterns in orbitofrontal/prelimbic cortex, which can influence future behavior through their projections to the BG as outlined above - the realization that the man can satisfy one of his prime drives leads to the suppression of the currently active pattern in prefrontal cortex and the activation of a different plan, likely after some 'searching' to find a plan that predicts the fastest realization of the man's new goal. This new plan causes him to drop the book and leave the room.


Copyright Paul Bush 1996. All rights reserved.


Any comments or questions? paul@phy.ucsf.edu

Previous questions and answers