Jeff Sale
"Lines Composed A Few Miles Above the Center for Really Neat Research..."

Heroic Behavior in Nature:

The Characterization and Quantification of Behavior in the Exploration of a Novel Space

Jeff Sale




This paper discusses the nature of self-organized critical states [1] in procedural task performance: 

  • They are a basic component of the process of learning a procedural task
  • They have a physiological or biochemical basis in at least two neural systems:
    • diffusion neurotransmitters involved in learning and memory such as nitric oxide
    • neuromelanin in melanosomes of bioamine neurotransmitter-based neurons such as the pigmented midbrain dopaminergic neurons found in the pars compacta region of the substantia nigra
  • They may be characterized with electrophysiological measurements of bioelectric signals such as those from the scalp (electroencephalography, or EEG), from muscle activity (electromyography, EMG), and from the eyes (electrooculography, EOG)
  • Their existence suggests a basic ability and even tendency for establishing a foundation of non-examples upon which to build examples of the correctly performed task.
  • The way in which the critical exponent varies during transition through the critical state may help distinguish one type of learning process from another, such as self-directed learning vs. situated learning such as that found in traditional medical schools.


Examples of Critical Phases in the Acquisition of Learning Heuristics during the Exploration of a Novel Space

The data below appear random:

However, when plotted as an ordered histogram, we get:

Which, when plotted on a log-log coordinate system, gives us:

How do we interpret the slopes of these "lines"? Look at the following similar data:

Computer Usage Data from Risley Hall Learning Center, April 10-14, 1995

Event Size represents the number of minutes a user was in a particular application, such as word processing or email. The image below shows the same data as above, plotted on a log-log graph.

Yet, still more similar data...


Data collected during a typical session playing Super Tetris:


When plotted as an ordered histogram:

The behaviors represented by these data have one thing in common: they represent complex adaptive systems in an ongoing, ever-evolving process, either as individuals or in the process of making meaningful associations, relatively unencumbered by any particular burden of demands on performance beyond that which they place on themselves. They are essentially exploring a novel space of possibilities, never experienced before.



Procedural Task Performance

Conceptually, a procedural task [2] consists of five basic components:

  • Developing a mental representation of the correctly performed task, or the target state,

  • Developing a mental representation of the current state of one's world (i.e., the self and the surrounding environment),

  • Developing a strategy to reduce and possibly eliminate the difference between these two states,

  • Making some basic and often crude attempts at performing the task by implementing the strategy, and

  • Recognizing, remembering, and reinforcing the re-creation of states which "show promise" as contributors to a successful strategy.

Procedural tasks are those whose successful performance depends on learning a particular sequence of actions, such as learning to master a video game, tie your shoe, drive a car, play an instrument, or develop a good golf swing. The nature of these tasks is such that an expert at a particular task often finds it difficult to put into words just how to perform the task so much better than a novice.

Procedural tasks involve more implicit, non-cognitive, habit-forming memory (basal ganglia and midbrain regions). Tasks that probably have very few procedural components are the more purely creative activities such as poetry, daydreaming, imagining, or recalling an experience. These are usually referred to as declarative memory tasks, partly because you can put them into words or "declare" what they are about. Often their observable manifestation is almost entirely verbal. They involve more explicit memory (hippocampus and limbic systems) [2].

Most procedural tasks are actually a combination of both these memory systems, but are predominantly procedural or habitual in nature, by definition. As an example, consider the act of verbalizing a memory, such as reciting a poem. This act consists of a declarative component, i.e., the information (and the process of its recall) which represents the poem memories in the brain, and it also consists of a procedural component, i.e., the motor control involved in speaking. It may be that the process of learning the technical details of a language is mostly procedural, though one's eloquence with a language is more declarative. In general, there is extensive overlap and coupling between the neurons which make up these two systems, and it may be impossible to identify tasks that depend exclusively on one or the other. We usually must be satisfied with finding tasks that are mostly one or the other.

The Hero's Journey

It is the Hero's Journey we are interested in, i.e. the relentless endeavor to explore a novel space, a new world, to cross No Man's Land, the Badlands, the ZPD, and not only to explore but also to endeavor to return and share the experience as a teacher. The hero emerges from the great sea of mediocrity, so we are all part of the heroic deed in some way. But it is that rare event, the emergence of the Hero, which signals a breakthrough of the boundary, the reaching of a new plateau, even if only for a moment. This sea of mediocrity can be described as a system of weakly-coupled simple harmonic oscillators which become entrained and can resonate their part of the system to a ‘far-from-equilibrium’ state leading to the emergence of novel behaviors, i.e. heroics.

That's what we do when we are learning: we are breaking boundaries. Sometimes it feels as though we're out there suspended in midair with no idea what we're doing, groping for some sign that this behavior or that behavior will help us towards our goal. This is the place I call the badlands (the erosion patterns of badlands are VERY fractal). It's that part in the learning of a task where you're just starting to show signs of getting it, the occasional high score or pat on the back, but you're still only just starting to get it, and most of your performances are considerably less than perfect. We pass through this zone on our way to mastery, and it is very unpredictable, awkward, not easy. But then, "if it were easy, everyone would be doing it!" (Tom Hanks in "A League of Their Own"). This is the hero's territory. We are all potential heroes; all we need to do is learn.

While observing my daughter simply running across the grass, I am struck by the way in which she is consistently "on the edge" between running and falling, barely able to maintain balance as she pushes her limits. It appears as though her frontal lobe cognitive gating mechanisms are loosely controlled, but her pattern recognition and reinforcement systems are very tightly controlled. It is something like opening the floodgates or easing up on the brakes of a fully revved dragster and slinging her whole being into an apparently unpredictable, minimally stable state, an "almost out-of-control" state in which almost anything can happen. Yet, as part of the process, she tries with all her wits to observe and evaluate and develop prioritized responses, one of which she then initiates and ultimately reinforces the re-creation of, depending on the desirability of the results. In other words, she really tries to avoid a ‘face-plant’, and she usually succeeds. Kids are smart about stuff like that.

The Model in Gaming

Many video games have increasing levels of difficulty. When a player masters one level, they move up to the next. When they first attempt a new level, performance scores are almost always low (the subcritical state). As they continue their efforts they make the occasional breakthrough with a large score. These rare, unpredictable large scores (accompanied by somewhat more frequent intermediate scores, though most scores are still low) indicate they have made the transition to the critical state. They have reached the "percolation threshold", a term used in complexity theory to refer to the point at which water in a coffee pot first percolates to the top of the coffee grounds. Depending on the task's degree of difficulty, they might "hover" in this state for some time (see Fig. 5), or ‘plateau’, at which point the player loses interest or the game is completed.
It is important to accurately characterize the degree of complexity in the task(s) to be performed. If the tasks are not complex enough, players may transition through a level without exhibiting significant variation in performance. An increased level of complexity can make a game more interesting and more challenging to play, thus lending itself to a sample size large enough for acquiring data on the transition state.

Figure 5.

Real data acquired (top) and random data simulated (bottom)


Figure 6.

Ordered histograms of real data acquired (top) and random data simulated (bottom)


Figure 7.

Various states of criticality corresponding to different levels of difficulty during performance of a video game, in this case Super Tetris.


Performance scores in this state appear to occur over a wide range, from large to small to in-between. They exhibit not random behavior (Fig. 5) but rather inverse power law behavior, in which scores occur with a frequency inversely proportional to a power of their size; i.e., small scores occur a large number of times, large scores occur a small number of times, and in-between scores occur at an in-between rate. This is the scale-independent state so often referred to as one of the hallmarks of chaos, perhaps the same as that referred to as 1/f or "flicker" noise. Random behavior is not like this (Fig. 6). Additionally, this should not be confused with the power law of practice [3,4], which refers to a power distribution of improved performance as a function of time. The critical behavior dealt with in this paper may be considered to be "buried" within data exhibiting a power distribution as a function of time, but, to the author's knowledge, it has always been "averaged out" over time and so has gone relatively unnoticed and unquestioned until now.
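
The contrast between inverse power law behavior and random behavior can be sketched numerically. What follows is only a minimal illustration, not the analysis behind the figures: it assumes that an "ordered histogram" means the scores sorted from largest to smallest and plotted against their rank, and it substitutes synthetic Pareto-distributed numbers (with an arbitrarily chosen tail exponent, alpha = 1.5) for real game scores. All function and variable names are hypothetical.

```python
import math
import random


def ordered_histogram(values):
    """Sort event sizes from largest to smallest (rank 1 = largest)."""
    return sorted(values, reverse=True)


def loglog_fit(sorted_values):
    """Least-squares line through (log rank, log size).

    Returns the slope and the R^2 of the fit.
    """
    xs = [math.log(rank) for rank in range(1, len(sorted_values) + 1)]
    ys = [math.log(v) for v in sorted_values]
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    sxx = sum((x - mx) ** 2 for x in xs)
    syy = sum((y - my) ** 2 for y in ys)
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    return sxy / sxx, (sxy * sxy) / (sxx * syy)


rng = random.Random(0)
alpha = 1.5  # assumed tail exponent for the synthetic "critical" scores

# Synthetic stand-ins for real data (assumptions, not measurements).
power_law_scores = [rng.paretovariate(alpha) for _ in range(2000)]
random_scores = [1.0 - rng.random() for _ in range(2000)]  # uniform on (0, 1]

slope_pl, r2_pl = loglog_fit(ordered_histogram(power_law_scores))
slope_rand, r2_rand = loglog_fit(ordered_histogram(random_scores))

print(f"power law: slope={slope_pl:.2f}, R^2={r2_pl:.2f}")
print(f"random:    slope={slope_rand:.2f}, R^2={r2_rand:.2f}")
```

Under these assumptions the power-law scores should fall close to a straight line on log-log axes, with a slope near -1/alpha, while the uniformly random scores bend away from any single line and fit noticeably worse. The slope of the line is one candidate for the "critical exponent" whose variation is of interest here.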

Figure 8.

Conceptual diagram of inverse power law behavior during a procedural task. Data is sorted in an ordered histogram and plotted on a log-log scale.


Once the level has been mastered, all scores are large. This state may be in some ways analogous to the supercritical state, but only to a limited extent. The differences are discussed below.

Essential to the process is the nature of the error inherent in each performance. We know from experience that trying to reduce the difference between two distinct neural states involves some error. Sometimes we err beyond our abilities, sometimes too well within them. The critical state serves a useful purpose in providing performance scores to "choose from" over a wide range, some possibly too great or intense, others not intense enough, and hopefully others which are just right. This behavior might serve a useful purpose by allowing our pattern recognition systems to selectively and repeatedly recognize and reinforce performance on a particularly "desirable" scale, thus "acquiring the target". It is the error beyond our abilities which is both our greatest asset and our most delicate vulnerability. While we are not constrained to "creeping up" on a solution, neither are we constrained from grossly "overshooting" our target. We use this freedom to our advantage by recognizing and reinforcing its occurrence, but risk its occurrence in association with circumstances of great stress, potentially resulting in aberrant behavior.

Conflict Resolution and the Zone Of Proximal Development

Jerome Bruner has done pioneering work on cognitive conflict as it pertains to learning and the acquisition of knowledge. He argues that learning takes place when conflicts are resolved between what he believes to be the three main modes of representation: kinesthetic, iconic, and symbolic [5]. He has been strongly influenced by Vygotsky's idea of the Zone of Proximal Development (ZPD). This refers to the area of interaction between two people involved together in a learning process, particularly the interaction between a parent and child where the parent facilitates the child's learning experience in a particular structured way [6]. This structure consists of the parent first identifying the child's current abilities, then creating a task believed to be just within the child's range of abilities, at least in a few tries, and finally demonstrating the correct performance of the task. The first and second items may be thought of as the lower and upper regions of the ZPD. The child then proceeds to make an attempt, with varying, but hopefully measurable, degrees of success depending on how successful the parent was at identifying the extreme lower and upper limits of the ZPD (i.e., the child's current abilities and the actual task, respectively). The ZPD is a give-and-take process, with constant iterative learning and assessment manifesting in both parent and child.

L. S. Vygotsky and A. R. Luria [6,7] did pioneering research in conflict definition, recognition, and resolution. Luria puts things very eloquently regarding the nature of force and control in human behavior:

"Many observations support our view that the consideration of the voluntary act as accomplished by "will-power" is a myth and that the human cannot by direct force control his behavior any more than "a shadow can carry stones". The development of the voluntary processes comes about as a result of the elaboration of the various forms of behavior, the mobilization of the Quasi-Bedurfnisse to achieve his ends. Voluntary behavior is the ability to create stimuli and to subordinate them; or in other words, to bring into being stimuli of a special order, directed to the organization of behavior.

"Our research convinces us that such a control comes from without, and that in the first stages of the control the human creates certain external stimuli, which produce within him definite forms of motor behavior. The primordial voluntary mechanism evidently consists in the external setting, the production of cultural stimuli mobilizing and directing the natural forces of behavior. This external auto-stimulation is substituted by an internal one; and the "spontaneous" establishment of the complicated Quasi-Bedurfnisse seen in the adult are a result of the profound cultural reconstruction of the activity depending on the cortical apparatus, without which we could not understand the complex psychological functions."

- A. R. Luria, 1932

I suggest that it is this 'stimuli of special order', the ZPD, that is analogous to the critical state, the edge of chaos, or at least analogous to an increase in the variance of events in the critical state. Often the task is so simple that the child may master it after only a few trials. Measurable changes in the critical exponent might exhibit trends which correspond to different learning processes, if the task is difficult enough and worth the effort, such as differentiating between a self-directed learning experience and a medical student's experience in a traditional medical school. Intuitively, these appear to be two radically different learning states, with only the rare exception being the individual with masochistic tendencies who actually enjoys the punishment delivered in these wonderful institutions!

Underlying Neurophysiological Processes

There are considered to be many memory systems basic and essential to learning and task performance. Our interest is in two contrasting memory systems, which we'll call the cognitive and the habitual memory systems. They are extensively coupled and practically overlap in some areas (amygdala mediodorsal/basolateral, striatal patch/matrix, nucleus accumbens core/shell, periallocortical layers underlying neocortex). It is generally accepted that two coupled systems are necessary and sufficient conditions for chaotic behavior.

There are numerous examples of chaotic behavior in living systems. Until recently, there has been little convincing work on the existence of chaotic behavior in mammals. In the summer of 1994, researchers published evidence of in vitro controllable chaotic behavior [8,9,10] in both the CA1 and CA3 regions of rat hippocampal tissue [11]. They increased their control parameter, extracellular potassium, within physiological limits, which caused the neuronal firing rate to transition into chaos; the firing was subsequently controlled and confined to a fixed-point attractor using a computer feedback mechanism. In vivo intracellular recordings show that during task performance, the midbrain dopamine neurons of the substantia nigra pars compacta exhibit varied activity, often with a random appearance, including spontaneous bursting and beating [12]. Simplified versions of these neurons were also the basis of an impressive nonlinear dynamical model of schizophrenia [13] and later Parkinson's disease [14].
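
The general idea of holding a chaotic system at a fixed point with small feedback adjustments can be illustrated with a toy model. The sketch below is an analogy only and is not the method of the hippocampal experiments: it assumes the logistic map x -> r x (1 - x) as a stand-in system, treats r as the control parameter, and applies a small linear correction to r whenever the state wanders close to the unstable fixed point. The parameter values (r = 3.9, a maximum adjustment of 0.1) and all names are assumptions chosen for illustration.

```python
def logistic(x, r):
    """One step of the logistic map."""
    return r * x * (1.0 - x)


def run_logistic(r0=3.9, x0=0.3, steps=20000, control=True, max_dr=0.1):
    """Iterate the map, optionally nudging r to pin x at its fixed point.

    The fixed point x* = 1 - 1/r0 is unstable for r0 = 3.9. Linearizing
    the map around x* gives the adjustment dr that cancels the first-order
    deviation on the next step. The adjustment is applied only when it is
    small (|dr| <= max_dr); otherwise the map runs freely.
    """
    x_star = 1.0 - 1.0 / r0
    gain = (r0 - 2.0) / (x_star * (1.0 - x_star))
    x = x0
    trajectory = []
    for _ in range(steps):
        dr = gain * (x - x_star) if control else 0.0
        if abs(dr) > max_dr:
            dr = 0.0  # too far from the fixed point: wait for a closer pass
        x = logistic(x, r0 + dr)
        trajectory.append(x)
    return x_star, trajectory


x_star, controlled = run_logistic(control=True)
_, free = run_logistic(control=False)

print("fixed point:", x_star)
print("controlled, final value:", controlled[-1])
print("uncontrolled, last five values:", free[-5:])
```

The design choice worth noting is that the controller does nothing until the free-running chaotic orbit happens to pass near the fixed point; only then do small parameter changes suffice to hold it there.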

Diffusion-based neurotransmitter processes are ubiquitous in these two main memory systems, especially activity-dependent presynaptic facilitation of the nitric oxide (NO) system and related glutamatergic (NMDA) systems [15]. The dynamics of diffusion processes in simple physical systems frequently exhibit fractal-like structures[16]. The spatial amplification of encoded information through the process of NO diffusion is extended over time. It is perhaps involved in the process of defining subjectively-perceived time, both for the moment and the relative time of more long-term memories which are ultimately stored elsewhere.

In normal healthy individuals, there is a tremendous amount of intracellular neuromelanin in the midbrain dopaminergic region of the substantia nigra, which was there at birth in all of us, even albinos (noted for their lack of skin pigment melanin). Besides being from the same family as our skin pigment melanin, neuromelanin is an elusively complex biomolecule which has many interesting properties, such as being an amorphous semiconductor (??), paramagnetic (??), both a free radical scavenger and synthesizer (??), and of course photosensitive like its skin counterpart, melanin (??). This molecule may be a crucial component in maintaining a robust state of key information encoded in the pigmented regions. Its amorphous structure, paramagnetic properties, and the electrophysiological behavior of the neurons it derives from suggest a role in the more diffuse non-specific nature of critical behavioral states.

In a visuomotor task such as playing Super Tetris, the non-cognitive habit-forming system is highly active and predominates during actual performance of very short-term processes (visually attending and manually pressing a key). During non-performance (at rest) the cognitive system predominates by considering more long-term values and options. However, both systems appear to be active and operating at all times in some significant way. Their interactions with each other are also ongoing and almost always very stable. Only under certain circumstances do we observe clearly the limits of this stability. These circumstances are familiar to us all, I hope, such as when we push ourselves beyond our abilities for prolonged periods in an effort to perform a task successfully. During the struggle, we often find ourselves in an odd state where the errors made on each attempt differ from the previous ones in an unpredictable way. For some strange reason, we will often struggle when it appears hopeless, and when we succeed we are ecstatic.

It is this vague abstract concept of ecstasy, activated in the limbic/cognitive memory system during the task performance, that attracts us to a state beyond our abilities, and in so doing introduces great instability in the striatal/habit memory system. This instability may be manifested in part in the activity of the midbrain dopamine systems. This information represents neither a good nor a bad thing, but more so just an expression of the current state, or perhaps more specifically both the difference between the current state and the target states as well as the information representing the target state operator's modifications. In other words, the dopaminergic neurons of the SNc may encode for the requisite reinforcement of the immediately preceding response, depending on how much closer it brought us to the target state, and may also encode for the initiation and nature of the next response. They appear to be a key component of an adaptive mechanism tailored for repetitive task performance. Malfunction in this area should result in symptoms involving at least these two processes. We do observe this in Parkinson's Disease [??].

How this information is encoded is beyond our knowledge at present. However, the nature of the encoding process as it is observed physiologically, biochemically, hydrodynamically, and so on, might offer clues as to the nature of the information it represents. The intermittent random-like beating and bursting of these neurons [??] during the learning process suggest functioning at the "edge of chaos", and performance itself reflects this.

Do the Subcritical and the Supercritical States Really Exist?

Under certain performance conditions of sufficient motivation and ability, rigid control of the current state "loosens" and gives way to a state in which the system's behavior manifests over a wide but bounded range of possibilities. The normally strong coupling of the two memory systems is weakened, perhaps due to a phase shift involving a slight but widespread lag of the habit system behind the cognitive system. It is suggested that overall performance and possibly physiological or biochemical subsystem performance exhibits 1/f "flicker noise" behavior. At this point, the system has entered "uncharted" territory, the "badlands", in the sense that there is either no reference or a severely limited one for predicting consequences of actions. This is the critical state, dancing on the edge of chaos. The current state's previous rigid control over its inherent instability is reduced, if not eliminated. With proper motivation, the recognition of a successful performance inspires the reinforcement of associations between the various subsystems active and accelerated at the time of improved performance. After sufficient reinforcement, control is regained, but at a new level of complexity.

This new state of consistently successful task performance appears to be analogous to the supercritical state in the sense that all behavior consists of one type, successful performance, with respect to a particular task. However, this state is distinctly different from the sand pile model in two ways. First, we are paradoxically in both a supercritical state for level 8 and a subcritical state for level 9, suggesting criticality is a relative concept. Second, in physical systems such as sand piles the supercritical state is transient, e.g., a large avalanche sends the sand pile back to subcritical status, and the system must build itself up again. We would require a great amount of energy (how much really?) in order to maintain the sand pile in a supercritical state, especially while adding new sand. Controlling such an unstable complex system would not likely be "cost-effective".
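
For readers unfamiliar with the sand pile model, its behavior can be sketched in a few lines. This is a minimal version of the model usually attributed to Bak, Tang, and Wiesenfeld, under assumptions chosen here for illustration: a small square grid, a toppling threshold of four grains, grains dropped at random sites, and grains lost at the edges. The function name and parameters are hypothetical.

```python
import random


def drop_grains(n_grains, size=8, threshold=4, seed=1):
    """Drop grains one at a time and record the size of each avalanche.

    A site holding `threshold` or more grains topples, sending one grain
    to each of its four neighbors. Grains pushed past the edge are lost.
    Avalanche size is counted as the number of topplings per dropped grain.
    """
    rng = random.Random(seed)
    grid = [[0] * size for _ in range(size)]
    avalanche_sizes = []
    for _ in range(n_grains):
        r, c = rng.randrange(size), rng.randrange(size)
        grid[r][c] += 1
        topples = 0
        unstable = [(r, c)]
        while unstable:
            i, j = unstable.pop()
            if grid[i][j] < threshold:
                continue  # already relaxed by an earlier toppling
            grid[i][j] -= threshold
            topples += 1
            if grid[i][j] >= threshold:
                unstable.append((i, j))
            for di, dj in ((1, 0), (-1, 0), (0, 1), (0, -1)):
                ni, nj = i + di, j + dj
                if 0 <= ni < size and 0 <= nj < size:
                    grid[ni][nj] += 1
                    if grid[ni][nj] >= threshold:
                        unstable.append((ni, nj))
        avalanche_sizes.append(topples)
    return grid, avalanche_sizes


grid, sizes = drop_grains(2000)
print("largest avalanche:", max(sizes))
print("grains that caused no toppling:", sizes.count(0))
```

Most dropped grains cause little or no toppling, while an occasional grain sets off a large avalanche that lowers the pile, which is the transience of the supercritical state described above. Sorting `sizes` from largest to smallest and plotting them on log-log axes is one way to look for power law behavior in such a simulation.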

Before we accept this reasoning, it should be emphasized that we are not just physical systems, and that evidence of inefficiency abounds in both physical and biological systems. Consider, for example, the Sun's energy output: so small a part of it is used by life here on Earth as to seem insignificant, yet we know better than to call it insignificant. Our lives last only briefly on a cosmic time scale. Rather, it may be that the neural systems involved must change by extending their complexity so that a supercritical state of one level of complexity becomes a "basic" component of a more complex subcritical state of the system. It's as though a new level is created in a hierarchy of complexity separated by transitions through criticality. The hierarchy and the critical points are arbitrary in the sense that they depend on the nature of the task, which is arbitrary. It may be useful to imagine a very small sand pile which reinforces the grains to maintain supercriticality, and which gradually evolves into many such sand piles forming their own larger sand pile which may also be reinforced to become the basic component of an even larger sand pile, and so on.

It appears that if such a hierarchy exists, access to the information encoded at the various levels is confined primarily to the highest level. In other words, we cannot easily move down the hierarchy by unlearning a procedural skill. It is not easy to intentionally forget many and perhaps most memories, especially long-term memories.

This critical state may be so basic to the development of life that it may encompass much of the measurable behavior in newborns. Upon closer observation, it is possible that their flailings exhibit many distinct critical transitions in spatiotemporal complexity.

An additional observation about children and learning is that they seem to perform numerous "non-examples" (unsuccessful attempts) during the learning process. This model suggests that learning is significantly dependent on the creation of non-examples, in a sense being required to fulfill a certain minimum number of unsuccessful attempts before learning the right way can even begin, or perhaps it's just to "stir things up" neurodynamically. The subcritical state is simply the same kind of non-example over and over again, whereas the supercritical state is the same kind of example over and over. How remarkable that we appear to have a mechanism (the underlying processes manifesting in the critical state) which provides a wide range of quasi-examples and quasi-non-examples to choose from and reinforce. In actively performing a non-example, we may be establishing the stability necessary on which to place the culminating, ever-evolving example.

A variation on these ideas might be more plausible in terms of energy cost/benefit ratios. It may be that all states of learning are critical but with varying ranges of possible states, such that what were referred to above as the subcritical and supercritical states might actually be critical states in which performance manifests over a smaller range, and what appeared to be the real critical state may simply be an expansion of the range of performance by some driving force or motivational component. This avoids the issue of supercriticality being energy inefficient.

Experimental Paradigm For Self-Organized Criticality In Procedural Task Performance

In Evoked Potential Research, the oddball paradigm is basic and well known [??]. Certainly the field has gone far beyond this simple paradigm, but this simplicity serves to make a point. This paradigm involves informing the subject of a particular task and requiring them to perform it as well as they can. The task in its simplest form involves recognizing and counting auditory tones of a particular frequency that are embedded within many more tones of a different frequency. The subject goes through a trial or practice session, and then proceeds to perform this task many times. The brief EEG epochs are recorded during the oddball stimuli, and then averaged, leaving a "typical" EEG representation of the state of recognition of the oddball tone.

The subject may often be given a trial period in which to learn the task, and the data from these trials are usually thrown out because of their ambiguous, random appearance, with no apparent correlation with the subject's performance. I suggest that this is the critical state transition in which the subject is learning to master the task. In this state we should observe inverse power law distributions in correlations between these critical trials and the "typical" evoked potential of the mastered trials. In the case of the simplest oddball paradigm, the task is so simple that the critical state occurs only briefly, perhaps immeasurably. However, if the task is much more difficult, like an auditory analog to the Wisconsin Card Sorting Test (a standard cognitive assessment test) [??], we might be able to "tease out" in great detail just how we traverse this critical state and how the percolation threshold changes with variations in the task components (Fig. 9).
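
The proposed measure, the correlation between each learning trial and the averaged evoked potential of the mastered trials, might be computed along the following lines. This is a sketch under stated assumptions rather than a worked analysis: the epochs are synthetic (a hypothetical Gaussian-shaped waveform buried in Gaussian noise), the learning trials are simulated by scaling the waveform by a random gain, and the function names are invented for illustration.

```python
import math
import random


def pearson(a, b):
    """Pearson correlation coefficient between two equal-length sequences."""
    n = len(a)
    ma, mb = sum(a) / n, sum(b) / n
    cov = sum((x - ma) * (y - mb) for x, y in zip(a, b))
    va = sum((x - ma) ** 2 for x in a)
    vb = sum((y - mb) ** 2 for y in b)
    return cov / math.sqrt(va * vb)


def grand_average(epochs):
    """Sample-by-sample mean across epochs (the 'typical' evoked potential)."""
    n = len(epochs)
    return [sum(samples) / n for samples in zip(*epochs)]


def trial_correlations(epochs, reference):
    """Correlation of each individual epoch with the reference waveform."""
    return [pearson(epoch, reference) for epoch in epochs]


rng = random.Random(0)
n_samples = 200

# Hypothetical evoked waveform: a single smooth bump (an assumption).
waveform = [math.exp(-(((t - 100) / 15.0) ** 2)) for t in range(n_samples)]


def make_epoch(gain, noise_sd=0.5):
    """Synthetic epoch: scaled waveform plus Gaussian noise."""
    return [gain * s + rng.gauss(0.0, noise_sd) for s in waveform]


mastered = [make_epoch(1.0) for _ in range(100)]
learning = [make_epoch(rng.random()) for _ in range(50)]

typical = grand_average(mastered)
theta = trial_correlations(learning, typical)

print("first five trial correlations:", [round(v, 2) for v in theta[:5]])
```

With real recordings, the list `theta` would be the quantity to sort and examine on log-log axes for evidence of an inverse power law distribution during the learning trials.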


Figure 9

A conceptual example of data from trials of a complex procedural task. The data, Θ(t), are correlations between individual trials and the grand average of EEG event-related potentials recorded from the scalp, examples of which are shown at the top of the diagram, V(t).


We may also gain insights into how to define tasks that are just within a person's abilities so we may establish a more appropriate starting point for therapy. This is the basis for the new paradigm. There are three key components of this system worth exploring:

  • The researcher provides information to the subject which defines the task. This information is the raw substance with which the subject defines their target states. How do variations in this information affect the subject's behavior in the critical state? What do these behavioral changes tell us about the underlying mechanisms?

  • How do regular variations in the task affect the subject's behavior in and near the critical state?

  • Would encouraging the performance of non-examples actually hasten the learning process?



Copyright © 2015, Jeff Sale. (statement)
