16 Signals in space and time

Neural signals are constantly changing, and any analysis must take this time dependence into account. In artificial transformers, all signals that are active simultaneously at a given point in time are grouped into a single unit: the token. Since each individual signal can be indexed, a token can be represented as a vector. In the vertebrate brain, too, simultaneous activity patterns can be formally interpreted as tokens.

16.1 Tokens as ephemeral units of information

Tokens are the smallest units of information in both artificial and biological transformers. They always represent only the state of a system at a single moment. A token is therefore not a permanent quantity, but a snapshot of the activity pattern of many neurons in a token region.

If one tracks the activity of these neurons over a time series, a sequence of tokens emerges that maps the temporal change in the signal. Artificial transformers always require such a token sequence to recognise temporal structures.
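Under this interpretation, a token sequence can be sketched as a two-dimensional array: one row per time step, one column per neuron in the token region. The sizes below are illustrative assumptions, not physiological values.

```python
import numpy as np

rng = np.random.default_rng(0)

n_neurons = 8   # size of the hypothetical token region
n_steps = 5     # length of the observed time series

# Each row is one token: a snapshot of the activity of all
# neurons in the token region at a single moment.
token_sequence = rng.random((n_steps, n_neurons))

token_t = token_sequence[2]    # the token at time step 2
print(token_sequence.shape)    # (5, 8): 5 tokens of dimension 8
```

Reading the array row by row reproduces the temporal change in the signal that the transformer analyses.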

In order for the brain to extract temporal patterns from these short-lived snapshots, it requires mechanisms that analyse signals across multiple time steps. The vertebrate brain has developed two fundamentally different technical solutions for this:

  1. short signal echoes – basal ganglia
  2. long-lasting rotational signals – Papez circuits

16.2 The basal ganglia: short signal echoes

The basal ganglia generate short echoes that last for only a few time steps. They are particularly suited to:

  1. motion detection
  2. linking the present and the past

This short-term memory arises from propagation delays along poorly or non-myelinated axons. These delays are in the order of approximately 10 to 50 milliseconds, depending on the sensory modality (movement, hearing, vision).

Motion detection

The delayed signal is relayed via an inhibitory transmitter and suppresses the current signal in the thalamus or nucleus ruber. As long as the input does not change, echo and current signal cancel each other; only changes, such as movements, pass through.
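This delay-and-inhibit mechanism can be sketched as a discrete difference circuit: the current signal minus a delayed copy. The delay length and signal values below are illustrative assumptions, not physiological data.

```python
import numpy as np

def difference_circuit(signal, delay):
    """Subtract a delayed copy (the inhibitory echo) from the
    current signal: a constant input yields zero output, while
    a change (movement) produces a non-zero difference."""
    delayed = np.concatenate([np.zeros(delay), signal[:-delay]])
    return signal - delayed

# A constant signal followed by a step change:
signal = np.array([1.0, 1.0, 1.0, 1.0, 2.0, 2.0, 2.0])
out = difference_circuit(signal, delay=1)
```

Only the onset of the signal and the step change at index 4 produce non-zero output; the constant stretches are cancelled by the echo.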

Linking the present and the past

Since the echo still exists when the current signal has already faded, it can be combined with new signals. This creates a mechanism capable of learning temporal sequences – essential for language and sequence processing. To achieve this, however, the inhibitory echo must be converted into an excitatory one.
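The linking of past and present can be sketched as pairing each token with its (now excitatory) echo, so that a downstream circuit sees (past, present) pairs. The delay length and token values are illustrative assumptions.

```python
import numpy as np

def pair_with_echo(tokens, delay):
    """Concatenate each token with its delayed echo, giving a
    downstream learner access to (past, present) pairs - the
    raw material for learning temporal sequences."""
    delayed = np.roll(tokens, delay, axis=0)
    delayed[:delay] = 0.0      # no echo before the sequence starts
    return np.concatenate([delayed, tokens], axis=1)

tokens = np.arange(8.0).reshape(4, 2)   # 4 tokens of dimension 2
pairs = pair_with_echo(tokens, delay=1)
print(pairs.shape)                      # (4, 4): echo and current side by side
```

Each output row now contains the previous token next to the current one, which is exactly the combination of delayed and current signals that sequence formation requires.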

16.3 Theorem of the two evaluation systems of the basal ganglia

The basal ganglia possess two parallel evaluation systems:

| Signal pathway | Type of output | Function |
| --- | --- | --- |
| 1 | Inhibitory | Differential circuit for detecting movements and signal changes |
| 2 | Excitatory | Combination of delayed and current signals for sequence formation |

This architecture forms the basis of biological transformer circuits with short response times.

Detailed descriptions can be found in the author's monographs, which are available online.

16.4 The Papez circuits: long-lasting signal echoes

The long-lasting echoes of the Papez circuits are not based on axonal delays. Instead, they are self-sustaining rotational signals in closed neural loops.

As long as the loop remains active, the echo persists.

This creates a persistent signal store, ideally suited for processes that span many time steps, such as thinking, planning and remembering.

The Papez circuits thus form a long-term temporal system capable of maintaining stability across many time steps.

16.5 How the circadian clock clears the rotation buffers

The suprachiasmatic nucleus (SCN) acts as a circadian clock. Its activity follows the daily rhythm and can clear the rotation buffers of the Papez circuits once per day.

This makes it clear: the Papez circuits are not delay lines, but permanent memory loops that are reset daily by the SCN.
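The contrast between a delay line and a permanent memory loop, including the daily SCN reset, can be sketched as a recirculating buffer. The loop length and the number of simulated steps are arbitrary illustration values.

```python
from collections import deque

class RotationBuffer:
    """A closed neural loop: whatever is written in keeps
    circulating until the buffer is explicitly cleared
    (the SCN reset)."""

    def __init__(self, length):
        self.loop = deque([0.0] * length, maxlen=length)

    def step(self, new_input=0.0):
        recirculated = self.loop[-1]   # signal coming around the loop
        self.loop.appendleft(recirculated + new_input)
        return recirculated

    def scn_reset(self):
        """Circadian clear: the SCN wipes the rotating signal."""
        for i in range(len(self.loop)):
            self.loop[i] = 0.0

buf = RotationBuffer(length=4)
buf.step(new_input=1.0)                       # inject a signal once
echoes = [buf.step() for _ in range(8)]       # it keeps coming around
buf.scn_reset()
after_reset = [buf.step() for _ in range(4)]  # loop is silent again
```

A single injected signal reappears every four steps indefinitely; after the reset, the loop stays silent until new input arrives.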

16.6 Two time systems → two types of biological transformers

Since the brain possesses two mechanisms for temporal signal retention, there are also two biological transformer architectures:

1. Transformer with short reaction time

Based on the rapid signal echoes from the basal ganglia → optimised for perception, language and motor function

2. Transformer with long reaction time

Based on the long-lasting rotational signals of the Papez circuits → optimised for thinking, planning, remembering

In this monograph, we first analyse the transformer variant that utilises exclusively the basal ganglia: the transformer with a short reaction time – the foundation of the rapid, time-critical processing of token sequences in the sensory and motor systems.

16.7 Original domain and image domain in biological and artificial transformers

Transformers – both biological and artificial – process temporally ordered sequences of tokens. However, this structural similarity masks a fundamental difference in the nature of the signals being processed. Whilst artificial transformers operate exclusively in a mathematically constructed image domain, biological transformers operate directly in the original domain of neural activity.

16.7.1 Biological transformers: processing in the original domain

In the brain, tokens correspond to the actual activity patterns of groups of neurons. These patterns are present as physical signals within the nervous system itself. Processing therefore takes place directly in the original domain, that is, within the actual neural dynamics themselves.

The biological transformer thus possesses a global coupling structure: every token signal can interact directly with every other token signal in the entire sequence without requiring an additional computational layer.

16.7.2 Artificial transformers: processing in the image domain

Artificial transformers do not work with the original input data itself, but with vector representations calculated from this data. These vectors form an embedding space in which all further operations take place.

To enable artificial transformers to recognise relationships between tokens nonetheless, they must first:

  1. calculate scalar products between the vectors of different tokens
  2. weight these values using softmax normalisation
  3. only then establish the actual interaction

The coupling between tokens is therefore not physical, but is generated by a mathematical procedure that compensates for the limited width of the weight matrices.
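The three steps above correspond to scaled dot-product attention. A minimal numpy sketch, with a single head, no learned projections and illustrative sizes:

```python
import numpy as np

def attention(X):
    """Scaled dot-product attention over a token sequence X of
    shape (n_tokens, d). Queries, keys and values are taken as
    X itself here; real transformers use learned projections."""
    d = X.shape[1]
    scores = X @ X.T / np.sqrt(d)     # 1. scalar products between tokens
    weights = np.exp(scores - scores.max(axis=1, keepdims=True))
    weights /= weights.sum(axis=1, keepdims=True)   # 2. softmax normalisation
    return weights @ X                # 3. the actual interaction

X = np.random.default_rng(0).random((5, 4))   # 5 tokens of dimension 4
out = attention(X)
print(out.shape)                              # (5, 4): one output per token
```

Note that the coupling between tokens appears only in the computed `weights` matrix, not as a direct physical connection: this is the mathematically generated interaction described above.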

16.7.3 Consequence: two different processing principles

Although both systems work with token sequences, they differ fundamentally in their architecture:

| Feature | Biological transformer | AI transformer |
| --- | --- | --- |
| Signal space | Original domain of neural activity | Mathematical embedding space |
| Weight matrices | Wide, capture all tokens simultaneously | Limited to token width |
| Token interaction | Direct physical coupling | Indirect, via scalar products + softmax |
| Network structure | Globally coupled | Locally coupled; global structure is simulated |

These differences are not value judgements, but describe two technical solutions for the same task: the processing of temporally ordered token sequences.