16 Signals in space and time

Neural signals are constantly changing, and any analysis must take this time dependence into account. In artificial transformers, all signals that are active simultaneously at a given point in time are grouped into a single unit: the token. Since each individual signal can be indexed, a token can be represented as a vector. In the vertebrate brain, too, simultaneous activity patterns can be formally interpreted as tokens.

16.1 Tokens as ephemeral units of information

Tokens are the smallest units of information in both artificial and biological transformers. They always represent only the state of a system at a single moment. A token is therefore not a permanent quantity, but a snapshot of the activity pattern of many neurons in a token region.

If one tracks the activity of these neurons over a time series, a sequence of tokens emerges that maps the temporal change in the signal. Artificial transformers always require such a token sequence to recognise temporal structures.
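Under this interpretation, a token sequence can be sketched as a two-dimensional array: one row per time step, one column per neuron in the token region. The sizes below are illustrative assumptions, not physiological values.

```python
import numpy as np

rng = np.random.default_rng(0)

n_neurons = 8   # size of the hypothetical token region
n_steps = 5     # length of the observed time series

# Each row is one token: a snapshot of the activity of all
# neurons in the token region at a single moment.
token_sequence = rng.random((n_steps, n_neurons))

token_t = token_sequence[2]    # the token at time step 2
print(token_sequence.shape)    # (5, 8): 5 tokens of dimension 8
```

Reading the array row by row reproduces the temporal change in the signal that the transformer analyses.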

In order for the brain to extract temporal patterns from these short-lived snapshots, it requires mechanisms that analyse signals across multiple time steps. The vertebrate brain has developed two fundamentally different technical solutions for this:

  1. short signal echoes – basal ganglia
  2. long-lasting rotational signals – Papez circuits

16.2 The basal ganglia: short signal echoes

The basal ganglia generate short echoes that last for only a few time steps. They are particularly suited to:

  1. motion detection
  2. linking the present and the past

This short-term memory arises from propagation delays along poorly or non-myelinated axons. These delays are in the order of approximately 10 to 50 milliseconds, depending on the sensory modality (movement, hearing, vision).

Motion detection

The delayed signal is relayed via an inhibitory transmitter and suppresses the current signal in the thalamus or nucleus ruber. As long as the input does not change, echo and current signal cancel each other; only changes, such as movements, pass through.
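This delay-and-inhibit mechanism can be sketched as a discrete difference circuit: the current signal minus a delayed copy. The delay length and signal values below are illustrative assumptions, not physiological data.

```python
import numpy as np

def difference_circuit(signal, delay):
    """Subtract a delayed copy (the inhibitory echo) from the
    current signal: a constant input yields zero output, while
    a change (movement) produces a non-zero difference."""
    delayed = np.concatenate([np.zeros(delay), signal[:-delay]])
    return signal - delayed

# A constant signal followed by a step change:
signal = np.array([1.0, 1.0, 1.0, 1.0, 2.0, 2.0, 2.0])
out = difference_circuit(signal, delay=1)
```

Only the onset of the signal and the step change at index 4 produce non-zero output; the constant stretches are cancelled by the echo.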

Linking the present and the past

Since the echo still exists when the current signal has already faded, it can be combined with new signals. This creates a mechanism capable of learning temporal sequences – essential for language and sequence processing. To achieve this, however, the inhibitory echo must be converted into an excitatory one.
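The linking of past and present can be sketched as pairing each token with its (now excitatory) echo, so that a downstream circuit sees (past, present) pairs. The delay length and token values are illustrative assumptions.

```python
import numpy as np

def pair_with_echo(tokens, delay):
    """Concatenate each token with its delayed echo, giving a
    downstream learner access to (past, present) pairs - the
    raw material for learning temporal sequences."""
    delayed = np.roll(tokens, delay, axis=0)
    delayed[:delay] = 0.0      # no echo before the sequence starts
    return np.concatenate([delayed, tokens], axis=1)

tokens = np.arange(8.0).reshape(4, 2)   # 4 tokens of dimension 2
pairs = pair_with_echo(tokens, delay=1)
print(pairs.shape)                      # (4, 4): echo and current side by side
```

Each output row now contains the previous token next to the current one, which is exactly the combination of delayed and current signals that sequence formation requires.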

16.3 Theorem of the two evaluation systems of the basal ganglia

The basal ganglia possess two parallel evaluation systems:

| Signal pathway | Type of output | Function |
| --- | --- | --- |
| 1 | Inhibitory | Differential circuit for detecting movements and signal changes |
| 2 | Excitatory | Combination of delayed and current signals for sequence formation |

This architecture forms the basis of biological transformer circuits with short response times.

Detailed descriptions can be found in the author's monographs, which are available online.

16.4 The Papez circuits: long-lasting signal echoes

The long-lasting echoes of the Papez circuits are not based on axonal delays. Instead, they are self-sustaining rotational signals in closed neural loops.

As long as the loop remains active, the echo persists.

This creates a persistent signal store, ideally suited for processes that span many time steps, such as thinking, planning and remembering.

The Papez circuits thus form a long-term temporal system capable of maintaining stability across many time steps.

16.5 How the circadian clock clears the rotation buffers

The suprachiasmatic nucleus (SCN) acts as a circadian clock. Its activity follows the daily rhythm and can clear the rotation buffers of the Papez circuits once per day.

This makes it clear: the Papez circuits are not delay lines, but permanent memory loops that are reset daily by the SCN.
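The contrast between a delay line and a permanent memory loop, including the daily SCN reset, can be sketched as a recirculating buffer. The loop length and the number of simulated steps are arbitrary illustration values.

```python
from collections import deque

class RotationBuffer:
    """A closed neural loop: whatever is written in keeps
    circulating until the buffer is explicitly cleared
    (the SCN reset)."""

    def __init__(self, length):
        self.loop = deque([0.0] * length, maxlen=length)

    def step(self, new_input=0.0):
        recirculated = self.loop[-1]   # signal coming around the loop
        self.loop.appendleft(recirculated + new_input)
        return recirculated

    def scn_reset(self):
        """Circadian clear: the SCN wipes the rotating signal."""
        for i in range(len(self.loop)):
            self.loop[i] = 0.0

buf = RotationBuffer(length=4)
buf.step(new_input=1.0)                       # inject a signal once
echoes = [buf.step() for _ in range(8)]       # it keeps coming around
buf.scn_reset()
after_reset = [buf.step() for _ in range(4)]  # loop is silent again
```

A single injected signal reappears every four steps indefinitely; after the reset, the loop stays silent until new input arrives.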

16.6 Two time systems → two types of biological transformers

Since the brain possesses two mechanisms for temporal signal retention, there are also two biological transformer architectures:

1. Transformer with short reaction time

Based on the rapid signal echoes from the basal ganglia → optimised for perception, language and motor function

2. Transformer with long reaction time

Based on the long-lasting rotational signals of the Papez circuits → optimised for thinking, planning, remembering

In this monograph, we first analyse the transformer variant that utilises exclusively the basal ganglia: the transformer with a short reaction time – the foundation of the rapid, time-critical processing of token sequences in the sensory and motor systems.

16.7 Original domain and image domain in biological and artificial transformers

Transformers – both biological and artificial – process temporally ordered sequences of tokens. However, this structural similarity masks a fundamental difference in the nature of the signals being processed. Whilst artificial transformers operate exclusively in a mathematically constructed image domain, biological transformers operate directly in the original domain of neural activity.

16.7.1 Biological transformers: processing in the original domain

In the brain, tokens correspond to the actual activity patterns of groups of neurons. These patterns are present as physical signals within the nervous system itself. Processing therefore takes place directly in the original domain, that is, within the actual neural dynamics themselves.

The biological transformer thus possesses a global coupling structure: every token signal can interact directly with every other token signal in the entire sequence without requiring an additional computational layer.

16.7.2 Artificial transformers: processing in the image domain

Artificial transformers do not work with the original input data itself, but with vector representations calculated from this data. These vectors form an embedding space in which all further operations take place.

To enable artificial transformers to recognise relationships between tokens nonetheless, they must first:

  1. calculate scalar products between the vectors of different tokens
  2. weight these values using softmax normalisation
  3. only then establish the actual interaction

The coupling between tokens is therefore not physical, but is generated by a mathematical procedure that compensates for the limited width of the weight matrices.
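The three steps above correspond to scaled dot-product attention. A minimal numpy sketch, with a single head, no learned projections and illustrative sizes:

```python
import numpy as np

def attention(X):
    """Scaled dot-product attention over a token sequence X of
    shape (n_tokens, d). Queries, keys and values are taken as
    X itself here; real transformers use learned projections."""
    d = X.shape[1]
    scores = X @ X.T / np.sqrt(d)     # 1. scalar products between tokens
    weights = np.exp(scores - scores.max(axis=1, keepdims=True))
    weights /= weights.sum(axis=1, keepdims=True)   # 2. softmax normalisation
    return weights @ X                # 3. the actual interaction

X = np.random.default_rng(0).random((5, 4))   # 5 tokens of dimension 4
out = attention(X)
print(out.shape)                              # (5, 4): one output per token
```

Note that the coupling between tokens appears only in the computed `weights` matrix, not as a direct physical connection: this is the mathematically generated interaction described above.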

16.7.3 Consequence: two different processing principles

Although both systems work with token sequences, they differ fundamentally in their architecture:

| Feature | Biological transformer | AI transformer |
| --- | --- | --- |
| Signal space | Original domain of neural activity | Mathematical embedding space |
| Weight matrices | Wide, capture all tokens simultaneously | Limited to token width |
| Token interaction | Direct physical coupling | Indirect, via scalar products + softmax |
| Network structure | Globally coupled | Locally coupled; global structure is simulated |

These differences are not value judgements, but describe two technical solutions for the same task: the processing of temporally ordered token sequences.