1 Fundamentals

1.1 Historical context

Natural intelligence arises from the workings of nerve cells. Artificial intelligence was inspired by these biological principles.

Research into neural signal processing began with the discovery of action potentials and their conduction along axons. In 1921, Otto Loewi demonstrated for the first time the chemical nature of signal transmission via neurotransmitters. John Eccles described the excitatory and inhibitory mechanisms of synaptic integration. In 1949, Donald Hebb formulated the learning rule named after him, which remains a basis for models of neural plasticity to this day. In parallel, McCulloch and Pitts (1943) developed the first formal model of artificial neurons, followed by the perceptron (Rosenblatt, 1958). The perceptron was based on artificial neurons whose functioning was modelled on the fundamental properties of biological nerve cells. These fundamental properties still form the common basis for neurology and AI today.

1.2 Basic properties of neurons

Real neurons are signal processors. They receive inputs from receptors or other neurons and communicate via two forms of electrical activity:

membrane potentials, which represent subthreshold excitations,
action potentials, which play a central role as an energy-efficient signal form – particularly when compared to the enormous energy requirements of today’s AI systems.

The fundamental functions of neurons can be summarised as follows:

Signal transmission: Neurons transmit signals along their axons and can switch the type of neurotransmitter used in the process.
Convergence: Multiple inputs are combined, giving rise to receptive fields and averaging neurons.
Divergence: An output signal can be distributed to many target neurons; divergence modules in the brain ensure broad signal distribution.
Modulation: Neurons exert excitatory or inhibitory effects on other neurons and modulate their output signal.
Storage: Through synaptic plasticity (Hebbian learning, LTP/LTD), neurons alter the strength of their synaptic connections – temporarily or permanently – and thereby store signal patterns.
Signal inversion: Neurons can inhibit average signals, thereby reversing the trend of the signal strength (reversal of monotonicity).

This brief overview should suffice. It illustrates that the fundamental signal processing functions of biological neurons form both the foundation of neurology and the source of inspiration for artificial neural networks. For artificial neurons, referred to as nodes in artificial neural networks, many properties of biological neurons have been adopted and, in some cases, simplified.

1.3 Signal detection by neurons

Real and artificial neurons can recognise signals that they receive via their synapses. If a neuron has n synapses, its input can be interpreted as a vector with n components. Each synapse is assigned a weight, which determines the strength with which the corresponding signal component of the input vector contributes to the output. Since the neuron has n synapses, it also has n weights. These form the weight vector.

The output strength is obtained by multiplying, for each synapse, the strength of the input component by the weight at that vector position and then summing these products over all synapses. The output strength is therefore the dot product of the input vector and the neuron’s weight vector. This applies to both real and artificial neurons.
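
The weighted sum described above can be sketched in a few lines of Python. This is only an illustration: the function name neuron_output and the example values are assumptions made here, not part of the original text.

```python
def neuron_output(x, w):
    """Return the weighted sum (dot product) of input vector x and weight vector w."""
    if len(x) != len(w):
        raise ValueError("input and weight vectors must have the same length")
    # Multiply each input component by the weight at the same position, then sum.
    return sum(xi * wi for xi, wi in zip(x, w))


# Hypothetical example: a neuron with three synapses.
x = [1.0, 0.5, 0.0]  # input vector
w = [0.5, 2.0, 4.0]  # weight vector
y = neuron_output(x, w)  # 1.0*0.5 + 0.5*2.0 + 0.0*4.0 = 1.5
print(y)
```

The third input component is zero, so its large weight does not contribute to the output.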

For an input vector x and a weight vector w, each with n components, the dot product y is calculated according to the formula

$$y = x \cdot w = \sum_{i=1}^{n} x_i w_i$$

Every vector with n components is a vector in n-dimensional space. Together with the dot product as inner product, this space is a finite-dimensional Hilbert space, and the inner product induces a norm. The Euclidean norm corresponds to the length of the vector and is defined by the formula

$$\|x\| = \sqrt{x \cdot x} = \sqrt{\sum_{i=1}^{n} x_i^2}$$

If a non-zero vector x is divided by its norm, the result is a normalised vector x* with length 1:

$$x^{*} = \frac{x}{\|x\|}$$

After normalisation, all vectors have the standard length 1 and differ only in their direction. For two vectors, e.g. a vector x and a weight vector w, the angle θ (theta) between them can be calculated using the cosine:

$$\cos\theta = \frac{x \cdot w}{\|x\|\,\|w\|} = x^{*} \cdot w^{*}$$

The similarity between two vectors of the same dimension can be assessed using this angle. If the cosine is equal to 1, the vectors have the same direction and are identical up to a positive real factor. A small angle indicates a high degree of signal similarity. If the cosine is equal to 0, the vectors are perpendicular (orthogonal) to each other and have no similarity whatsoever.
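
The norm, the normalisation and the cosine comparison from this section can be tried out with the following minimal sketch, which uses only the Python standard library. The function names and the example vectors a, b and c are illustrative assumptions.

```python
import math


def dot(x, w):
    """Dot product of two vectors of equal length."""
    return sum(xi * wi for xi, wi in zip(x, w))


def norm(x):
    """Euclidean norm (length) of x."""
    return math.sqrt(dot(x, x))


def normalise(x):
    """Return x divided by its norm, i.e. a vector of length 1."""
    length = norm(x)
    if length == 0.0:
        raise ValueError("the zero vector has no direction and cannot be normalised")
    return [xi / length for xi in x]


def cosine_similarity(x, w):
    """Cosine of the angle between x and w, computed from the normalised vectors."""
    return dot(normalise(x), normalise(w))


# Hypothetical example vectors.
a = [3.0, 4.0]
b = [6.0, 8.0]   # same direction as a, twice as long
c = [-4.0, 3.0]  # perpendicular to a

print(norm(a))                  # length of a, here 5.0
print(cosine_similarity(a, b))  # close to 1: same direction
print(cosine_similarity(a, c))  # close to 0: orthogonal
```

Because the comparison is made on normalised vectors, a and b count as the same signal even though their lengths differ.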

1.4 Learning signal recognition

A neuron’s weight vector makes it possible to assess the similarity of any signal vector to its weight vector. The dot product is used for comparison. We now assume that neurons have the ability to learn. Therefore, there must be ways for the neuron to learn to recognise a signal. This occurs when, during a learning process, the neuron’s weight vector comes to match the signal vector. This is achieved through the gradual convergence of the weights towards the target values. In artificial neurons, mathematical optimisation methods (gradient descent, backpropagation, regularisation) are used for this purpose. In biological neurons, the desired weight vector is usually learned through Hebbian learning, but also through phenomena such as long-term potentiation (LTP) or long-term depression (LTD) and spike-timing-dependent plasticity (STDP). Normalising the weight vector can prevent the weights from growing indefinitely without disrupting signal recognition.
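
The gradual convergence of the weights towards a signal can be illustrated with a deliberately simple update rule. The sketch below is an assumption made for illustration only. It is neither backpropagation nor a model of a specific biological mechanism. In each step the weight vector is moved a small fraction (the hypothetical parameter rate) towards the normalised signal and is then renormalised so that it cannot grow indefinitely.

```python
import math


def dot(x, w):
    """Dot product of two vectors of equal length."""
    return sum(xi * wi for xi, wi in zip(x, w))


def normalise(x):
    """Return x scaled to length 1."""
    length = math.sqrt(dot(x, x))
    if length == 0.0:
        raise ValueError("cannot normalise the zero vector")
    return [xi / length for xi in x]


def learn_signal(w, x, rate=0.2, steps=50):
    """Move w step by step towards the direction of x (illustrative rule only).

    rate and steps are hypothetical parameters chosen for this example.
    """
    target = normalise(x)
    w = normalise(w)
    for _ in range(steps):
        # Shift every weight a fraction of the way towards its target value.
        w = [wi + rate * (ti - wi) for wi, ti in zip(w, target)]
        # Renormalise so that the weight vector keeps length 1.
        w = normalise(w)
    return w


# Hypothetical example: the neuron starts with an unrelated weight vector.
signal = [1.0, 2.0, 2.0]
w_start = [1.0, 0.0, 0.0]
w_learned = learn_signal(w_start, signal)

similarity_before = dot(normalise(w_start), normalise(signal))
similarity_after = dot(w_learned, normalise(signal))
print(similarity_before, similarity_after)
```

After learning, the cosine between the weight vector and the signal is much closer to 1 than before, which is what recognising the signal means in the sense of section 1.3.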

Neural networks consist of many neurons and are capable of recognising a wide variety of signals. Real neural networks in the brain learn through biological algorithms, artificial ones through mathematical algorithms. The ability to recognise signals within sets of signals forms the basis of natural and artificial intelligence. The various methods are explained below wherever specific real or artificial neural networks are presented in terms of their structure and functioning.

1.5 Weight matrices

When several neurons are interconnected to form a neural network, their weight vectors can be combined column-by-column into a matrix, which is referred to as the weight matrix of the network. As the number of learnable signals or signal patterns increases, not only does the number of output neurons increase, but more network neurons are also required. In most cases, the number of input vectors also increases significantly. In artificial neural networks, the weights in the matrices are real numbers that can also take on negative values. In biological networks, firing rates are non-negative, so the weight matrices are represented by non-negative real numbers.
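
A minimal sketch of the column-by-column construction follows, assuming a single layer in which every neuron receives the same input vector. The names network_output, w1 and w2 and all values are hypothetical. Each output component is the dot product of the input vector with one column of the weight matrix.

```python
def network_output(W, x):
    """Return the outputs of all neurons for the input vector x.

    W is stored as a list of rows; column j of W is the weight vector of neuron j.
    """
    n_inputs = len(W)
    if len(x) != n_inputs:
        raise ValueError("x must have one component per row of W")
    n_neurons = len(W[0])
    # Output j is the dot product of x with column j of W.
    return [sum(x[i] * W[i][j] for i in range(n_inputs)) for j in range(n_neurons)]


# Hypothetical weight vectors of two neurons with three synapses each.
w1 = [1.0, 0.0, 0.0]
w2 = [0.0, 1.0, 1.0]

# Combine the weight vectors column by column into the weight matrix.
W = [list(row) for row in zip(w1, w2)]  # [[1.0, 0.0], [0.0, 1.0], [0.0, 1.0]]

x = [2.0, 3.0, 4.0]
y = network_output(W, x)
print(y)  # [2.0, 7.0]
```

The first neuron responds only to the first input component, while the second neuron sums the second and third components.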