Human brain theory

ISBN 978-3-00-068559-0

Monograph by Dr. rer. nat. Andreas Heinrich Malczan

18 Pattern recognition without neural networks

Created on 07.10.2024

Artificial intelligence and artificial neural networks are experiencing an unprecedented boom worldwide. Artificial neural networks with a very large number of hidden layers, known as "deep neural networks", can comprise an almost unimaginable number of neuron layers. They can be fed vast amounts of information during the learning phase (deep learning) so that they can solve increasingly complex problems. In many applications, they already solve problems better than humans.

 

Theorists and developers of artificial neural networks could therefore quickly come to the conclusion that the human brain uses precisely this technology with its real neurons.

It seems natural to assume that the brain works according to precisely these methods, interconnects its neurons in exactly the same way as artificial neural networks do, and that the same learning processes take place in it as in deep learning in deep neural networks.

 

Of course, it has since been established that AI systems also have a tendency to fabulate, hallucinate or, to put it more maliciously, simply lie, and that these effects can even be demonstrated easily. However, it is also believed that this misbehaviour can be limited, reduced or even eliminated. This may be true.

 

The main capability of artificial neural networks is pattern recognition. Given enough input, an artificial neural network learns the patterns present in the input set and can recognise them even if interfering inputs are superimposed on them. There seems to be a widespread belief that without such neural networks - whether artificial or real in the brain - pattern recognition would not be possible.

 

This must be vigorously contradicted. Pattern recognition can also be organised completely differently in the real brain of vertebrates.

The author intends to prove this thesis using examples. Sufficient knowledge of the primary visual cortex and the pontocerebellum is assumed. I recommend reading the following chapters from my monograph "Human brain theory":

-        4.2 Divergence modules with lateral signal propagation

-        4.3 Modules with spatial signal propagation

-        8 The memory module in the pontocerebellum,

which are also provided on this website.

 

In the following subsection, we will define signal vectors that are suitable for recognising specific objects in any set of visual objects without the need for a neural network or any learning processes. All that is required is a neuronal circuit in the brain, the existence of which has already been indisputably proven.

We should get used to the idea that the vertebrate brain is a conglomerate of very different circuits, each of which makes a specific contribution to signal processing and signal recognition. The view that the brain is structured like an artificial neural network urgently needs to be revised.

However, this does not mean that there are no neural networks in the vertebrate brain that at least resemble artificial neural networks. There are substructures in the brain for which this is true. However, the design principles in the vertebrate brain differ significantly from those of current artificial neural networks (as of November 2024). A particular strength of the natural networks in the vertebrate brain is their ability to create logical connections between signals - connections that are not formed and consolidated through learning, but arise inevitably from the network structure itself.

Logic in the brain is not learnt; it is created by circuitry! This will be demonstrated in detail over the next few months.

But first, an example of pattern recognition without AI.

 

18.1 How the brain recognises polygons

 

We begin our observations with the sense of sight. The retina projects via the visual thalamus (the lateral geniculate nucleus) into the primary visual cortex.

The associated cortical field corresponds to the visual field.

 

Now we look at the output of the primary visual cortex and restrict ourselves to the orientation columns. Each orientation column corresponds to exactly one orientation of an inclined straight line in the visual field, i.e. to one angle of ascent. An orientation column fires strongly only if an inclined straight line appears in the visual field, specifically at the retinal position (pixel) belonging to its hypercolumn, and only if the angle of ascent assigned to it is actually present. The neuron of the orientation column also fires if the angle of ascent deviates only slightly from the target value. In other words, what is measured is not necessarily an exact angle, but an angle interval in which the angle of ascent of the straight line must lie.

 

We assume here that the new signals obtained from the visual cortex in this way represent a new modality: line elements. A line element is a straight line segment with a specific orientation, i.e. an angle of ascent.

New modalities establish a new level, a new segment, in the rope-ladder nervous system - and the vertebrate brain is topologically still such a rope-ladder nervous system.

Therefore, the orientation columns project as output neurons into a new, secondary visual cortex area.

This then contains all the orientations of all the visual hypercolumns.

And we assume that the new modality is topologically well-ordered. The angle of ascent serves as the ordering criterion.

If there are (theoretically assumed) 36 different angles of ascent that can be detected by the orientation columns, and a straight line can assume an angle in the interval from 0° to 180°, then each orientation column represents an angle interval of 5° (angular resolution: 5°).

The projection from the primary to the secondary visual cortex is organised precisely according to these angles.

This means that the secondary visual cortex contains an elongated strip of neurons consisting of 36 retinal images arranged simply one behind the other. The signals of the orientation columns that correspond to the orientation angle of 0° (more precisely -2.5° to +2.5°) arrive in the first retinal image. The neurons are arranged retinotopically, i.e. their arrangement reflects the entire retina, but the input comes only from orientation columns with an angle of 0° ± 2.5°.

In the adjacent retinal image, the signals of the orientation columns assigned to the angle of 5° (more precisely from 2.5° to 7.5°) arrive. This preserves the topology of the retina.

In the last retinal image at the end of the strip, all retinal points are represented once more, but only those neurons fire whose orientation angle is 175° (more precisely 172.5° to 177.5°); an angle of 180° is identical to 0° and therefore belongs to the first image again.
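How this mapping works can be illustrated by a minimal sketch in Python (the resolution of 36 orientation columns at 5° each is the hypothetical value assumed above; the name angle_to_bin is purely illustrative). Note how 180° wraps back to the first image:

    import numpy as np

    N_BINS = 36                     # hypothetical number of detectable angles of ascent
    BIN_WIDTH = 180.0 / N_BINS      # 5 degrees per orientation column

    def angle_to_bin(angle_deg):
        # Map an undirected line orientation to the index of the angle image
        # whose 5-degree interval contains it; 180 deg is identical to 0 deg.
        return int(round((angle_deg % 180.0) / BIN_WIDTH)) % N_BINS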

How long and wide is this secondary surface of the orientation columns?

One hypercolumn in the primary visual cortex corresponds to exactly one cortex neuron in the secondary cortex per detected angle interval. If there are (hypothetically) 200 rows with 200 columns of hypercolumns in the primary visual cortex, with a width of approx. 0.5 millimetres assumed per hypercolumn, the corresponding field in the secondary cortex consists of a strip of 36 fields of 200 × 200 neurons each. Now, 200 hypercolumns require about 100 mm of space in width, whereas 200 pyramidal cells require perhaps only 5 mm, i.e. one twentieth of that. And 36 such fields side by side are at most 180 mm wide.

Therefore, the projection of the orientation columns into the secondary visual cortex forms a narrow, long strip of cortex neurons. It is used for outline recognition, i.e. ultimately for shape recognition.

 

But now let's remember our aim of realising shape recognition without neural networks!

According to the author, the vertebrate brain uses the mean value system for this purpose.

We first assign exactly one mean value neuron to each of the 36 angle images in the secondary cortex. It taps into all the neurons of its angle image and forms from them a mean value for this angle.

This evaluation system provides a total of 36 output signals that can form a signal vector. We refer to the signal vector organised according to orientation angles as the angle signature of the system.
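Continuing the Python sketch above, such a mean value system can be modelled as follows (a line element is idealised here as a segment between two endpoints, and its length stands in for the number of orientation columns it activates; this simplification is an illustrative assumption, not the author's circuit):

    def angle_signature(segments):
        # One mean value neuron per orientation bin: each segment contributes
        # to the bin of its angle of ascent, weighted by its length.
        sig = np.zeros(N_BINS)
        for (x1, y1), (x2, y2) in segments:
            angle = np.degrees(np.arctan2(y2 - y1, x2 - x1))
            sig[angle_to_bin(angle)] += np.hypot(x2 - x1, y2 - y1)
        return sig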

We require that the image of a triangle be determined as unambiguously as possible by the angle signature.

What does the angle signature of a triangle look like?

-        There are exactly three positions of the signal vector greater than zero, while all others are equal to zero. This is because the triangle consists of three straight sides, each of which has a different angle of ascent.

-        If the triangle in question is moved back and forth in the field of view, the angle signature remains unchanged. It therefore does not matter where the triangle is located in the field of view.

-        If the triangle is reduced or enlarged, the angle signature is retained, but the averaging makes each vector component slightly larger for larger triangles and slightly smaller for smaller ones (a scalar multiplication, because the averaging of neurons is not a true averaging, but increases (non-linearly) with the sum of the supplied excitations until a neuronal saturation limit is reached). A demonstration follows below.
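The three properties listed above can be checked directly in the sketch (the coordinates are arbitrary example values):

    tri = [((0, 0), (4, 0)), ((4, 0), (2, 3)), ((2, 3), (0, 0))]
    sig = angle_signature(tri)
    print(np.count_nonzero(sig))                          # exactly 3 occupied positions

    shifted = [((x1 + 7, y1 + 5), (x2 + 7, y2 + 5)) for (x1, y1), (x2, y2) in tri]
    assert np.allclose(angle_signature(shifted), sig)     # position does not matter

    scaled = [((2 * x1, 2 * y1), (2 * x2, 2 * y2)) for (x1, y1), (x2, y2) in tri]
    assert np.allclose(angle_signature(scaled), 2 * sig)  # scalar multiplication

In this idealised linear sketch the enlargement acts as an exact scalar multiplication; in the real circuit, as noted above, the components grow non-linearly up to the saturation limit.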

If the maths teacher makes it clear to the pupil that he is looking at a triangle, the pupil will learn this.

But if, instead of the triangle, the teacher shows the pupil only three straight lines that intersect at the centre of the original triangle - the sides of the triangle shifted parallel to themselves - then the pupil will still take this figure for a triangle.

So is the pupil fabulating, hallucinating or simply mistaken? He is, provided that he uses only the described mean value system of the orientation columns and its angle signature for the analysis.

Why is this the case? Only the angles of line elements are analysed. It is not necessary for the three analysed straight lines to actually form a triangle.

And it gets even worse: if, instead of one triangle, ten such triangles - possibly of different sizes - are distributed across the field of view, the same angle signature is produced, with exactly the same three occupied positions.

Nevertheless, such a system can bring significant advantages. Its output requires only the non-linear signal propagation through which the orientation columns arise in the primary cortex and which performs the angle analysis, and, as a second component, the mean value system of the cortex.

For example, if an object activates all orientation columns because its inclined line elements assume all possible angles, the result is an output vector that is maximally excited at every angular position. Let us imagine an insect that possesses this circuit, and in which the components of the angle signature converge on a common output neuron, so that this neuron in turn forms a kind of mean value.

This insect could be a bee, a butterfly or a wasp.

If this animal is presented with a visual object in the form of a daisy, a dahlia or any flower with narrow, elongated petals arranged around a flower centre, the described analysis system will be maximally excited. The animal thus receives a recognition signal: flower present. Maximum excitation occurs for all orientation angles because all these angles are occupied by the petals. Is this why flowers are arranged in this way? The evolutionary advantage is obvious: they are found and pollinated by insects! Has the flower shape adapted to the visual evaluation system of the orientation columns, whose essential prerequisites are merely exponential signal attenuation and averaging?
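In the sketch, such a radial flower shape indeed occupies every angle position, and a convergent output neuron that averages the signature components responds maximally (the petals are idealised here as 36 radial line elements):

    centre = (0.0, 0.0)
    petals = []
    for k in range(36):                         # petals all around the flower centre
        a = np.radians(5 * k)
        petals.append((centre, (3 * np.cos(a), 3 * np.sin(a))))

    sig = angle_signature(petals)
    print(np.count_nonzero(sig))                # all 36 angle positions occupied
    flower_neuron = sig.mean()                  # convergent neuron signals: flower present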

A system of orientation columns is almost certainly also found in the mushroom bodies of insects. This hypothesis still needs to be proven! I already wanted to describe this circuit in insects years ago; it is similar to the cerebellum circuit.

Using the active centre-of-gravity system, analogous to the tectum of vertebrates, the insect can even determine the direction of the flower and find it.

 

Vertebrates have several such mean value analysis systems.

In addition to angles, corners can also be detected; these can be sorted according to the size of the angle enclosed at the corner in question. In this way, a corner signature can be generated as output. In the corner signature, a triangle would likewise have three occupied vector positions greater than zero, while the others would be equal to zero. If this is combined with the angle signature, triangles, squares and angular figures in general can be recognised and distinguished. All this without neural networks and without learning; a sketch follows below.
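A corner signature could be sketched analogously; the following is only one conceivable formalisation, in which the interior angle at each vertex of a closed polygon is sorted into one of 36 angle intervals:

    def corner_signature(vertices):
        # Histogram of the interior angles at the corners of a closed polygon.
        sig = np.zeros(N_BINS)
        pts = np.asarray(vertices, dtype=float)
        n = len(pts)
        for i in range(n):
            u = pts[i - 1] - pts[i]             # edge towards the previous vertex
            v = pts[(i + 1) % n] - pts[i]       # edge towards the next vertex
            cosang = np.dot(u, v) / (np.linalg.norm(u) * np.linalg.norm(v))
            ang = np.degrees(np.arccos(np.clip(cosang, -1.0, 1.0)))   # 0..180 deg
            sig[min(int(ang / BIN_WIDTH), N_BINS - 1)] += 1
        return sig

    print(np.count_nonzero(corner_signature([(0, 0), (4, 0), (1, 2)])))   # 3 corners, 3 occupied positions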

If the output is fed to the cerebellum, these shapes can be learnt and recognised. In this case, the cerebellum works in a similar way to an artificial neural network, although it differs in its construction.

 

The described circuit with the angle signature can therefore recognise triangles. But it can also hallucinate!

We take the triangle as the starting figure and change it by shifting each of its three sides (with unchanged length) so that it runs exactly through the centre of gravity of the original triangle. The result is three intersecting straight lines.

If you present this visual object to the described evaluation system, you receive exactly the output vector that the original triangle delivered, together with the message: "Triangle recognised".

But this time there is no triangle at all, just three intersecting straight lines.
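The sketch from above reproduces this "hallucination" exactly; the centre of gravity of the example triangle lies at (2, 1):

    cx, cy = 2.0, 1.0                           # centre of gravity of the example triangle
    lines = []
    for (x1, y1), (x2, y2) in tri:
        dx, dy = (x2 - x1) / 2, (y2 - y1) / 2   # same orientation and length ...
        lines.append(((cx - dx, cy - dy), (cx + dx, cy + dy)))   # ... but through the centre

    assert np.allclose(angle_signature(lines), angle_signature(tri))
    print("Triangle recognised")                # although no triangle is present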

Our system recognises geometric invariants and reports their detection. However, the invariants are not triangles, but the existence of three angles of ascent. If we interpret the output vector as a recognition feature for triangles, it is we who are making the mistake, not the evaluation system. In fabulation or hallucination, invariants are recognised that are erroneously assigned to an object class to which they do not belong.

An AI system can recognise statistical features in a set of elementary signals; through the learning process these features are assigned to objects, although they do not always belong to them. In the case of the triangles, we have identified the invariants: the system only recognises different angles of ascent (gradients), not triangles.

Is the angle signature useful in the neuronal system?

It is certainly useful. It is not the only information available, but it is a geometric invariant. Mathematicians will easily recognise that the angle signature of a person's face is constant, regardless of whether this face is seen at a distance of one metre or five metres. The angle signature is independent of distance, and it is independent of displacements. A rotation of a visual object only causes a cyclic shift of the vector components in the angle signature, and enlarging or reducing the object only causes a scalar multiplication of the angle signature.
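The rotation property can also be checked in the sketch: rotating the example triangle by 30°, i.e. by six 5° intervals, shifts the signature cyclically by six positions:

    theta = np.radians(30)
    c, s = np.cos(theta), np.sin(theta)
    rotated = [((c * x1 - s * y1, s * x1 + c * y1),
                (c * x2 - s * y2, s * x2 + c * y2)) for (x1, y1), (x2, y2) in tri]
    assert np.allclose(angle_signature(rotated), np.roll(angle_signature(tri), 6))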

The angle signature is therefore so advantageous in the vertebrate brain that it is actually used.

If the averaging is carried out not over the entire retina but over a division into clusters (e.g. ten clusters next to each other in ten rows below each other), the angle signature is retained but additionally allows a more precise spatial assignment of the position in the image field. Nevertheless, such a system can also hallucinate if it links the angle signature with incorrect objects. This is precisely what AI systems apparently do.
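A sketch of this cluster variant (segments are assigned here to the cluster containing their midpoint, which is a simplification; a real line element would be distributed across clusters):

    def clustered_signature(segments, n=10, field=10.0):
        # One angle signature per cluster of an n x n grid over a square
        # image field with the given side length.
        sigs = np.zeros((n, n, N_BINS))
        for (x1, y1), (x2, y2) in segments:
            mx, my = (x1 + x2) / 2, (y1 + y2) / 2    # midpoint decides the cluster
            i = min(int(my / field * n), n - 1)
            j = min(int(mx / field * n), n - 1)
            angle = np.degrees(np.arctan2(y2 - y1, x2 - x1))
            sigs[i, j, angle_to_bin(angle)] += np.hypot(x2 - x1, y2 - y1)
        return sigs                                  # shape (n, n, 36)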

At this point, it is appropriate to compare machine pattern recognition with the pattern recognition via the orientation columns described above. Automated processes have long been used, for example, to recognise text. Everyone is familiar with having completed bank transfer forms scanned by machine at the bank, which is significantly faster than entering the data manually on the device's keyboard.

The mathematical apparatus behind this is realised by two-dimensional Gabor filters; the method also exists in a discrete form. It is named after Dennis Gabor, who used it for one-dimensional signal analysis. It was later extended to two-dimensional analysis by Gösta Granlund.

In principle, the Gabor filter for image analysis convolves the image with the product of a plane wave and a Gaussian function. These Gabor filters can be used to recognise edges and structures in images, and in particular as an aid to character recognition.

By selecting the direction of oscillation of the plane wave and its frequency, it is possible, when the filter is applied to a given image, to recognise at which points in the image an edge with a given angle of ascent α occurs. If you want to analyse the image for a different angle of ascent β, a new Gabor transformation with this angle β in the Gabor parameters is required.
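For comparison, a minimal sketch of such a filter kernel (the parameter values are arbitrary illustrations): the kernel is the product of a plane wave with direction α and a Gaussian window; convolving the image with it emphasises edges with this angle of ascent, and every further angle β requires its own kernel.

    def gabor_kernel(theta_deg, freq=0.2, sigma=3.0, size=15):
        # Product of a plane wave (direction theta, spatial frequency freq)
        # and a Gaussian window: one kernel per analysed angle of ascent.
        half = size // 2
        y, x = np.mgrid[-half:half + 1, -half:half + 1]
        theta = np.radians(theta_deg)
        u = x * np.cos(theta) + y * np.sin(theta)    # coordinate along the wave direction
        return np.exp(-(x**2 + y**2) / (2 * sigma**2)) * np.cos(2 * np.pi * freq * u)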

We recall the 36 retinal images that are present as a continuous narrow strip of neurons in the secondary cortex. Each of them is assigned to exactly one interval of the angle of ascent and contains active (firing) neurons wherever an orientation column in the primary retinal image recognises exactly this angle and therefore also fires.

We could interpret each of the 36 retinal images as the result of a Gabor transformation in which exactly the associated angle of ascent of a line element was detected. A total of 36 Gabor transformations of the retinal image would therefore have to be carried out in order to reproduce the result present in the secondary cortex with its 36 angle-specific retinal images.

The real problem, however, is that neurologists will find it difficult to work through the theory of the Fourier transformation, the theory of the Gabor transformation, the convolution of functions, Gaussian functions and harmonic functions in order to understand the mathematical relationships. And one cannot really expect this of them; it would be an imposition.

On the other hand, it is equally understandable that the theoreticians of signal analysis rarely have comprehensive knowledge of vertebrate brains. This naturally makes collaboration between these disciplines more difficult.

It would be worth considering whether the principle of the visual orientation columns in the primary visual cortex could be reproduced microelectronically. This is because the brain delivers the result in two steps, and simultaneously (in parallel) for all pixels. The first step is the propagation of the input signals from the input neurons to the output neurons, during which the non-linear signal attenuation occurs; according to signal theory, this non-linearity is essential for pattern analysis. The second step is averaging over an image area or over the entire image to determine the angle signature. This could serve as input for a classic neural network, which could also perform deep learning with a sufficient number of hidden layers and would thus be able to recognise patterns. The artificial neural network would then be fed with already processed data instead of the raw image to be analysed.
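As a sketch of these two steps (the input is assumed here to be a per-pixel orientation map together with an activity mask, i.e. already the result of the non-linear first step; only the sorting into angle images and the averaging are shown):

    def signature_from_orientation_map(angles_deg, active):
        # Step 1 result: sort the active pixels into 36 angle-specific images.
        maps = np.zeros((N_BINS,) + active.shape)
        ys, xs = np.nonzero(active)
        for yy, xx in zip(ys, xs):
            maps[angle_to_bin(angles_deg[yy, xx]), yy, xx] = 1.0
        # Step 2: one mean value neuron per angle image yields the 36-component
        # angle signature, which could then feed a classic neural network.
        return maps.mean(axis=(1, 2))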

The currently numerically very complex Gabor transformation or discrete Gabor transformation could then be omitted. The savings in computing power and storage space and the gain in speed could be significant. Considering that AI systems now require huge amounts of electrical energy, the environment and the climate could even be protected in this way. Brain research could thus make a contribution to climate protection.

In this respect, it is advantageous to clarify signal processing in the brain as comprehensively as possible. This monograph is intended to make a modest contribution to this.


Monograph by Dr. rer. nat. Andreas Heinrich Malczan