Perhaps the most striking feature of the inferior temporal neurons is that many of their preferred shapes closely resemble our letters, symbols, or elementary Chinese characters. Some neurons respond to two superimposed circles forming a figure eight, others react to the conjunction of two bars to form a T, and others prefer an asterisk, a circle, a J, a Y … For this reason, I like to call them “proto-letters.” That these shapes are so deeply embedded in the preferences of neurons in the brains of macaque monkeys is quite amazing. By what extraordinary coincidence is this cortical stock so similar to the alphabet that we inherited from the Hebrews, Greeks, and Romans? The reading paradox reaches its apex with this mysterious resemblance between two worlds that we thought discrete: the depths of the monkey’s cortex and the clay, reeds, and parchment surfaces on which the early scribes first etched their scripts.
We can get some insight into the origins of these monkey proto-letters by examining the reason why they appear in the visual field. The most likely hypothesis is that these shapes were selected, either in the course of evolution or throughout the course of a lifetime of visual learning, precisely because they constituted a generic “alphabet” of shapes that are essential to the parsing of the visual scene. The shape T, for instance, is extremely frequent in natural scenes. Whenever one object masks another, their contours almost always form a T-junction. Thus neurons that act as “T-detectors” could help determine which object is in front of which. Other characteristic configurations, like the shapes of a Y and an F, are found at the places where several edges of an object meet; they characterize an object’s sharp corners and their orientation. The shapes J and 8 result from yet another set of object contours, which arise when the object has curves and holes. All of these fragments of shapes belong to what are known as “non-accidental properties” of visual scenes because they are unlikely to occur accidentally in the absence of any object. If you throw a bunch of matches on the floor, it is unlikely that two of them will meet to form a T-junction, and it is even less probable that three of them will arrive at the configuration of the letter Y. Consequently, when one of these shapes appears on the retina, the brain can safely assume that it corresponds to the contour of an object present in the outside world.
If the cortex finds it useful to encode non-accidental properties, it is no doubt because their combinations tend to be extremely invariant to changes in size, angle of vision, and light. When one picks up a coffee cup and rotates it in one’s hand, across a wide range of viewing angles the edges of the cup always form two opposing F-junctions. Even with one eye closed, it is virtually impossible to find the only angle at which the edge and sides of the cup are at right angles, so that the F pattern vanishes; usually, another pair of F’s immediately appears at the bottom. In many cases, the list of ways in which the edges meet is an invariant that constrains object identification regardless of angle of presentation. Our primate nervous system seems to have discovered this invariant property and used it to encode shapes. In fact, visual scenes have many other non-accidental properties. Parallelism is one of them: it is unlikely that an image will contain two parallel segments unless they are the edges of a three-dimensional object. Other invariants have to do with spatial organization: if an object contains a hole, its projection on the retina will probably include a closed O-shaped curve. Visual invariants such as these are so distinctive that they have been firmly integrated into our nervous system. According to the California psychologist Irving Biederman, our memory does not store fully detailed visual images of objects. It merely extracts a sketch of their non-accidental properties as well as their organization and spatial relations.
Their extraction allows us, at first, to reconstitute the elementary parts that constitute the object’s three-dimensional structure (surfaces, cones, sticks …), and later to assemble them into a complete representation of the object’s shape. This code has the advantage of remaining consistent in the face of random rotations, occlusions, and other deteriorations of the image.
To support his hypothesis, Biederman gathered evidence to the effect that human object perception relies more on non-accidental properties than on other aspects of the image. For instance, if one starts with a line drawing of an object, and then deletes half of the contours, the impact on perception depends on whether its non-accidental properties are preserved or not:
- If one only deletes the line segments that link two vertices, and leaves the non-accidental junctions intact, the object remains easily recognizable.
- If all the non-accidental properties are deleted, recognition becomes virtually impossible.
Likewise, when we have to decide if two objects are identical or not, the differences are obvious when they bear on non-accidental properties (for instance a letter “O” versus a figure eight). They are quite difficult to spot if they concern metric properties alone such as size (for instance an uppercase “O” versus a lowercase “o”). In collaboration with the neurophysiologist Rufin Vogels, Biederman further demonstrated that many neurons in the inferior temporal cortex of the macaque monkey resist metric distortions of the image, provided that the transformations leave non-accidental properties intact.
In summary, shapes that resemble Western letters, such as T, F, Y, or O, were adopted by inferior temporal neurons because they collectively formed an optimal code, invariant to image transformations, and whose combinations could represent an infinity of objects. It is probable that other shapes were added to this alphabet because of their biological relevance. For instance, Tanaka has observed that some neurons code for a black dot on a white background: an eye detector, clearly an essential device in a social species like ours. Other neurons are sensitive to hand or finger shapes. Primarily, however, the inferior temporal cortex relies on a stock of geometrical shapes and simple mathematical invariants. We did not invent most of our letter shapes: they lay dormant in our brains for millions of years, and were merely rediscovered when our species invented writing and the alphabet.
Excerpted from ‘Reading in the Brain’ by Stanislas Dehaene, pages 137-139.