6 Hands and Language

Language as it now exists has obviously evolved over many thousands of years. We cannot expect to provide an intelligible simple bridge from prelinguistic man to language in its present form. What we must look for is a theory of the origins of the basic form of language, from which it may evolve into what we see today, suitably supplemented. I am concerned here only with this initial evolution. But even here saltation threatens: how do the basic syntactic and semantic structures of language originate? What is the preexisting form from which language arises by incremental modification? Must we suppose that language arose by “separate creation”—that it had no roots in earlier preadaptations? Did a fully formed language module just spring into existence by sudden unprecedented mutation? Did human DNA experience a random convulsion resulting in creatures that could speak? Did syntax and semantics emerge from nothing, by a kind of spontaneous generation? Surely we want to avoid that conclusion if we can. But it is hard to see what the preadaptation might be, because language appears sui generis: irreducible, a domain unto itself. Our hypothesis is that the hand is the critical variable, but how do syntax and semantics emerge from the hand?

I am not here concerned with the evolution of thought (I will consider that later). I will assume that thought is already in place at the stage of man’s evolution we are considering. This was presupposed in the discussion of tool use, since instrumental thinking was taken to be present at that time. Our question, then, is how to get from the mental and physical structures present in manual tool use to language.1 (I actually hold that thought massively antedates language and is present in many species that do not possess any form of language, but I won’t be arguing that position here.) The question is how a communicative system arose from these antecedent conditions, including thought (which I take to involve the possession of concepts). If thought is already present, then intentionality is present: thoughts are about things and they ascribe properties to things. My question is how external public events (“utterances”) get to be about things and ascribe properties to things; it is not about the origins of intentionality in general. Thus I want to know how the hand could be the basis for reference and predication in a public language, given that reference and predication are already present in thought. The hand is agreed not to be the basis of reference and predication in thought, since many animals have thought without having hands (e.g., whales). But it might still be the basis of reference and predication in the external communicative system possessed by humans.2

We shall not see much hope of an answer along these lines if we insist on viewing language as essentially vocal. How could the vocal emerge from the manual? What has the mouth got to do with the hand? But this is obviously a simplistic and distorted view of the essence of language, because of the existence of gestural language and sign languages such as those employed by the deaf (e.g., American Sign Language). The linguistic cannot be identified with the vocal. Clearly the hand can function in a fully linguistic manner, silently.3 In addition, of course, we constantly use our hands communicatively during vocal exchanges in all sorts of ways. We lose nothing essentially linguistic if we suppose that early human language was (primarily) a gestural language, and there is good reason to think that this was the situation.4 The question then becomes how a gestural language centered on the hand could arise. What is it about the hand that makes it capable of linguistic structure? What are the preadaptations that enable the hands to refer and predicate? How did gestural language emerge from the hands? I am now going to describe three possible theories. They are not mutually exclusive but can be combined to produce a more complex theory; still it is useful to consider them separately to begin with.

(i) The SVO theory. Armstrong, Stokoe, and Wilcox believe that syntax can be found in natural actions of the hands.5 They invite the reader to swing his right hand in front of his body and catch with it his upraised left index finger. This action can be analyzed as follows: the agent of the action is the right hand, the object is the left index finger, and the action is the swinging and catching. They comment: “The grammarian’s symbolic notation for this is familiar: SVO” (179). So they see in this simple action of the two hands three components, corresponding to subject, verb, and object. The action “possesses a structure: in it something does something to something else, or SVO—the seeds of syntax” (181). Thus “because gestures of this physical type contain the structure of the basic sentence—whether symbolized ‘SVO’ or ‘NP + VP’—they also open the way to a more sophisticated symbol use than naming; they permit language to begin; they symbolize relationships” (181–182). The idea is clear enough: one hand can act as an agent, the other as an object, and an action can be performed by the one on the other. By observing such actions we can abstract a structural relationship that mirrors the SVO structure: we can separate out the components and grasp a relation between them. We have a structured sequence that resembles syntax. If we now interpret each hand as a name of something, the action can be seen as a (relational) predication. So it is not, they contend, that gestures can only attain the level of names; gestures can also function as verblike elements. The act-object relation mirrors the verb-noun relation. We might also cite the way thumb and fingers act on one another to form complex action configurations: the digits can be seen as namelike and the actions as verblike. All the fingers can combine systematically, like words in a sentence, and their actions contain the seeds of predication. The hands themselves have SVO syntax, according to these authors.

This theory is suggestive, if rather underdescribed. The hands certainly have combinatorial complexity, so that we can easily imagine reading genuine syntax into hand actions (as with contemporary sign languages). But the move from action to verb seems precipitous (saltatory)—surely not all hand actions are tantamount to verbs. Where does the symbolism come from? Also, the account is purely syntactic; it says nothing of how reference might be grounded in the actions of the hands. Nothing in the prehensive action of the hand finds a place in the SVO theory—the action of hand on object is left out of the picture. We don’t get the idea of predicating something of an object—just the joining of subject terms, object terms, and verbs into syntactic strings. Indeed, it is hard to avoid the impression that the authors are trading on a kind of use-mention confusion, conflating actions with the verbs that signify them. Can we do better?

(ii) The grip-action theory. The heart of this theory can be simply stated: the prototype of reference—its precursor—is the action of gripping an object. Referring emerges from prehending, crudely. Referring can be understood as “virtual prehending.” Consider holding an object in one hand and acting on it with the other hand—striking it, scratching it, or rubbing it.6 The gripping hand functions as the referring term and the gripped object is the reference, while predication corresponds to the acting hand. Here we have the makings of the syntactic structure of reference and predication and reference itself as a symbol-object relation. In a subject-predicate structure we have one element that links to an object and another element that applies something to that object: one element “takes hold” of an object, while the other performs the action of predication on that object. Similarly, one hand can take hold of an object, while the other acts on it in some specific manner. The dyadic object-directed structure is present in both.

The grip-action nexus is thus analogous to the subject-predicate nexus. Different actions can be performed on the same held object, as different predications can be made of the same object; and different held objects can be made subject (nota bene) to the same action (type), as the same property can be predicated of different objects. Subject and predicate are detachable and recombinable, as object and action are. We take the object in hand in order to do something to it, just as we identify an object in order to ascribe something to it. Predicating is an action, as is referring; gripping is an action, as is acting on a gripped object. We intentionally grasp an object in order to act on it, and we intentionally single out an object in order to comment on it. A bit of the world is selected in both cases, so that it should be made the subject of an act. I might pick an object up and act on it to show you how to perform a particular skill, for example, in teaching you how to make a stone chisel. Now there is a social dimension to the action. I chip at the stone so that you can observe how I do it to make a good chisel. Your attention will be directed to the held stone and then you observe the action in order to learn something. Similarly, I can “pick out” an object symbolically, say by pointing, and convey something to you about that object. If my symbol is a manual gesture it will resemble to some degree actually seizing an object in the hand (see below). Complex actions of gripping and acting-on will be common in a social group of tool users, and the grip-action theory sees in this the seeds of language in its most primitive form. It sees a platform (a preadaptation) from which reference and predication might get off the ground.7

Note also that gripping an object involves a special relationship between the body and the world, in the form of isomorphism. The hand must be shaped to the object: the fingers must be so configured that the object is properly held. The shape of the hand is different according to whether a power grip is used or a precision grip—as with gripping an axe versus gripping a writing implement. The fingers must adjust quite precisely to the form of the object (more strictly, its gripped part). Accordingly, the hand must be capable of as many configurations as there are geometrical types of held objects. The varieties of grip correspond to the varieties of objects (and varieties of actions performed with those objects). Thus it is possible to read off the type of object gripped from the shape of the gripping hand, because of the geometrical congruence. In this isomorphism we also see the seeds of representation: the grip is a kind of “picture” of the object gripped, a replica of it, a diagram.8 It would be possible to use a particular hand configuration as a symbol of the kind of object held by that grip in a gestural sign language. You might form the hand into the power grip used to hold an axe in order to symbolize an axe. Your audience would be familiar with that grip and be able to infer what you are referring to—they might then bring you an axe. The grip is a kind of mirror of an axe, and you can exploit this fact to obtain an axe.

(iii) The mimicry theory. In the grip-action theory we see the structure of language in embryonic form, and the way language takes us to the world beyond the body, but we don’t yet see anything that deserves to be called actual representation. So the suggested preadaptation needs supplementation, if we are to avoid inexplicable saltation. How does the hand become a symbol of something else? How can actions of the hand be interpreted as symbols by observers? Here the answer is not far to seek: the hand is capable of feats of mimicry. The hand can copy events and states of affairs in the external (to the body) world, and this copying can afford acts of communication. Let us take a very simple example: eating. Eating involves taking hold of food with the hand and inserting it into the mouth (we are supposing early man is a manual eater, with no knives and forks). Suppose someone performs this action but without any food in the hand: she is mimicking the act of eating. This act of mimicry might be intended to indicate hunger on the part of the agent, or it might function as an incitement to eat by parent to child. Suppose the parent places food in front of the child and mimics eating it: the child will likely get the message that she should eat the food. Certainly the act of mimicry will bring to mind the action of eating, and then context will supply the intention of the “speaker.” Or consider the action described by Armstrong et al.: seizing the left forefinger in the moving right hand. We can easily imagine a context in which this action will be taken to mimic the seizing of a prey animal by a predator, where the prey might be a human. Thus when a member of the group strays toward a certain dangerous area or decides to wander around at night, another member might perform this act of mimicry to warn of lurking predators. The wanderer will (with luck) get the message, because of the similarity between the hand actions and the event simulated. The notable thing about the hands is that, owing to their remarkable versatility, control, and agility, they can mimic extremely well; the feet would be nowhere near as adept. Thus manual icons might develop based on mimicry, that is, standardized gestures that represent types of things.

One form the mimicry might take is assuming a characteristic grip in order to mimic holding a certain kind of object—as with assuming an axe grip in the example given above. I mimic gripping an axe in front of you so that you will bring me an axe. I have thereby referred to an axe. You can infer what I want from the grasping act I have mimicked. In a social group these kinds of imitative actions will have utility, so the ability to mimic will be selected for. The hands clearly have the potential to act as simulations, so we have located the preadaptation we sought. This potential just needs to be exploited in the service of a pressing need. To the enlarged brain, improved hands, use of tools, and social grouping, we can then add mimicry as an accomplishment of post-arboreal man. Man began to simulate nature with his hands, thereby communicating messages to observers. Perhaps he already had some talent as a mimic while still in the trees, as some contemporary primates do, but now he expands this ability, combining it with tool use and more elaborate social structures. The hands will have become more adept and flexible, so the range of mimicry will have amplified. In this we can discern the roots of representation. The stage will then be set for more abstract forms of simulation, and later for hand signs that are purely conventional. From those (not-so-humble) roots the mighty tree of language will eventually grow.9

As I said earlier, these three theories are not mutually exclusive; we can combine them. Suppose a member of our early tribe holds in his left hand some found object, say a piece of fruit or a dead bird; then he acts on that object with his right hand in a violent manner, intending to simulate the action of a predator. Here we have the SVO structure described by Armstrong et al., but we also have an analogue of reference in the prehension relation between hand and object, as claimed by the grip-action theory, and we have the element of simulation proposed by the mimicry theory. Thus the action represents a predator attacking its prey by referring to prey and predator and ascribing the action of attacking. Or we might imagine a member of the tribe biting into the fruit and then throwing it in a certain direction, so as to indicate that there is more of the edible fruit in the direction of the throw. Here the plentiful fruit yonder is referred to by holding a sample of it, and the direction of the fruit is indicated by the throw. Reference and predication thus emerge from a combination of structured hand action, manual prehension, and mimicry—syntactic form, objective reference, and iconic representation, respectively. Primitive gestural language thus proceeds on the basis of three preadaptations that are brought together organically.10 But all center on the hands, exploiting their capacity for combinatorial sequential structure, object prehension, and symbolic mimicry. This is quite a rich brew.

In fact, there are two other ingredients to add to the pot. We must not forget that the origin of language proceeds against a cognitive background. First, there is the simple fact that our early humans are already thinkers: they already instantiate the subject-predicate structure in thought. So this structure is already installed in the brain; what we don’t yet have, prior to the advent of public language, is its manifestation in an outer medium.11 What we are trying to show is that the hands are suitable for instantiating this preexisting structure externally. The hands thus externalize what is already internally realized; they do not create the subject-predicate structure ab initio. So a preadaptation for linguistic reference and predication is cognitive reference and predication (I would estimate that this goes back at least to reptiles and is premammalian). Second, we have already credited our early humans with specifically teleological thinking in relation to tools: they conceive the world instrumentally. Thus the way is open for them to conceive the hands as tools: tools for manipulating objects, but also as tools for other jobs. They might then see in the hands the potential for a vehicle of communication, given that they are already social beings with a need for communication. They begin to view their hands as symbolic tools, devices for communicating. They could then decide to exploit the mimicry potential of hands in communication, seeing this as a means to an end. A stone can be used as a tool for chopping; a hand can be used as a tool for communicating. The tool of language is accordingly constructed, following the general enthusiasm for tools that has seized the human animal since he departed his arboreal home. There is real creativity in this, no doubt, and all creativity is puzzling, but man has by this time long since been an agent of creativity in the construction of even quite primitive tools. In language he has created a new tool, based on an ensemble of resources he already possesses. He puts together these antecedent resources to produce a shiny new implement in the struggle for survival: a system of symbolic communication. Internal thought, bimanual anatomy, digital dexterity, a talent for mimicry, a prehensive lifestyle, a preoccupation with tools, social coordination: all these feed into the evolutionary process that produces primitive language. This is not something from nothing, but something from quite a lot.

The biological forms that precede and produce language are therefore multiple, and fall into two categories. On the physical or bodily side we have the anatomy and functionality of the hands in relation to the world beyond the body: this gives them the physical potential to function as a language (unlike, say, the elbows or the feet). They are free to be used at will and are capable of fine discriminations of movement. On the mental side we have a rich array of psychological capacities that can feed into language production: we have structured thought, intention, teleological rationality, tool consciousness, social coordination, and creativity. The suggestion is that if we join these two categories together we can begin to see how language might intelligibly emerge—not from a void by miraculous saltation, but from a rich and complex set of preadaptations. The transition is therefore smooth, not abrupt; incremental, not revolutionary (of course, the effects of language can be revolutionary, even if the origins are not). Man needs merely to apply his burgeoning tool-oriented intelligence to the physical capacities of his hands—they are the ideal means to the end he seeks, and they are right in front of his nose. He needs to communicate with others in his new cooperative lifestyle, and his hands are the perfect organs to be so employed. They already have the structure and functionality needed to operate as a medium of linguistic communication. Thus, in sum, the hands graduated by incremental steps from brachiating to tool using to talking.12 Talking was latent in the hands—a talent just waiting to show itself. The hands had the capability to act as sentences; it was up to their owners to exploit this capability—and they had the intelligence and desire to do just that.