10
Symbols as Scaffolding
In his preface to the workshop on “scaffolding” that preceded this volume, Bill Wimsatt wrote:
The generative entrenchment of an element is a measure of how many things depend upon it. Things that are particularly favored or robust … should be incorporated more readily and widely, and appear as deeper architectural features as they are built upon. Particularly favored should be structures facilitating production of a combinatorial alphabet [of] components that can serve as parts in structures of diverse types and [be] put to different uses. Symbols [among other things] … fit here…
This chapter explores one way in which symbols fit here. By any measure, symbols are generatively entrenched; many aspects of human nature and culture depend on symbols. However, while it is easy to see that language, mathematics, science, poetry, and so forth depend on symbols, it is far less clear exactly how they do so. It is one thing to know that symbols are at the core of these human activities, but it is quite another thing to know how symbols make them possible. Just what it might mean to say that symbols “scaffold” cognitive development is the topic of investigation here.
The metaphor of scaffolding suggests a temporary structure used to facilitate the construction of something else. In this respect, it does not quite align with Wimsatt’s conception of elements that become architectural features. In that metaphor, the generatively entrenched elements become part of something more permanent: symbols literally becoming parts of the permanent cognitive architecture. In the context of living organisms, a distinction between temporary scaffolding and permanent architecture may be possible only in a relative sense: some structures are more long-lived than others (see also Wimsatt and Griesemer 2007). However, development, and therefore change, is a life-long process. Even someone raised as a monolingual speaker of a language, for whom it might be thought that the “mother tongue” has become a permanent feature of his or her cognitive architecture, can come to prefer a different, late-acquired language for either general or technical communication. Despite this inherent vagueness in the concepts, I hope to illustrate below that the distinction between scaffolding and architectural features remains useful as we try to investigate the different ways in which symbols contribute to cognitive development.
Within the category of “symbol” I include all the various noises, marks, and movements that human beings (and perhaps some other animals) use as instruments of communication, that is to say, as means of orienting an audience toward, or acting upon, something other than the symbol itself. The point of saying or writing “tiger” or “2 is the only even prime number” or “anyone lived in a pretty how town” is generally not to induce an attitude toward the phonemes or graphemes themselves (although that may also be an intended effect, especially by the poet). Rather, the point of such utterances is to produce effects in listeners or readers that go beyond the act of sensory perception and categorization of these symbol sequences. Communication may sometimes be self-communication; thus a person who talks himself through a difficult situation or an accountant who makes marks in a ledger for her own later use need not be communicating with anyone else. Natural (i.e., human) languages are one class among many symbolic systems which can be used for communication. Many forms of animal communication also involve perceivable signals that serve to orient audiences toward things beyond the signals themselves. However, humans seem to be the only species that actively designs systems of symbols and transmits them explicitly to its offspring. Furthermore, few if any other species exploit rich structural properties of their symbol systems to the extent that humans do.
A natural idea about the communicative function of symbols arises here: each utterer of a symbol sequence starts with an idea or thought and by means of symbols induces more or less a replica of that idea or thought in the audience. Couple this natural idea with the further idea that thoughts themselves are symbolic, and it can seem as though our various public grunts and scratchings are mere manifestations of inner symbols that are themselves elements in a language of thought. Call this pair of ideas “the standard view,” according to which it is inner symbols that give shape to outer symbols that in turn are reinterpreted in the receiver’s language of thought. The standard view is implicit in the ubiquitous P present in every attempt from Frege to Grice to Fodor to spell out what it means to say that “S means that P.” Whether the propositional P is taken to stand for “Platonic entity” (cf. Frege) or “Pre-installed by who knows what?” (cf. Fodor), defenders of the standard view have generated reams of symbol-covered pages attempting to answer the question of how to ground these inner symbols. For naturalistic philosophers, Frege’s Platonistic “solution” is off the table. And when Fodor (2008), convinced that concepts must be innate, figuratively offers the idea that they come from God, this tongue-in-cheek advertisement of medieval “divine illumination” theory is intended, given his naturalistic bona fides, to be unilluminating.
Perhaps philosophers will one day find a satisfying response to this question through the traditional method of reflecting on our symbol-using practices and constructing theories that attempt to satisfy intuitions derived from these reflections. However, I think success in this endeavor is unlikely without more sustained empirical investigation of those actual practices. Much of cognitive science, and the philosophy derived from it, has been conducted without sufficient investigation of this kind. This is because much of cognitive science has adopted (often explicitly) a rationalistic approach toward language and other symbol-using behaviors that typically treats basic symbolic capacities of human beings as given—part of the innate endowment of humans as such.
This is not the place to rehearse all the difficulties with claims of innateness. Rather, the point of this chapter is to describe some recent studies that provide more details about the nature of human and animal interactions with external symbols, and to indicate some consequences of such studies for our understanding of how symbols play a scaffolding role in cognition. These studies also point to a need for deeper understanding of the interplay between development and learning. Development is not to be conceived as the unfolding of a genetic program but as a dynamic process of adaptation to all aspects of the environment, including that provided by interactions with other symbol users. This conception will enable us to begin to link symbols to generative entrenchment in a reasonably specific way.
Perhaps in this context, the use of the term “cognitive scaffolding” should continue to be regarded as mere metaphor. And yet, I think that many symbolic structures do serve a kind of scaffolding role. These structures are ancillary to the building of cognitive structures or mechanisms, even if the resulting mechanisms can later be deployed without active engagement of those structures. Think, for example, of the practice of teaching young children the “times tables” by rote repetition. Although this practice is sometimes deprecated nowadays, it nevertheless can contribute to installing a kind of “random access memory” for the facts contained in those tables. I mean “random access memory” in the computer scientists’ sense, such that for the well-drilled individual to respond to the probe “What is three times four?” does not take significantly longer than responding to “What is eight times nine?” For such an individual who has sufficiently internalized the tables, it is no longer necessary to recite the entire three-times or eight-times tables to get to the answer. The memory structures that allow this kind of random access have been scaffolded by the public (and typically rhythmic) recitation of the tables—sequences of external symbols that potentially have a lasting effect on cognitive performance. Because alternative ways of scaffolding a given cognitive competence are often possible, it may be helpful to think of engagement with particular symbolic structures along the lines of Mackiean “INUS” conditions (Mackie 1974) for scaffolding the capacity—for example, chanting the times tables is an insufficient but necessary part of an unnecessary but sufficient set of cultural, symbolic practices for the development of competency with multiplication. It would be very hard for a human to develop such a competency without some such practice, although the efficacy of any particular form of engagement with public symbols is a matter for empirical investigation.
Because (barring a cerebral accident) the arithmetic student permanently retains the ability to recite the times tables throughout his or her lifetime, it may seem that this is not a clear example of scaffolding. Here are two different ways to think about this: (1) On the one hand, the notion of cognitive scaffolding may be understood relative to a specific task or process. Asked to do mental arithmetic, a proficient student can access memory of the specific multiplication facts without going through the tables. Nevertheless, as I shall explain further below, this relativized notion of scaffolding may be important for understanding the way in which symbols are generatively entrenched with respect to cognitive development. (2) On the other hand, we may point out that although the well-drilled student has internalized his or her times tables along with the individual facts, the public tokens are in fact no longer present. The times tables as public structures, that is, the specific token sequences of external symbols chanted in the classroom or printed on the worksheets, are scaffolding in the primary sense of temporary processes or materials that facilitate the construction of other structures. Among those structures are whatever neural configurations allow the tables to be reproduced, and, with sufficient practice, enable random access to each individual multiplication fact.
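The contrast between reciting a table and having random access to its facts can be made concrete with a small sketch. It is only an analogy in the computer scientists' sense invoked above, not a model of memory: the function names, the 1-to-12 table range, and the step counter are all assumptions introduced for illustration.

```python
# Illustrative analogy only. All names here are hypothetical, and the
# 1-to-12 table range is an assumption, not something the studies specify.

TABLE_RANGE = range(1, 13)

def recite_until(multiplier, multiplicand):
    """Reach a product by reciting the table from its start.

    Returns (product, number_of_recitation_steps).
    """
    steps = 0
    for n in TABLE_RANGE:
        steps += 1
        if n == multiplicand:
            return multiplier * n, steps
    raise ValueError("multiplicand outside the memorized table")

# "Random access": each fact is stored under its own key, so retrieval
# does not depend on where the fact sits in the recited sequence.
FACTS = {(a, b): a * b for a in TABLE_RANGE for b in TABLE_RANGE}

def recall(multiplier, multiplicand):
    """Retrieve a single multiplication fact directly."""
    return FACTS[(multiplier, multiplicand)]

print(recite_until(3, 4))          # (12, 4)
print(recite_until(8, 9))          # (72, 9)
print(recall(3, 4), recall(8, 9))  # 12 72
```

On this picture the recitation is what builds the lookup structure, while later retrieval no longer needs to run through the sequence.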
The view I promote in this paper focuses on the importance of external symbols as scaffolding for cognition. In some ways this view has affinities with ideas being popularized by other recent critics of the standard view, who have emphasized the way in which cognition may be “externalized” through the use of external symbols. The philosophical debate has run ahead of the science, generating considerable heat around the question of whether thought, cognition, or mental activity might actually be occurring outside the human brain or body. I am not so interested in this somewhat unilluminating metaphysical dispute. The undeniable point, it seems to me, is that without spoken words, mathematical terms and notations, pencil and paper, and other means of using external symbols, certain cognitive accomplishments would be very hard indeed if not completely impossible for creatures like ourselves. This includes the task of learning to do basic addition, multiplication, and even more fundamental operations like counting, but also derivative tasks such as finding the product of two large prime numbers. Similarly, the task of designing and constructing a nuclear-powered submarine would be impossible without the use of external symbols. Nevertheless, I want to sidestep arguments based on alleged parity between the exploitation of publicly observable symbols and symbol processing that is assumed to be going on inside the head. We need to know more about how interactions with public symbols shape the development of cognition before adopting any strong assumptions about the nature of cognition.
By emphasizing external, publicly observable symbols, I do not wish to deny that inner symbols might sometimes play a material role in thought. We do, after all, sometimes engage in inner speech (although, I suspect, some do it more than others, and perhaps for some philosophers it drowns out everything else). Similarly, to carry out tractable mental arithmetic problems, for example, 234 × 12, typically involves imagining the symbols written down (most likely, one above the other with a line beneath, if that was how one was taught to do it in elementary school) and then imagining the steps required to generate the answer (e.g., writing 2340 on the next row, then 468 below it, another line and finally the sum of those underneath). To date, no neuroscientist knows how to correlate specific imagination events to neural activity, so we have only the vaguest ideas about how, if at all, perceived and imaginable properties of symbols relate to inner representational format (see Pylyshyn 2003). Even if one day it will be agreed that the brain uses symbols to imagine symbols, what I want to emphasize now is the extent to which this kind of process appears to be anchored to perceptual features of external symbols. Even the more primitive operations involved in generating a particular product (such as 2 × 3) may involve remembering or generating an aurally or vocally encoded string, “two threes are six” (or “two times three is six,” or whatever relates to your specific experience). In these performances there seems to be no subjectively accessible layer of “machine language” encoding for the brain in which these mathematical facts have been resymbolized.
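The imagined written procedure for 234 × 12 can be spelled out step by step. The sketch below simply reproduces the rows in the order described above (the tens row first, then the units row); the function name is hypothetical, and nothing here is meant as a claim about how the brain carries out the task.

```python
def written_rows(a, b):
    """Return the partial products of a x b, one per digit of b.

    Rows are listed from the highest digit of b to the lowest, matching
    the order in which the rows are described in the text.
    """
    digits = [int(d) for d in str(b)]
    rows = []
    for position, digit in enumerate(digits):
        place_value = 10 ** (len(digits) - 1 - position)
        rows.append(a * digit * place_value)
    return rows

rows = written_rows(234, 12)
print(rows)       # [2340, 468]
print(sum(rows))  # 2808
```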
Admittedly, however, the foregoing is speculative. My goal in the following is to make the general position seem less so by describing some recent studies that indicate how deeply entrenched are the basic perceptual and motor processes underlying certain kinds of symbolic capacities. From the perspective that I am taking, external symbols are primary—if the notion of inner symbols can be validated at all, it is their internalization that is derived. One of the lessons I will endeavor to derive from recent experimental work is that external symbol structures leave profound marks on developing cognizers. These traces of early development are frequently not immediately recognized because we are often too focused on the symbolic elements themselves (what Wimsatt refers to as the “combinatorial alphabet”) rather than on larger-scale features of the temporary assemblages of public symbols that allow us to bring powerful yet basic perceptual and learning mechanisms to bear on the task of working with symbol systems. It is as if in studying scaffolding or architecture we have been too focused on the poles or bricks and insufficiently focused on the way in which they are held together.
Traces of Scaffolding
Scaffolding may leave traces in at least two ways. First, one may find the materials piled up in a corner somewhere, figuratively speaking, with the structure or couplings between the pieces more or less intact. This is the situation with adults who have retained the capacity to recite their times tables (perhaps a bit less smoothly than when they were younger) without needing to fall back on that capacity when asked for a product of two small numbers. Second, one may find traces of the scaffolding in the resulting edifice. For example, in some construction techniques, scaffolding need not be entirely freestanding but can be inserted into holes or crevices in the parts of the building already constructed. Forensic inspection of those holes or crevices may enable one to make inferences about the scaffolding (material and process) even after it has been fully removed. In this section I describe some recent studies that illustrate a kind of forensic approach. The target domain of these studies is symbolic reasoning, as in basic arithmetic, algebra, or introductory symbolic logic. In each case, apparent breakdowns in competent performance are traced to the operation of basic perceptual, motor, and learning processes that are the scaffolding for the symbolic performances.
I begin with a pair of papers by David Landy and Rob Goldstone (2007a, 2007b) in which they showed that usually competent college-level algebraists can be induced to make errors by manipulating extrasymbolic features of the formulas on which they are operating, such as spaces and stray marks. Landy and Goldstone (2007b) also showed that competent algebraists themselves generate such extrasymbolic features (especially using spaces effectively) when asked to write down formulas in the symbol system. Symbolic reasoning operations involve not just the recognition of individual symbols (“3,” “a,” “∀,” “+,” “→,” etc.) but also the identification of structure. Thus, to read and use the formula
a+7=9
properly, one must identify “a+7” as a group that is being compared to “9” on the other side of the equality sign. Simultaneously, of course, one must see “a+7” as itself having three components relating two terms and the “+” operator. Likewise, in algebra or arithmetic where operator precedence rules are established, the proper use or interpretation of “3+4×7” requires treating “4×7” as a chunk that is being added to 3. Correct interpretation of these formulas is supported by inserting parentheses that conventionally force a particular interpretation. Hence the difference between “3+(4×7)” and “(3+4)×7.” However, the correct interpretation can also be supported in the absence of parentheses by introducing spaces. The formulas
a+7 = 9
and
3 + 4×7
contain spaces that are consistent with their correct interpretations under the operator precedence rules. The formulas
a + 7=9
and
3+4 × 7
contain inconsistent spacing.
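One way to see what "consistent" and "inconsistent" spacing amount to for the arithmetic examples is to compare two readings of the same string: one that follows the precedence rules and ignores spaces, and one that treats each unspaced cluster as a unit. The following is a minimal sketch under strong assumptions: it handles only non-negative integers with "+" and "×", and both function names are hypothetical. It is not a model of the perceptual processes at issue.

```python
def standard_value(expr):
    """Evaluate + and × with the usual precedence, ignoring all spacing."""
    total = 0
    for term in expr.replace(" ", "").split("+"):
        term_value = 1
        for factor in term.split("×"):
            term_value *= int(factor)
        total += term_value
    return total

def spacing_led_value(expr):
    """Evaluate each space-delimited cluster first, then combine the results."""
    pieces = []
    for chunk in expr.split():
        if chunk in {"+", "×"}:
            pieces.append(chunk)                       # a free-standing operator
        else:
            pieces.append(str(standard_value(chunk)))  # a visual cluster
    return standard_value("".join(pieces))

for formula in ["3 + 4×7", "3+4 × 7"]:
    print(formula, standard_value(formula), spacing_led_value(formula))
# 3 + 4×7 31 31
# 3+4 × 7 31 49
```

For these two strings the readings agree exactly when the spacing is consistent with precedence. Agreement of the two values is only a convenient check for examples like these, not a general definition of consistency.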
Landy and Goldstone (2007a) demonstrated experimentally that small adjustments in the spacing between the various signs in a formula induced subjects who are otherwise competent in algebra to compute or reason using the perceived clusters rather than using their knowledge of the operator precedence rules. Thus, for example, the incorrect equality
a + b * x + y = b + a * y + x
was significantly more likely to be judged true if the spacing around the multiplication signs was slightly greater than the spacing around the addition signs. They similarly showed that mathematically irrelevant lines and shapes placed above and below segments of the formulas would lead subjects to make similar errors with respect to operator precedence if placed in ways that tended to create the perception of a group inconsistent with the proper interpretation of the formula. Another, more subtle case of perceptual chunking which led to precedence errors is the following equality statement where there are no spacing tricks:
(c*c*c) + (f*f*f) * (9*g+o) + (8*t+k) = (f*f*f) + (c*c*c) * (8*t+k) + (9*g+o)
Here it is apparently the internal similarity within the parenthesized groups that tends to lead subjects to treat the more homogeneous pair (c*c*c) and (f*f*f) as one group and the more heterogeneous pair (9*g+o) and (8*t+k) as another. (See figure 10.1 for this and other examples.)
Figure 10.1
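The first equality above is false in general under the precedence rules, yet it comes out true for every assignment if the additions are grouped first, which is the grouping that wider spacing around the multiplication signs would suggest. A brute-force check over a few small integers illustrates this. The function names and the choice of test values are assumptions made for illustration, not part of the original experiments.

```python
from itertools import product

def holds_with_precedence(a, b, x, y):
    """The equality read with multiplication binding tighter than addition."""
    return a + b * x + y == b + a * y + x

def holds_with_addition_grouped(a, b, x, y):
    """The equality read as if each additive cluster were a single unit."""
    return (a + b) * (x + y) == (b + a) * (y + x)

cases = list(product(range(1, 5), repeat=4))  # small hypothetical test values

print(all(holds_with_addition_grouped(*c) for c in cases))  # True
print(all(holds_with_precedence(*c) for c in cases))        # False

counterexample = next(c for c in cases if not holds_with_precedence(*c))
print(counterexample)  # (1, 2, 2, 1): 1 + 2*2 + 1 = 6, but 2 + 1*1 + 2 = 5
```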
On the production side, Landy and Goldstone (2007b) examined two tasks. One was to convert English statements such as “nine times one plus seven equals five times two plus three” into standard arithmetic notation in an experimental situation in the psychology laboratory. The other involved translating sentences of English into formulas of sentential or predicate logic on a logic tutoring website. In both cases, they found that students who were competent in mathematics (as measured by their performance in university courses or standardized tests) or logic (as measured by the correctness of their responses to the translation problems) were significantly more likely to insert spaces consistent with the correct semantic interpretation of the formulas, even though spacing is not part of the explicit convention. Among the logic students, those who spaced consistently were more likely to have provided the correct answer than either students who used spaces inconsistently or students who did not use spaces at all. Interestingly, beginners in logic (those doing sentential logic translations) were more likely to use spaces in their formulas than students doing the more advanced predicate translation exercises, suggesting that they used them as an additional scaffold in understanding the expressions.
Arithmetic, algebra, and logic conventionally use parentheses and brackets to capture intended clusters. This works because a moderate number of parentheses creates easily perceived groupings. However, too many parentheses can be just as confusing for the human perceptual system as too few (hence the antipathy to the programming language Lisp in some quarters, and the preference of many programmers for Python over Perl). This is why mathematicians and logicians have adopted explicit conventions governing precedence rules for operators even in the absence of parentheses. Nevertheless, the adoption and learning of those conventions are developmentally scaffolded. The existence of gaps or spaces between related groups of objects is a ubiquitous feature of visual experience, so from a developmental perspective it is not surprising that symbolic reasoning could be scaffolded on such an entrenched feature of basic perception. Given that appropriate spacing is spontaneously produced by competent symbolic reasoners, apparently without any explicit instruction or intention, it is also unsurprising that beginning students would implicitly learn to associate gaps that their teachers produce with meaningful divisions. In this way, socially mediated scaffolding may become self-scaffolding. As learners advance, they may also learn to use other features and operate without the spaces. However, traces of early reliance on spacing may still be visible even after years of mathematical training. Indeed, in unpublished pilot research for his 2007a study, Landy (personal communication) found that even Ph.D.-level students in mathematically technical fields were subject to the same violations of precedence rules when spacing and other clustering features were manipulated.
The studies conducted by Landy and Goldstone were not explicitly developmental, and their subjects were all college-age adults. I turn now to a more explicitly developmental study by Nicole McNeil (2007), which starts with the observation that approximately 70% of U.S. late first graders and early second graders (ages 6 years 9 months to 7 years 6 months) can solve at least one “equivalence problem” of the form “x + y = _ + z” (e.g., 1 + 4 = _ + 2). Remarkably, however, between second grade and the end of third grade, the success rate declines steadily until at ages 8 years 8 months to 9 years 1 month just 10% of students can solve at least one such problem. The kinds of errors these students tend to make include writing only the sum of the left two numbers into the blank or (more rarely) putting the sum of all three numbers. Performance does not fully recover until the end of fourth grade. McNeil (2008) reports that children in Chinese schools do not show this U-shaped developmental trajectory. So, what is going on in American schools, especially given that students are receiving intensive instruction in basic arithmetic during the period of decline?
Based on her comparison of the pedagogical materials and methods used in the Chinese and American schools, McNeil (personal communication) noticed two main differences in the way in which students are given addition problems. American students are typically given classroom exercises and worksheets where addition proceeds from left to right (e.g., 2 + 3 = _). Furthermore, such problems tend to be sequenced so as to keep one of the addends constant (e.g., 1 + 1 = _, 1 + 2 = _, 1 + 3 = _, etc.). In contrast, Chinese students routinely receive problems going in both directions (i.e., they encounter both 2 + 3 = _ and _ = 2 + 3). Furthermore, when such problems are sequenced, they are typically organized around a common sum rather than a shared addend (e.g., 1 + 6 = _, 2 + 5 = _, 3 + 4 = _, etc.). This suggested two possible experimental manipulations, which McNeil (2008) carried out. One group of children was given an hour of practice each week for three weeks on right-to-left addition problems only. Another group of children was given left-to-right problems organized around the common sum. With just three training sessions, both groups of children showed marked improvement in their performance on the “equivalence problems” (on which they received no direct training), with the right-to-left practice proving a bit more effective than common sum practice. In a follow-up study, McNeil et al. (2010) primed U.S. college students with addition questions and showed that their performance declined on equivalence problems such as 6 + 8 + 4 = 7 + _. So, although U.S. children are apparently able to overcome their early overtraining to respond to the + and = signs operationally, in a default “add from left to right” way, the adult traces of that early overtraining, itself based on generatively entrenched learning mechanisms, appear accessible to the kind of forensic examination that McNeil and her colleagues carried out.
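The error patterns reported for equivalence problems can be stated precisely by setting the relational reading of “=” beside the two operational responses described above. This is a sketch for illustration only: the function names are hypothetical, and the representation of a problem as two lists of known addends is an assumption, not McNeil's coding scheme.

```python
def relational_answer(left, right_known):
    """Correct reading: the blank makes both sides of '=' equal in value."""
    return sum(left) - sum(right_known)

def add_to_equals_sign(left, right_known):
    """Operational error: write the sum of the numbers left of '='."""
    return sum(left)

def add_all_numbers(left, right_known):
    """Operational error: write the sum of every number in the problem."""
    return sum(left) + sum(right_known)

problems = {
    "1 + 4 = _ + 2": ([1, 4], [2]),
    "6 + 8 + 4 = 7 + _": ([6, 8, 4], [7]),
}

for text, (left, right_known) in problems.items():
    print(
        text,
        relational_answer(left, right_known),
        add_to_equals_sign(left, right_known),
        add_all_numbers(left, right_known),
    )
# 1 + 4 = _ + 2 3 5 7
# 6 + 8 + 4 = 7 + _ 11 18 25
```

Only the first function treats “=” as a relation between two sides. The other two treat it as a prompt to add from left to right, which is the default that the worksheet formats described above would tend to reinforce.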
The lesson I draw from these studies is that symbolic operations are scaffolded on even more basic mechanisms for detecting and learning about regularities in subjects’ experiences. Given practice with lots of examples where the task is always the same such as reading some numbers from left to right and summing them to fill in the blank, that pattern is overgeneralized to different kinds of problems where the response should be something different. Someone who makes this mistake may (the adult case) or may not (the second-grader case) know what the “=” sign “really means,” but the development of such an understanding in the adult requires first a more nuanced discrimination among the contexts in which “=” appears and learning what manipulations are permitted in those contexts. The external symbolic sequences are the scaffolding for building those discrimination abilities. To understand further, we must turn to a closer investigation of human perception of symbols.
Symbols as Stimuli
Landy, Allen, and Zednik (submitted) argue that symbolic reasoning mechanisms are built out of three primary components: (1) a physical medium with some notational marks on it, (2) a perceptual system accustomed to processing stimuli provided by the world in general and notational systems in particular, and (3) a motor system which can update those marks in ways that are guided by past experience. In the well-trained child, these three components frequently succeed in producing behavior that accords with formal mathematical and logical rules, through having learned appropriate ways to manipulate the notational structures. In general, we believe, symbolic reasoning is a uniquely perceptual affair: sensorimotor systems interact with and reproduce the actual perceived details of physical notations. The reasoning is internalized only insofar as the symbolic vehicles are directly imagined.
The extensive use of symbols is a cultural achievement that of course goes far beyond the kinds of symbolic reasoning activities seen in algebra and logic. Modern humans are born into an environment suffused with symbol-guided behavior, and they reliably acquire these behaviors as they grow up. The symbol systems in the environments in which each of us has grown up vary even to some degree between peers, and even more from the environments in which our grandparents were raised. Nevertheless, there is sufficient overlap to make sophisticated, precise, coordinated behavior not just possible, but likely. This cultural achievement has depended on a fair amount of conscious and unconscious tinkering with the symbol systems in which we are immersed. Consider, for example, how a dot notation for products may help with perceptual grouping: compare 3 ∙ 4 + 7 with 3 × 4 + 7. Another example concerns how the height difference between operators and operands in arithmetic formulas helps reasoners latch onto their different semantic roles. The same height convention has to some degree been unconsciously adopted in symbolic logic (P→Q, P∨Q, etc.), although logicians have been less consistent than mathematicians in various respects; for example, “P&Q” flouts the convention while “P∧Q” embraces it, even though both represent the same meaning. The effectiveness of the symbols in any given cultural context depends on their suitability for common processing by human-specific perceptual processes. Parasymbolic features such as spacing and relative size and complexity (e.g., that we typically write numerals with more curves than operator signs) contribute to the effectiveness of the systems we use. External symbols afford novel means of interacting with the world that are created and propagated according to their utility.
We have, I believe, systematically failed to realize the extent to which these symbols have been shaped by the peculiarities of human perceptual capabilities. Take music, for example, which exhibits a large range of cultural variation and is widely supposed to relate to the human capacity for imbuing sounds that we make with symbolic significance. It has long been supposed that there is something uniquely human about music appreciation, and the deep connections between mathematics and music are also of long-standing interest. Other animals show very limited interest in musical sequences that send humans into rhapsody. Thus, for instance, McDermott and Hauser (2007) conducted a study of responses by cotton-top tamarins and common marmosets to music that reinforced the consensus view, concluding that these monkeys preferred silence to all the types of music that they tested. However, the vast bulk of such studies, this one included, have failed to take into account the specific features of the human perceptual system that make our music effective for us. Bucking this trend, Snowdon and Teie (2010) built a different approach on a comparative analysis of the natural vocalizations of humans and cotton-top tamarins. This analysis indicated that monkey vocalizations are approximately eight times as fast and three octaves higher than their human counterparts. They surmised that human music (falling in the range of 40–208 beats per minute) might be too slow and low to be interesting for cotton-tops. They commissioned some species-specific “music” based on the higher tempos of the monkeys’ own vocalizations and played it to the monkeys. Their results indicate that the monkeys showed sustained interest and affective (emotional) responses that were absent for the human music. (Science journalists and bloggers picked this up with headlines such as “Monkeys Like Metallica,” representing a bit of a stretch from the original findings.)
Some readers may be inclined to think that musical performances and musical appreciation are not bona fide cases of symbolic cognition. I don’t especially want to argue the point here. Rather, the main issue for the present is the extent to which stimuli that are more effective for humans than for other animals may be so effective for reasons having more to do with general aspects of temporal or spatial structure than to do with innate capacities for music, mathematics, language, or symbol use more generally. The case of music helps underscore the point that cognition is a dynamical activity. Symbol sequences unfold in time. Even static symbol structures written in a physical medium are not grokked as a whole, but scanned sequentially in a time-bound process. Consider eye movements during reading. When reading aloud, the mean saccade is 1.5°, and when reading silently it is 2°, with typical fixation durations in the range of 225–325 milliseconds (Rayner and Castelhano 2007). This puts the total number of fixations per minute for readers inside the upper half of the range for music given above. The same brain is at work in each.
Such considerations, along with others derived from the studies I described in the previous section, reinforce the point that when it comes to understanding our own facility with symbols, we remain woefully ignorant of basic perceptual features of the symbol systems which drive our behavior. The exploitation of empty space between what we take to be the “real” symbols, directional biases in our operations with such symbols, and the temporal aspects of symbol perception, all derive from general features of the human perceptual and motor systems which scaffold the development of more complex cultural achievements with symbols. On such a view it is no accident that nursery rhymes tend to be at an andante (walking) pace, that we learn to sing our ABCs with a nursery rhyme tune, and that recitation of times tables proceeds commensurately. (There are almost certainly other aspects of rhyme and verse that scaffold memorization for facts, but a thorough review is beside the point here.)
Our symbol structures and our cognitive capacity to use symbols are likely to be coadapted. The pace of chanting is somehow related to memorization. A sweep of 2° allows us to foveate on enough of the text to process it within the typical duration of fixation. It will be hard to disentangle cause and effect between features of symbols and their perception, between eyeball dynamics and the cultural adoption of typefaces and printing layouts that support a literate society. Do we read at the rate we do because that’s what the typeface permits? Or do we design typefaces as we do because that is what suits our eye movements? (See Reynolds 1988.) There is almost certainly an interaction between the two. We know that the visual system is not intrinsically constrained to operating within the limits typically seen when reading. For visual search, mean saccade size is 3° and the fixation duration is in the 180–275 millisecond range, and for general scene perception the typical fixations are slightly longer (260–330 milliseconds) but the mean saccade is larger, at 4°. Further investigation is needed into why the parameter range for reading is different from the range for other visual tasks. The cognitive capacity that we blithely summarize as “literacy” has been constructed through a complex feedback among processes at multiple time scales, including neural processes that operate in milliseconds, developmental processes operating over months and years, cultural dynamics operating over decades, and evolutionary dynamics operating over multiple generations. The difficulties of understanding such complex causal interactions are enormous.
Pity, too, the chimpanzee or family dog confronted with a symbolic environment full of structural properties that humans have tuned and become attuned to. From the earliest age, the dynamics of their own perception and locomotion experiences are different. It would be something of a coincidence if human speech or human writing happened to be something they could easily process—although not a miracle, as some of the larger parrots seem to have affinities for human speech, exploited to great effect by Irene Pepperberg (1999), and most dogs have some capacity for single words and even short combinations (Pilley and Reid 2011).
Finally, now is an appropriate point to remind readers that I am talking about the perceivable features of symbols themselves, not the separate issue of how perceivable features of the things that the symbols stand proxy for get bound to those symbols. Barsalou’s (1999) sensorimotor view of concepts falls into the latter category but has little to say directly about the role that perception plays in processing the symbols themselves.
Seeing Stimuli as Symbols
At this point I imagine skeptical readers thinking that by focusing on symbol perception I have neglected the core of what it means to be a symbol user. Such a reader may be thinking that, of course, we have to perceive symbols and be sensitive to the structures in which they are embedded if we are to make use of them as symbols. And of course, this means that symbols have to be fit, in some way, to our perceptual capacities. But isn’t there still something special about the human capacity to treat something as a symbol? Isn’t this special thing what makes us interpreters and leads eventually to theory of mind, and all the other good stuff that constitutes human uniqueness?
Notwithstanding the lengthy history of attempts to teach apes to use human symbolic systems, the question of how the capacity to treat something as a symbol is scaffolded has been rather neglected. An exception is provided by Brendan McGonigle and Margaret Chalmers, who discuss the importance of manipulative skills (by which they mean, literally, manual dexterity) for effective symbol use (McGonigle and Chalmers 2002, 2006). Jean Piaget (1971) conducted his famous experiments on transitive inference using colors as symbolic of length by requiring his young subjects to pull colored rods from a box. In their pioneering investigations of transitive inference in rhesus monkeys, McGonigle and Chalmers found the use of computer touch screens essential for circumventing the limitations of monkeys that were due, as they put it, to the “serious manipulative restrictions imposed by their motor control systems” (McGonigle and Chalmers 2002, 320). Subsequently they wrote, “With these techniques, we are now in a position to evaluate whether a new cycle of causality might be created ... whereby cognitive systems are scaffolded to new heights of achievement” (McGonigle and Chalmers 2006, 263). Nevertheless, in their touch-screen-based experiments, there is a clearly preestablished distinction between symbols and other stimuli. The symbols are clearly marked both perceptually and in terms of the manipulations they afford: they are on the screen, inedible, can only be moved or removed by direct touch, and so forth. The monkeys in these studies did, in one sense, have to see the on-screen stimuli as symbols in order to figure out what to do with them. However, the symbol/nonsymbol distinction was strongly scaffolded on some very salient featural and contextual cues.
A more recent study by Schmitt and Fischer (2011) has shown that monkeys may not always be dependent on such a clearly marked separation between symbols and what they represent. When confronted with two different-sized piles of pebbles representing a quantity of nuts they will receive, olive baboons and long-tailed macaques are successful in approximately 85% of trials at choosing the larger pile. In contrast, when choosing directly between piles of nuts they will receive, monkeys select the larger pile in fewer than 70% of trials. These and similar results from other studies are usually explained in terms of weaker inhibitory control when confronted with actual food. However, Schmitt and Fischer also ran a third “food replaced” condition, in which the monkeys chose between piles of nuts but, instead of receiving the nuts they could see, were given an equivalent number of hidden nuts. Over repeated trials, their performance in selecting the larger pile was statistically indistinguishable from their performance when choosing between piles of pebbles. Thus, when the nuts are merely symbols for the nuts underneath, not actually food items in the present context, these monkeys seem capable of responding to them differently than when they are the direct objects of food interest. (Recall, here, the characterization of “symbol” from the introduction, as something used as a means to orienting an audience toward, or acting upon, something other than the symbol itself.)
I take the lesson to be that cognitive capacities are scaffolded not just on perceptual features of symbols themselves, nor just on perceivable features of assemblies of symbols, but also on the larger context of affordances in which those assemblies occur. A group of nuts is sometimes just food. However, sometimes a group of nuts can symbolize other nuts. To treat the stimulus as a symbol rather than as a consumable food item is within the capacities of the monkey, given the particular context of social interactions between human and monkey in this experiment. But of course, such contexts rarely extend outside the experimental situation for the monkeys. Nothing further gets built on it, and the capacity goes nowhere. Humans, however, are surrounded by such interactions—something is now a symbol, now a consumable. A peanut can be a snack or a poker chip, or simultaneously both (literally playing for peanuts). We humans are enormously flexible in our symbolic behaviors. The exact interplay between basic biology, development, and culture in that flexibility is, I submit, largely unknown at this point.
A Developing Story
The details are largely unknown because studies of cognitive development are largely in their infancy, and the field as a whole has been and continues to be the site of considerable cross-disciplinary tribal warfare. Cognitive ethologists, comparative psychologists, and developmental psychobiologists rarely see eye to eye on theory, methodology, or results. Field studies are derided as lacking sufficient controls by laboratory experimentalists, while field biologists are skeptical of “artificial” laboratory tasks. Developmental psychobiologists are skeptical of the way questions are framed in mentalistic terms. Studies of individual “star” animals such as Kanzi (Savage-Rumbaugh 1996) or Alex (Pepperberg 1999) are questioned for their apparently anthropomorphic hypotheses and for their general applicability, while proponents of those studies question whether the training techniques used for high throughput studies of pigeons or rats are appropriate for understanding cognition. Another problem is that comparative psychology has been pursued within a framework of “trophy hunting” (Allen 2012) that attempts to match animals of adult age to benchmarks in human development, such as mirror self-recognition (Gallup 1970; Gallup, Anderson, and Shillito 2002) or the false belief task (Premack and Woodruff 1978). The approach is exemplified by such meaningless comparisons as “the chimpanzee has the cognitive capacities of a two-and-a-half-year-old.” (See Stotz and Allen 2011 for a more detailed critique of this practice.) When individual animals are used in multiple studies, only rarely is it systematically investigated how their experiences in the different experiments may scaffold performance in future experiments (McGonigle and Chalmers 2002, 2006). One problem is that systematic investigation is very, very hard. 
Ideally, a fully developmental approach to cognition would consider the entire range of developmental inputs as potentially relevant to the outcome of any particular experiment. However, the sheer complexity and adaptability of animals, whether human or nonhuman, to different life histories makes full investigation of this space impossible. And, it should go without saying, the full range of experiments required is not ethically practicable—not to mention that there is no consensus among ethicists even on whether it is permissible to keep animals captive for noninvasive cognitive experimentation.
Is a full theory of cognitive development therefore unattainable? Perhaps, but that doesn’t mean we can’t make progress. I hold a fairly optimistic view according to which one can proceed by trying to build bridges between parts of the space initially under separate investigation. Empirically informed philosophers and philosophically minded scientists are well-placed to build such bridges. That is why it is exciting to see philosophers of biology and philosophers of cognitive science coming together with theoretically minded scientists in this volume, to think about scaffolding across all the various time scales at which evolution, culture, development, and cognition operate.
The experiments I have described above suggest how the most fundamental and generatively entrenched aspects of perception and learning operate within an environment of culturally shaped and individually perceived symbolic structures, and how these basic perceptual and learning processes brought to bear on social transactions involving those structures may scaffold the development of more sophisticated and subtle forms of symbol perception, reasoning, and cognition. I have illustrated how studies of human and nonhuman subjects can help us think about how to tease out aspects of the relationship between biology, culture, and development. I have also illustrated how a forensic approach can uncover traces of learning histories that may not be immediately apparent from the adult performance. Scaffolding leaves marks. We are clever enough to read the signs of those earlier symbols.
Acknowledgments
I thank Cameron Buckner, Lena Kästner, Ulrike Pompe, Albert Newen, and Leon de Bruin, in addition to the volume editors Bill Wimsatt, Jim Griesemer, and Linnda Caporael, for the scaffolding they provided for the marks on these pages. If they have failed to leave any marks, it is due to my own impenetrability. I also would like to thank the Humboldt Foundation and Indiana University for financial support during the preparation of this manuscript, and the Ruhr-University, Bochum, for providing a hospitable working environment during this time.
References
Allen, C. 2012. Private Codes and Public Structures. In The Complex Mind: An Interdisciplinary Approach, edited by D. McFarland, K. Stenning, and M. McGonigle-Chalmers, 223–242. London: Palgrave-Macmillan.
Barsalou, L. W. 1999. Perceptual symbol systems. Behavioral and Brain Sciences 22:577–660.
Fodor, J. A. 2008. LOT 2: The Language of Thought Revisited. New York: Oxford University Press.
Gallup, G. G., Jr. 1970. Chimpanzees: Self-recognition. Science 167:86–87.
Gallup, G. G., Jr., J. R. Anderson, and D. J. Shillito. 2002. The Mirror Test. In The Cognitive Animal, edited by M. Bekoff, C. Allen, and G. Burghardt, 325–334. Cambridge, MA: MIT Press.
Landy, D., C. Allen, and C. Zednik. (submitted). A perceptual account of symbolic reasoning.
Landy, D., and R. L. Goldstone. 2007a. How abstract is symbolic thought? Journal of Experimental Psychology: Learning, Memory, and Cognition 33:720–733.
Landy, D., and R. L. Goldstone. 2007b. Formal notations are diagrams: Evidence from a production task. Memory & Cognition 35:2033–2040.
Mackie, J. L. 1974. The Cement of the Universe: A Study of Causation. Oxford: Clarendon Press.
McDermott, J., and M. D. Hauser. 2007. Nonhuman primates prefer slow tempos but dislike music overall. Cognition 104:654–668.
McGonigle, B., and M. Chalmers. 2002. The Growth of Cognitive Structure in Monkeys and Men. In Animal Cognition and Sequential Behavior: Behavioral, Biological, and Computational Perspectives, edited by S. B. Fountain, M. D. Bunsey, J. H. Danks, and M. K. McBeath, 287–332. Dordrecht: Kluwer Academic.
McGonigle, B., and M. Chalmers. 2006. Ordering and executive functioning as a window on the evolution and development of cognitive systems. International Journal of Comparative Psychology 19:241–267.
McNeil, N. M. 2007. U-shaped development in math: 7-year-olds outperform 9-year-olds on equivalence problems. Developmental Psychology 43:687–695.
McNeil, N. M. 2008. Limitations to teaching children 2 + 2 = 4: Typical arithmetic problems can hinder learning of mathematical equivalence. Child Development 79:1524–1537.
McNeil, N. M., B. Rittle-Johnson, S. Hattikudur, and L. A. Petersen. 2010. Continuity in representation between children and adults: Arithmetic knowledge hinders undergraduates’ algebraic problem solving. Journal of Cognition and Development 11:437–457.
Pepperberg, I. M. 1999. The Alex Studies: Cognitive and Communicative Abilities of Grey Parrots. Cambridge, MA: Harvard University Press.
Piaget, J. 1971. Biology and Knowledge. Edinburgh: Edinburgh University Press.
Pilley, J. W., and A. K. Reid. 2011. Border collie comprehends object names as verbal referents. Behavioural Processes 86:184–195.
Premack, D., and G. Woodruff. 1978. Does the chimpanzee have a theory of mind? Behavioral and Brain Sciences 1:515–526.
Pylyshyn, Z. 2003. Return of the mental image: Are there really pictures in the brain? Trends in Cognitive Sciences 7:113–118.
Rayner, K., and M. Castelhano. 2007. Eye movements. Scholarpedia 2:3649. Accessed at http://www.scholarpedia.org/article/Eye_movements on May 30, 2011.
Reynolds, L. 1988. Legibility of type. Baseline Magazine, issue 10.
Savage-Rumbaugh, E. S. 1996. Kanzi: The Ape at the Brink of the Human Mind. New York: Wiley.
Schmitt, V., and J. Fischer. 2011. Representational format determines numerical competence in monkeys. Nature Communications 2:257. doi:10.1038/ncomms1262.
Snowdon, C. T., and D. Teie. 2010. Affective responses in tamarins elicited by species-specific music. Biology Letters 6:30–32.
Stotz, K., and C. Allen. 2011. From Cell-Surface Receptors to Higher Learning: A Whole World of Experience. In Philosophy of Behavioral Biology, edited by K. S. Plaisance and T. A. C. Reydon, 85–123. Boston Studies in the Philosophy of Science. Berlin: Springer.
Wimsatt, W. C., and J. R. Griesemer. 2007. Reproducing Entrenchments to Scaffold Culture: The Central Role of Development in Cultural Evolution. In Integrating Evolution and Development: From Theory to Practice, edited by R. Sansom and R. N. Brandon, 227–323. Cambridge, MA: MIT Press.