Media of Reason: A Theory of Rationality

There is no media theory.

—J. Baudrillard¹

The previous reflections on the media concept should have shown three things. First, in all of the theories presented, the respective concept of the media played a prominent role. Except for McLuhan’s theory, which fashions itself as a media theory, the media concepts, however, largely owe this role to necessities of theory design. The media theories primarily have a subsidiary function within the framework of the antecedent theoretical intentions; hereby they take on a role that they cannot just shake off, for none of the theoreticians introduces the concept of the medium on the basis of a conceptually independent media theory.

Second, because of the lack of a general media theory with basic conceptual autonomy, the media concepts presented are so influenced by the characteristics of concrete, prototypically introduced media that in the course of their implementation theoretical frictions are established that are more suited to ensure the disavowal of the media concept than to secure it new theoretical attention. In order to develop a robust media concept, it thus appears to me to be necessary to take up the risky task of developing an independent media concept on the basis of fundamental principles; here the concept will not be acquired by generalizing characteristics of the concrete media (e.g., money, language, writing), for this procedure entails the risk that the specifics of such examples in each case will preform the concept that is to be developed in inappropriate ways, and where numerous paradigmatic media are used, a conglomerate of incompatible models may become the basis of media theory. In order to avert the danger of distorting the concept from the outset by systematically orienting it in reference to particularities or inner heterogeneity that tends to develop as methodological inconsistencies, a sufficiently general concept must be chosen as the basis of a conceptual definition of media. If the conceptual language of a media concept is not to be tailored to specific media, and thus to serve merely to specify an individual medium, then in order to be robust that language must be so general that the specific individual media can be described by further specifying the general concept.

In my view, a further result of the previous chapter lies, third, in the diagnosis that media theories converge by assigning specific scopes of behavior. In each of the presented theories—and the theories converge in this—the media concept serves to characterize specific scopes of possibility that are available in the form of media. Parsons, Habermas, and Luhmann conceive of media as mechanisms that open up new possibilities for interaction against the background of the open, nonspecified horizon of language; they achieve this by limiting the scope of linguistic action coordination in their specific way. Through the exoneration of the risks of the action coordination, which arise due to the unlimited horizon of language, scopes of possibility emerge that can be understood as specifications (Parsons, Luhmann) or as specifications or substitutions (Habermas) of linguistic interactions. Also from the action-theoretic perspective in accord with which Dewey develops his media concept, media appear to be sets of specific possibilities for action, which Dewey obligates art to develop. And even McLuhan’s vaguer media concept achieves its diagnostic power solely against the background of media-dependent possibilities for perception and interaction.

In light of the convergence, finally a connection can be made out between the display of the scope of action that emerges in the framework of media-integrated interaction and the possibility for understanding actions and social processes against the background of these possibilities for action. Before a precise analysis of this connection can reap the rational-theoretic reward of a general media theory, first, we are faced with the task of developing the foundation for such a theory. The first section of this chapter is thus dedicated to the sober task of formulating elementary concepts of a media theory.

Bracketing the system-theoretical approach, the short version of the background thesis of the diagnosis of convergence is:

(M1) Every medium presents a specific number of possibilities for action that arise for the actors.

Before I begin the foundational conceptual work, however, I would like to clarify which theoretical perspective I think is capable of solving these conceptual groundwork problems. This perspective comes to light when a more basic dimension of the convergence thesis is viewed than the ability to propose a concept of medium that can serve as the lowest common denominator for the (intelligible) media concepts in their various theoretical contexts. If one more specifically calls to mind the implications of the convergence thesis, it soon becomes clear that the convergence thesis provides a metatheoretical diagnosis. It maintains that the media concept plays a central role in attempts to understand processes of social interaction. The thesis is not primarily supported by material kinship relationships that exist among the media concepts; rather, it views the function that the media concept has in the respective theories as something that connects the conceptions. The theoreticians introduce the concept of the medium in each case in order to facilitate the understanding of interaction processes as processes of understanding. With the help of the media concept, social interaction processes are intelligible as communication processes. The media concept makes it possible to describe forms of interactive behavior as forms of interactive action and indeed by means of the assumption that acting social individuals understand each other reciprocally as beings that choose behavior patterns (or sequences of behavior patterns) from a shared stock of types of behavior patterns.

Because chosen types of behavior patterns such as these owe their status as (proto-)actions to the interpretations of this behavior, the theoretical perspective that I use in the attempt to develop a basic concept of media theory is an interpretationist perspective; however, as I will show, it is a variation of interpretationism that considerably loosens its connection to the linguistic paradigm. The starting point for the development of media theory is, first, the assumption that the concept of the media plays a role in the attempt to understand those social interaction processes in which language plays no role at the surface level of the interaction. Here there is no difference in principle between the role that the media concept plays for understanding in theoretical contexts and the role that the media play for those interacting in the social processes that are to be understood: just as the social theoreticians, from their interpretive perspectives, understand the observed interaction processes against the background of a hypothetical medium, so too, those interacting socially understand the behavior of their counterparts as actions against the background of assumed behavioral alternatives, that is, against the background of medial possibilities.

If I attempt to develop media theory as an independent theory in what follows, then, in light of the interpretationist perspective, this attempt must assume a form in which it is possible to show the plausibility of increasingly decoupling understanding from language. If one begins with the familiar concept of radical interpretation in which an interpreter correlates expressions in an object language with metalinguistically articulated truth conditions and in this way formulates empirically testable T-theorems, then a procedure emerges that successively reduces the linguistic preconditions for radical interpretations so that each theoretical element that the foundational vocabulary of an independent media theory must relate to can be identified. I attempt to achieve this process in three steps:

1. In the first step I confront interpreters who are fully able to use language with nonlinguistic expressions of producers who are able to use language. In order to do this, I will revisit the context in which Dewey developed his view of media—that is, the context of communication at the level of aesthetic experience—and investigate the understanding of nonlinguistic expressions that we commonly refer to as works of art. Here I am above all concerned with showing that the understanding of works of art can be viewed as an exemplary case of the understanding of nonlinguistic expressions. Under observance of an important and in my view rightly widespread intuition, namely, the view that works of art cannot be translated linguistically, this can nevertheless be reconstructed with the tools of media theory as a genuine case of understanding, and indeed as a case of radical interpretation in which the interpreter makes use of hypothetically assumed media. Within the framework of this scenario I start with the familiar assumption: I examine a case in which art is understood and in which both the interpreter and the one that is supposed to be interpreted can use a natural language. The goal of my analysis is to show that media play a mostly unarticulated role in the interpreter’s interpretation language, but also in the aesthetic production of those who are supposed to be interpreted (pp. 143–220).

2. In the second step I want to further loosen the linguistically bound prerequisites of the first scenario by assuming a more radical situation of interpretation, a situation in which it is not clear whether those who are supposed to be interpreted even speak a language. For this purpose, I introduce a situation from field research in which two competing interpreters who are able to use a fully developed interpretive language investigate members of a fictitious ethnicity in order to find out whether those being interpreted even speak a language and which of their observable means of behavior are linguistic or nonlinguistic (perhaps artistic) expressions. In the framework of this scenario it is not only presupposed that those being interpreted do not use a language for those expressions that the interpreters are trying to understand, but it is even imagined that those being interpreted may not speak any language whatsoever. The goal of the investigation at this level is to specify the role of the media concept against the background of the problem of providing a suitable conceptual reconstruction of a situation in which the interpretation of beings who do not speak a language is carried out by interpreters who do speak one (pp. 220–233).

3. In the third step, a level should finally be reached from which we investigate interactions between beings who we presume do not use anything that we would call a language, but whom we nevertheless view as beings that communicate with one another. The question that is to be raised here is whether the tools of media theory are able to characterize a level of sublinguistic communication that can be understood as an evolutionary-theoretic link between the level of completely developed linguistic communication and mere causal interaction processes. In the framework of this scenario, media theory has to prove that it is able to avoid the circular implications of the established interpretationism; for with a view to the question of how the primacy of the interpretation for the development of language, meaning, and mind can be reconciled to assumptions of evolutionary theory, interpretationism appeases itself with the answer that all speaking beings have parents, consequently parents that possess an interpretation language. However, this raises the question whether we should assume that in the evolutionary history of humans, which must include a transition from a species that does not speak a natural language to humans who do speak one, there were parents who, in contradiction to the interpretationistic assumption, were interpreters of their children by virtue of their genetic disposition. On the other hand, it is not easy to see how this transition can be construed as gradual, because the rationality that is imputed by the interpreters has a constitutive all-or-nothing character that eludes reconstruction as something that can become established gradually (pp. 233–240).

In contrast to Parsons and Habermas, I thus do not attempt to develop the media concept in a social context in which a language is present as a medium from the outset, which social actors can use to make agreements—for example, to treat something as money—in order, in this way, to institutionalize media, whose specific behavioral possibilities may indeed facilitate special economic forms of interaction, but that in principle do not go beyond the linguistically individualizable behavioral possibilities. In the framework of the outlined three-step process, I would like rather to show that media can be understood as sets of behavioral possibilities that do not necessarily depend on the individuating power of language.

3.2. AN INTERPRETATIONISM EXPANDED BY MEDIA THEORY

3.2.1. What Does It Mean to Understand a Work of Art?

As a first step toward expanding interpretationism with the aid of media theory, I want to show the plausibility of the view that artistic media provide behavioral possibilities, which, under certain conditions, can also serve as possible ways of individuating thoughts that cannot be individuated by means of language. Setting out from basic and widespread intuitions regarding that which is expressed by works of art, I want to show how the resources of media theory allow us to develop a view of the production of works of art and their interpretive reception that makes it possible to rationally reconstruct these intuitions. In doing this, I will attempt to develop a vocabulary of media theory, in debate with the genuine nonlinguistic, communicative forms of art, that is nonetheless sufficiently general to allow the apprehension of nonlinguistic forms of communication beyond the domain of art.

3.2.1.1. Difficulties with Intuitions

In developing our own interpretationist perspective with a view to the understanding of works of art, we are confronted with the following difficulty: the established interpretationism, which is influenced by Davidson, conceives of understanding as a process in which we correlate the statements that are to be interpreted with (metalinguistic) sentences. However, insofar as these interpreting sentences are metalinguistic, a linguistic structure is read into what is interpreted; it is initially at least questionable whether, with a view to the understanding of nonlinguistic works of art, this consequence is appropriate. For besides the theoretical questions regarding whether works of art possess a predicative structure and which role the concept of truth plays in connection with the understanding of works of art, it is not clear how an interpretationist theory can accommodate the intuition that works of art articulate thoughts that cannot be linguistically articulated.²

Yet, if with regard to this difficulty, we bear in mind, for example, the basic models that we employ to speak about music,³ then it is apparent that the model of language exercises an enormous influence even on those artists and theoreticians who insist on the irreducibility of music as an independent form of expression. Strangely, these advocates of the autonomy of musical expression often fail to elude the analogy between music and language; thus, they are forced to take refuge in paradoxical formulations that maintain the independence of music from language precisely with the help of the analogy of language. For example, in addressing the question “What is music?” in 1932 Anton Webern answered:

Music is language. A human being wants to express ideas in this language, but not ideas that can be translated into concepts—musical ideas.⁴

Like Webern, Adorno is also convinced that music is based on a form of quasi-linguistic thinking, which however cannot be transcribed into language; but Adorno does not want to allow the analogy between music and language without further ado. Because language is also the medium of a subsuming form of thinking, a form of violence that Adorno had traced back to the fibers of conceptuality, music is not to be an accomplice to it. Yet, because he cannot bring himself to abandon the language analogy, he must attempt to state the putative linguistic character of music and its (simultaneous) distance from language paradoxically: “It is by distancing itself from language that its resemblance to language finds its fulfillment.”⁵ A glance at a dictum of Eduard Hanslick, who is associated with the autonomy of music like no other, clearly indicates that it is not first in the twentieth century that music is deemed a language. Hanslick writes:

Music has sense and logic—but musical sense and logic. It is a language which we speak and understand yet cannot translate.⁶

Whatever allure the formulations of Adorno, Webern, and Hanslick may appear to have on a cursory reading, the notion of an untranslatable language raises the suspicion that, in the end, the formulations are attractive solely by virtue of their unintelligibility. For, does it make any sense whatsoever to speak of something as a language if what is supposed to be expressed with its help is not translatable into another language? On the contrary, isn’t it basic to our view of languages that they can be translated into one another? Can we view sounds that members of a culture that is foreign—for us—exchange among one another as a language without at the same time believing that they are in principle translatable into our language?⁷

The attempt to follow the intuition that music is an expression of an independent form of thought (and to that extent requires an independent concept of understanding) by trying to secure the independence of music through the employment of the analogy of language appears to me to result in one of two equally unattractive consequences: either we compromise our concept of language by allowing untranslatable languages (music being among them) or we understand music—in flagrant contradiction to the intuition that it is independent—as a type of deficient language, the status of which can only be clarified by relying on the earlier dignity of language. Even if the analogy to language is inappropriate and, in the cited texts, appears rather inept at plausibly showing the independence of music, the intuition that it is supposed to help express is still clear:

(I₁) Like language, music (along with other arts) provides a resource with the help of which it is possible to articulate (musical) thoughts;

a. consequently it is appropriate to believe that we can understand musical works of art; and

b. if we understand musical works of art, then we understand thoughts that cannot be expressed in language.

This intuition (I₁) expresses to a certain extent that music is not an instrument that, solely on the basis of our knowledge of the human organism, is applied to produce causally describable effects. (I₁) does not view music as a (legal) drug or psychopharmaceutical, and it insists that music has nothing to do with either Kant’s⁸ or Dr. Rueger’s⁹ medicine chest, because the mere pharmacological use sterilizes its capacities for interpretation.¹⁰

On the other hand, despite the capacity of music to articulate thoughts, it is not a language, for music has no predicative structure, and the endeavor to develop a grammar of music on the basis of an analogy to Chomsky’s generative grammar, such as Lerdahl and Jackendoff have attempted, creates many more problems than it is able to solve.¹¹

3.2.1.2. Works of Art as Products of Ordinary Action

The scarcely plausible consequence of reading a linguistic structure into nonlinguistic works of art can indeed be avoided within the framework of interpretationism, but only at a price: works of art are understood against the pattern of common instrumental explanations of action, and (hereby) as actions or consequences of actions that actors have carried out because they believe that the works of art are appropriate means for achieving certain (for example, expressive) goals. An analysis of this sort does not in fact structure the work of art as a linguistic expression, but it ascribes thoughts to the producers of works of art—specifically, beliefs and preferences—that exhibit the common structure of thoughts composed by language and that present the reasons for the production of the works of art. According to this perspective, understanding a work of art thus consists in nothing but identifying those beliefs and intentions that allow us to rationalize the production of a work of art.

An instrumentalist analysis of artistic action, however, clashes with the further widespread intuition that artistic action concerns a special form of intentionality. According to this intuition, artistic actions are indeed intentional in the sense that works of art are not involuntary expressions; but works of art are not expressions of intentions that those who create them have independently of the works of art. (I₂) attempts to articulate some of the motifs that lie behind this often only vaguely formulated intuition.

(I₂) Works of art are intentional products of those who produce them, but:

a. The type of intentionality does not appropriately come into purview if works of art are merely understood as instruments for realizing the intentions of those who produce them; for in some way unintentionality and happenstance play a role in the production of works of art.

b. A work of art is not adequately understood if we can say “what the artist wanted to say with its help”; understanding works of art does not consist in identifying the intentions that the artist may have had. In any case, this does not exhaust it.

c. If we could say without reservation what a work of art is supposed to express, then we would not need the specific form of the work of art in order to articulate what the work of art does express. Works of art are forms of expression of thoughts that require the specific form of the respective work of art for their individuation. That is the reason works of art are not related to intentions the way means are related to ends.

(I₂) insists that works of art are indeed intentional products, but they cannot be understood according to the model used to rationalize common behavior. However, insofar as my analysis is based on an action-theoretic perspective for analyzing nonlinguistic communicative acts, the systematic problem consists in developing an understanding of such acts that is able to accommodate the mentioned intuitions. Before taking this up, however, I would like to clearly explicate the difficulties that arise if works of art are understood in accord with the common action-theoretic perspective.

In adopting the action-theoretic perspective that I have accepted in developing a media theory, one runs up against a problem: how is it possible to introduce a concept of medial action without making media theory conceptually dependent on the general analysis of action? This is problematic because the standard general theory of action conjoins actions with propositional attitudes; so in using the model to explain media theory, it appears to make the possibility of (medial) action dependent on the speech competencies of the actors. In attempting to characterize medial action as a genuine form of action, the media theory being developed here is not just an affront to system theory, but it also stands in a strained relationship to the standard analysis of action, which casts action in terms of propositional attitudes.

If, with the help of the standard analysis, we view a behavior as an action, then we start with a description of the behavior that uses intentional vocabulary, i.e., we position the observable behavior of a person P (a) in relation to her mental conditions, (b) in relation to some meaning that the activities of P have for P, and (c) in relation to the observable consequences of the activities that P performs. The practical syllogism integrates these three presuppositions of an intentional explanation in the form of an argument in which (a) and (b) function as premises and (c) as the conclusion:

(H1)

a. P intends to achieve Y;

b. P believes that X is a means to achieving Y;

c. (ergo) P does X.¹²

In the context of my reflections, two problems are connected with (H1). For one, (H1) is not suited for the analysis of artistic action because it is irreconcilable with intuition (I₂); for another, (H1) poses a metatheoretical problem, for the problem that arises with a view to the independence of the basic concepts of media theory consists in the fact that, in accord with the perspective of the linguistic turn, intentions and beliefs must be described as propositional attitudes, so that the concept of action is made conditional on the existence of the actors’ propositional attitudes.¹³ However, were we to proceed from such a conception of action, then the main reason for developing the media concept here (namely, to contribute to a concept of understanding and rationality that is not based exclusively on language) would be illegitimate from the outset; for the basic motivation for incorporating media theory into the debate on the conception of rationality as a possible fundament consists precisely in expanding that fundament beyond language.

If, in order to sketch out a concept of action that brings us closer to solving both the problem of the specific intentionality of artistic action and the metatheoretical problem, we now attempt to show that it is plausible to develop a concept of action that does not entail constitutive linguistic presuppositions, then two strategies emerge. For one, it is questionable whether the fact that intentions and desires assume the form of propositions in intentional explanations necessarily means that the intentions and desires of the actor must be present as propositions. In line with this strategy, it would have to be claimed that the rationalizing interpretation of the behavior of individuals works with ascriptions that, as linguistically articulated, can only assume the form of propositions, but this does not allow any inferences about the form in which actors represent intentions and desires. In any case, I question whether it is promising to give much weight to an argument that emphasizes the “artifact character” of linguistic reconstruction that a rationalization from the perspective of an interpreter inevitably adopts, because it is not clear how strong a potential difference there is between the form of external rationalizations, on the one hand, and the internal conditions of action, on the other. Within the framework of this strategy, it is still a problem, for example, to explain how the actor, from his or her internal perspective, is supposed to be in a position to individuate different intentions and beliefs without being able to refer back to a medium for individuating the constituents of action, thus intentional states.

Within the framework of a second strategy, I would thus like to attempt to show that there are actions that are not connected with the existence of the actors’ propositional attitudes, but that can be understood with the help of the assumption that there are other media besides language that allow the ascription of intentions. To begin with, the view that the existence of intentions is fundamental for action is not controversial. However, it is characteristic for a position that strictly adheres to the linguistic turn that the intention that the actors link to their action is dependent on an ability of the actor, namely to articulate it (at least potentially) in a sentence like: “I desire p,” or “I believe p,” and so on. So this second strategy attempts to loosen up the tight connection between intentionality and language. In a first step (1) I will initially attempt to screen the arguments, above all developed by Davidson, that intentionality is dependent on language. In a second step (2) I will contrast these arguments with Searle’s reconstruction of the connection between intentionality and language; because this reconstruction depicts intentionality as a more basic phenomenon than language, it may be able to contribute to solving the two mentioned problems. On the basis of the various difficulties in Davidson’s and Searle’s positions, I will attempt in two further steps (3) to refine the view of intentionality and (4) to specify the idea of an expanded form of interpretationism.

1. LANGUAGE AS A CONDITION FOR INTENTIONALITY

In the arguments that are meant to support the thesis that intentionality is dependent on language, language plays the role of an instrument for individuating intentional states, which are only able to acquire identity within a network of other intentional states, a network whose inferential relations can only be determined by language. In an interpretationist perspective such as Davidson’s, in ascribing intentional states, language functions initially as nothing more than a behavioral pattern that is sufficiently complex to allow the correct inferences to the propositional attitudes of the speaker as long as there is sufficient information about the behavior and which actions are possible.¹⁴ Language thus presents itself as a behavioral pattern, the interpretation of which enables us to make inferences regarding how propositional attitudes can be individuated in a (necessarily) largely coherent network of logical relations. We can thus ascribe the propositional attitudes to the being whose behavior we want to interpret. Certainly, for Davidson this analysis does not have the character of a conclusive argument, with the help of which it can be shown that intentional attitudes are dependent on the possession of language. In order to provide such an argument, it would have to be shown that language is the only possible medium for a sufficiently complex behavioral pattern. Because Davidson is in fact convinced that this interrelation is correct,¹⁵ but he cannot prove it to be necessary, the thesis that there is no alternative to language being the medium for individuating intentional attitudes must continue to be viewed as an assumption.

The interpretationist perspective must now, however, face the fact that we, as interpreters, also manage to successfully explain and predict the behavior of nonspeaking animals by, in some manner, ascribing beliefs, desires, and intentions to them. Because we make ascriptions of this sort not out of mere fashion but for lack of really good alternatives, the thesis of language dependence suffers a further loss of plausibility for the time being. However, Davidson’s cited article contains a further and weightier argument for the dependence of intentionality on speech.

On the basis of the holistic assumption that all propositional attitudes (and thus intentions) are dependent on a network of beliefs, Davidson claims that only beings with a concept of belief can have beliefs and that only those beings with a language can have a concept of a belief.¹⁶ If we accept that beliefs are a fundamental condition for the possibility of intentions, then, following Davidson, we must also accept that only a being that can refer to the fact that it believes p (that is, a being that can form second-order beliefs) can develop beliefs about anything at all. For only if a being can draw a distinction between (subjective) beliefs and objective truth with the aid of second-order beliefs does it make any sense to ascribe an ability to have beliefs to this being. From the interpretationist perspective, however, we can only ascribe this ability to a being if we can interpret its behavior in the context of linguistic communication such that it masters the difference between subjective belief and objective truth. In short, only those beings that are able to contrast their subjective beliefs with others’ beliefs in an intersubjective speech practice, and in doing so refer to something like an intersubjective truth, have the ability to refer to their own beliefs.¹⁷

This argument seems much stronger than the earlier one if we can preclude that, on the basis of the assumption that all propositional attitudes are dependent on a context of believing, it begs the question by implying from the outset that the scope of possibilities only includes linguistic entities. Yet, it seems to me to be a stronger claim that all mental states that are relevant for acting depend on states whose propositional equivalents are capable of being true. In other words, it is plausible to maintain that the ability to have beliefs is necessarily dependent on the concept of belief, a concept that can only be had by one who possesses a language. But, in order to be able to have a thought, is it also necessary to have a concept of a thought? Indeed, according to Davidson, all propositional attitudes are thoughts, but are all thoughts also propositional attitudes?

So, even Davidson’s second, stronger argument is only a conclusive argument for the dependence of intentionality on language if it can be shown that a concept of intersubjective truth is the only possible way to establish the difference between subjective believing and objective validity, and that it thus presents the only possibility for second-order thoughts. However, a vigorous reconstruction of the argument provides important indications of how the thesis of language dependence can be contested. For as far as I can tell, it leaves only two options: either it can be shown that intentionality always has to be presupposed in order to explain how something like a language can develop (section 2), or it can be successfully shown that the concept of second-order thoughts is not tied to language (section 3). Within the framework of the second strategy it is necessary to show that the concept of second-order thoughts can also be introduced with a view to non-truth-conditional intentional attitudes; consequently, beliefs cannot be necessary conditions for actions.

2. INTENTIONALITY AS THE BASIS OF LANGUAGE

A prominent position that denies a necessary connection between the existence of propositional attitudes and actions is the naturalistic conception that John Searle presents in his work Intentionality.¹⁸ According to Searle, actions are not connected to existing propositional attitudes but to “intentional states,” which, for their part, are prelinguistic. Searle thus reverses the relationship between intentionality and language proposed by the propositionalists: “Language is derived from Intentionality and not conversely.”¹⁹ Searle shows that we must presuppose intentionality as a basic phenomenon in order to explain that mind is able to “impose intentionality on entities that are not intrinsically intentional”²⁰ by ensuring that these entities deal with something or are related to something. In other words: if mind can relate entities that are not in themselves intentional to something—if, for example, it can use them for purposes of representation—then intentionality cannot be explained with recourse to relations between these entities but is the basis of the possibility of such relations.

In the evolutionary-theoretic framework in which Searle develops his theory, it is clear that language must be reconstructed as a late product of evolution on the basis of species competencies. But independently of the evolutionary-theoretic perspective, which, on the basis of prelinguistic forms of intentionality, explains what would be necessary for the development of language,²¹ Searle explicitly maintains the logical primacy of intentionality before language because “certain fundamental semantic notions such as meaning are analyzable in terms of even more fundamental psychological notions such as belief, desire, and intention.”²² With recourse to these “more primitive” forms of intentionality, which possess an intrinsic relationship to conditions of satisfaction, it is possible to explain how entities that do not have their own intrinsic intentionality are taken up in the service of mind. The intrinsic relationship of primitive intentional states to conditions of satisfaction is, in a manner of speaking, the model that is broadened in the development of language, imposing the same conditions of satisfaction on a speech act as those that the mental state has that is to be expressed with the utterance in the speech act.²³

It is clear that this concept entails a series of problems, problems that are primarily a result of the inverted relationship between language and intentionality. For it is of course questionable whether a being can draw on one of its intentional states without identifying this state with the means by which it draws on the state. However, it remains an open question whether language is the only means that can assume this function.

Searle appears to think the following idea can serve as a solution to the problem of individuation: while a verificationist theory of meaning only has truth-functional propositions at its disposal with which to individuate beliefs as elements of a network that facilitates the individuation of intentions, Searle attempts to individuate the intentional states directly with reference to conditions of satisfaction. Accordingly, an intentional state is a state that is characterized by states of affairs in the world that correspond to its satisfaction. That implies that for “any intentional state with a direction of fit, a being that has that state must be able to distinguish the satisfaction from the frustration of that state,”²⁴ and we must assume that this differentiation is possible without language if we want to maintain the primacy of intentionality over language. Otherwise Searle’s meaning-theoretic program, which he clearly sketches out in the following lines, would be pointless.

The fact that the conditions of satisfaction of the expressed intentional state and the conditions of satisfaction of the speech act are identical suggests that the key to the problem of meaning is to see that in the performance of the speech act the mind intentionally imposes the same conditions of satisfaction on the physical expression of the expressed mental state, as the mental state has itself.²⁵

The attempt to contrast the propositional reconstruction of intentionality with a naturalistic perspective in the interpretation suggested here leaves us initially with a tattered view: from a propositionalist perspective it cannot be clearly shown how we end up with a situation in which there are beings that make intentional use of non-intrinsic-intentional entities if intentions can only be identified by means of such entities; from the naturalistic-intentional perspective, it remains questionable how mind can draw on its intentional states if it lacks the means that can only be generated if individuated intentional states are already presumed.

3. TWO FORMS OF INTENTIONALITY

However, a productive way out of this dilemma does seem to me to be possible if we differentiate at least two forms of intentionality, one that I initially only characterize negatively and describe as a form dependent on language and one linguistically independent form.

Setting out from the broad interpretation that Searle gives to his view of intentionality—namely, to characterize those mental states that are about something or that are directed toward something as intentional states—for the time being, it appears to be neither counterintuitive nor problematic, for example, to ascribe intentional states to small children, who do not yet speak a (propositionally refined) language. Nor does there appear to be a problem with sufficiently individuating these states: in a situation in which a small child stretches its hand out toward an object x, which is in close spatial proximity to a second object y, we are hardly surprised if handing the child x is noted with satisfaction, while handing the child y is protested loudly. In such cases we do not ask how it was possible for the child to find itself in a certain intentional state, but we assume that the child has access to criteria for the conditions of satisfaction of the intentional state that it has, and is thus able to differentiate this intentional state from others without being able to speak a language. So I do not see a problem in initially following Searle and presuming an elementary form of intentionality (which I would like to call A-intentionality), with states that are sufficiently individuated without language.²⁶

However, it is questionable whether the child can also behave with reference to the intentional state in which it, according to this analysis, finds itself. In other words, it is questionable whether the child chose the state or could choose it, or whether the A-intentional state befalls the child. It is reasonable to assume with Davidson that the possibility of choosing intentional states or of intentionally individuating such states is connected to the ability to draw on these states with the help of a medium like language. A theory of developed intentionality, understood in this way, would now have to show which conditions have to be fulfilled in order, on the basis of A-intentionality, to allow the determination of the competencies and instruments that are needed to develop higher-level intentionality, which includes the possibility of being able to (arbitrarily) produce intentional states.²⁷

In view of these thoughts, a preliminary criterion for A-intentional states could, for example, be as follows:

(A1) A mental state I_a of a being B is an A-intentional state if and only if

a. through I_a, B is disposed to differentially respond to its environment; and

b. I_a is individuated in a way (e.g., causally) that precludes it from being steered by B as long as B only has A-intentional states; and

c. the identity of I_a, i.e., the content of I_a, is determined as long as the conditions of satisfaction of I_a are sensuously present to B.

Ascribing A-intentional states is legitimized from the viewpoint of an interpreter if the interpreter’s explanation of the behavior must assume that the being that is to be interpreted (B) possesses nonlinguistic representations of existing and not-existing states of affairs, and these representations steer the being’s activity. Thus, among the linguistically independent states, these states are the ones that we ascribe to beings because we can only plausibly explain their behavior by maintaining that they are able to differentiate states of affairs in reference to whether they are conditions of satisfaction for their intentional states or not. Here, however, A-intentional states are intrinsically intentional. They are not intentional by virtue of being self-interpretations. They have content that does not require an interpretation in order to be individuated. Instead, in the context of environmental conditions, it requires a functional role.²⁸ A-intentional states are thus states that befall the being that has them. We must assume that beings with A-intentional states possess a representational machinery that is indeed able to produce representations from states of affairs, but that does not produce representations of such representations. A being whose most developed mental states are A-intentional states thus does not possess the ability to refer to these states; for the content of A-intentional states is never another intentional state. Along with Dretske and Millikan, one can suppose that A-intentional states function to show something, be it inner states or states of affairs, but the fact that a being, because of its history of interaction with its social environment, makes use of suitable means does not result from knowledge that these means are suitable, a knowledge that would be accessible to the being as explicit knowledge. Rather, it is warranted by functional mechanisms. If a being finds itself in an A-intentional state, I_a₁, there are no reasons for the change to a state I_a₂, but only causes. Here it is not precluded that these causes play a weak normative role that can be completely explained with functionalist concepts, and that have something of the status of needs. In that, A-intentional states occupy a position between those mental states that we call perceptions and those mental states that we call thoughts. Like perceptions, they have an intrinsic relationship to content that we can imagine being conveyed via functional mechanisms; like thoughts, they are individuated in normative relations, however, in normative relations that can be completely naturalized.

But even if one accepts this characterization of basic intentional states and in doing so agrees with Searle insofar as one is ready to accept that there is a foundational level of intentionality that is independent of language, it is clear that, on the basis of intrinsic intentionality, complex phenomena like artistic action cannot be reconstructed. For artists work precisely against the background of alternatives that are alternatives for them; in a way that is to be more precisely explained, they have reasons for developing a work of art in one way and not another.²⁹

If one wants to accommodate this fact in a theoretically suitable manner—that means in a way that makes it possible to maintain intuitions (I₁) and (I₂)—then it is necessary to provide a reconstruction of higher-level intentionality that does not imply that this higher-level intentionality can only be achieved if beings acquire linguistic competencies. The higher-level form of intentionality must rather be construed such that the ability of a being to relate to its own mental states becomes plausible; here one can assume that Davidson’s demand for a second-order concept indicates the specific form by which the higher-level intentional states are reached by linguistic means. Intentional states that are independent of specific linguistic means must thus be characterized as second-order intentional states; because of this attribute, they can be actively adopted. In order to be higher-level intentional states, mental states that can fulfill this demand thus must certainly fulfill the following criteria:

(HI) A mental state I_h of a being B is a higher-level intentional state if and only if

1. I_h is individuated in such a way that it can be steered by B; and

2. the identity of I_h, i.e., the content of I_h, is determined as long as

a. B draws on an A-intentional state with the help of I_h; or

b. B assigns I_h a position in a network of higher-level intentional states, some of which refer to A-intentional states.

Schema (HI) expresses in a very general form what we think about thoughts: thoughts are products of thinking. Thinking is (in any case, largely) a conscious activity; its products have an identity because they are related to inner representations or stand in certain relations to other thoughts.

If we provisionally accept that Searle’s suspicion is correct, i.e., that the concept of meaning can be analyzed in more basic psychological concepts like the concept of desire, and we adopt this analysis for the construction of the level of A-intentionality, then we are faced with a problem, namely, of how to make the transition from the level of a causal reconstruction on the level of A-intentionality to the level of reasons that integrate the sphere of higher-level intentionality. If it is appropriate to speak of two levels of intentionality, and we assume that only the basic one is a biological phenomenon, then how, precisely, do the two levels connect, and how can we explain the transition between them that linguistic beings have obviously achieved? In principle, the way that I initially imagine this is as follows.³⁰

In the above-described situation, the child finds itself in an A-intentional state (I_a[have x]), which is related to the possession of an object x; here the state is indicated by the outstretched arm and the noises the child produces. The desire that we, from the interpreter’s perspective, ascribe to the child is fulfilled if the child, for example, can put x in its mouth. If we assume that the child in the situation interacts with people who possess a form of language, and who, in situations like those above, accompany the passing of x or y with gestures and noises that the child itself can produce, then the child can correlate its A-intentional state with these activities and adapt these as instruments to articulate desires. The A-intentional state is hereby broadened to include a linguistic behavior that, in a benevolent social environment, serves as a relatively successful instrument for satisfying desires. For the construction of the higher-level intentionality, it is important that the linguistic activity (“x”) does not belong to the conditions of satisfaction of the A-intentional desire. Because the expression “x” has an instrumental character, and does not intrinsically satisfy the desire, it can be used for a correlation that can exist alongside the world-to-world direction of fit of the articulated desire: an inner-world-to-world direction of fit. If, in conformity with Searle’s theory of meaning, we at least assume that “x” can be provided with the same conditions of satisfaction as the A-intentional state, then “x” can be placed in a double correlation, namely, one to empirically having-x, and one to the desire to have x (I_a[have x]).

Among the obvious preconditions for expanding A-intentional states via intrinsically non-desire-fulfilling occurrences are the interpreters of the behavior of a child, who continually form hypotheses about which intentional state the child is now in. Another precondition is the sufficient constancy of those linguistic occurrences that the child is supposed to link to its respective states; those occurrences must be sufficient for the child and allow frequently correct ascriptions of its intentional states. If these preconditions are fulfilled, we can assume that the child will stabilize a number of relations between A-intentional states, acts of articulation, and desire fulfillment (I_a[have x] ↔ “x” ↔ having-x, I_a[have y] ↔ “y” ↔ having-y, etc.).

In order to plausibly develop higher-level intentionality, it is now imperative that the child not only brings the acts of articulation into an (instrumental) relation to its A-intentions, but also—as noted—that the interpreters react with sufficient constancy to expressions of “x,” as if the intention I_a[have x] existed. By virtue of the fact that the interpreters hold constant the relation between relata 2 and 3, they offer the child the possibility to observe the relations of the first two relata based on the expression. What exactly does that mean?

By expanding the relation between A-intentional states and their conditions of satisfaction to include nonintrinsic desire-fulfilling activities, an element is introduced to this relation that is sufficient for A-intentionality, which, since this is aimed at satisfying desire, initially has a primarily instrumental character. From the perspective of the interpreters, however, it takes on the character of a (quite reliable) indication of the existence of certain intentional states. If we assume that children expand their articulations spontaneously by varying or combining modifications, and we further assume that in the social environment of children, certain of these expressions are taken as an occasion to treat children as if they had intentions that correspond to the interpretations of expressions by adults, then the relations are stabilized between varied expressions and the states of affairs that the social environment brings about in reaction to the expressions. Because children behave in a manner that indicates they evaluate these states of affairs, as described above, one can assume that they balance these with their A-intentional intentions.

If we now expand the possibilities for expression so that the children’s practices of varying articulations are subject to limitations by virtue of the fact that the adults only accept a subset of the variants, and in this way something like rules of composition are stabilized, then a plausible case can be made that, with dependency on the differentiation of the levels of articulation, the ascription of refined intentional states becomes possible. Now, if expressions, by virtue of their interpretation in a social environment, lead to children being treated as if they had the intention that the interpretation is based on, then any expression of a person that fulfills criteria that are to be more precisely explained can also come to indicate the corresponding intentional state of that person. However, the person does not have this intentional state in the same way as she has an A-intentional state. For people who can achieve that scope of the medium of articulation which is able to be linked to an established practice of interpretation can individuate intentional states that they do not simply adapt, but that they in a certain sense create. But how?

I initially assume that at the level of articulation another form of contingency is possible than at the level of A-intentional states, one that arises from the compositional structure of the articulation. Here nothing more is meant than that the tokens of the established practice of articulation can be described as composites of a finite number of articulation types. Further, I assume that the interpretation practice achieves this contingency for the level of ascribed intentional states as the interpreters ascribe intentional states to the articulating individuals; here the degree of differentiation is correlated with that of the expression, and the person is treated in accordance with this interpretation. With a view to the person being interpreted, I assume, third, that the person learns to understand her own (spontaneous) articulations by the interpretation practices of others; that means she learns to understand them as a symptom of an intentional state that she herself has (in agreement with the interpretation practices of others). To the degree to which the articulation practice is oriented on experiences of ascribing intentions, the medium of articulation becomes a medium for individuating intentional states, which the articulating person learns to ascribe to herself. Here, A-intentional states take on the function of a screen, against the background of which the higher-level intentional states gain relevance as compatible or incompatible with the A-intentional states.

Higher-level intentional states are then, however, only able to be individuated through the differentiations that are possible in the medium of the expression. The medium in which this differentiation is made becomes an apriority of intentional states, which can only be individuated with its help. It thus holds for higher-level intentions that they are dependent on social media, because they are dependent on the possibilities that such media offer for ascribing intentions. If one accepts this analysis, then the development of a world of higher-level intentions fully accords with Davidson’s assumption that having or ascribing (certain) intentional states is dependent on the potential for differentiation of the medium on which the complex behavior that the interpreter interprets is based.

The solution that is proposed here, which sets out from Searlean starting points, finds its way to Davidsonian consequences, and needs to be further worked out, shifts the problem regarding the priority of language or intentionality to the problem of showing the plausibility of the transition between these two forms of intentionality. However, this problem might be solved if we introduce a social practice of interpretation, which I, for the time being, assume here.³¹ It remains open—and that is a desired consequence of my deliberations—whether a higher-level intentional state can only be identified by linguistic means, as the example of language acquisition seems to suggest, or whether other, nonlinguistic means could also assume this function. Before exploring this, however, it should be determined whether Searle’s naturalist theory of intentionality—which I have thus far drawn on only as a theoretical background in the reconstruction of the fundamental level of intentionality—might not also be suited to allow a reconstruction of higher-level intentionality of which we understand art to be an expression.

4. DOES SEARLE’S THEORY ALLOW US TO UNDERSTAND ART?

In the previous reflections I have attempted to show that it is plausible to use Searle’s naturalist interpretation of intentionality as a starting point for the development of a form of intentionality, which, in accord with Davidson’s postulates, is dependent on the fact that beings that develop this form of intentionality appropriate a repertoire of refined articulation possibilities in a social context. Here, however, it remains an open question whether Searle’s theory might not even provide the means with the help of which we could come to an understanding of art that harmonizes with our intuitions. After all, Searle’s theory promises to decouple language and intentionality, since language is not a condition or a prerequisite for intentionality. Nonlinguistic forms of expression like art, one might surmise, must then exist in a relationship to the underlying intentionality, which is analogous to linguistic utterances, and it must be possible to reconstruct them independently of language, i.e., as genuine intentional phenomena at the level of conditions of satisfaction.

In order to check this, I would like to turn back to the problem described above, namely, the problem of understanding works of art as products of ordinary action. If “understanding an activity as an action” simply means “describing an activity as intentional,” then, if one wants to retain this schema for artistic action, it is necessary to identify intentional states that are supposed to be the reasons for the action that is to be explained. In applying a rationalization of action to artistic actions, it would then be necessary to assume something like the following form, as a modification of (H1):

(H1.1)

a. P intends to express Y.

b. P believes that K is a means to express Y.

c. Therefore, P produces K.³²

If we attempt to link (H1.1) to the above reconstruction of the Searlean analysis of meaning intentions,³³ (H1.1) has to be annotated as follows. First, Y must be characterized by certain conditions of satisfaction; second, P must believe that K has the same conditions of satisfaction as Y. These conditions must be fulfilled if we are to be able to maintain the core idea of the Searlean theory of meaning—namely, that mind, in completing an act of expression, “intentionally imposes the same conditions of satisfaction on the physical expression [K] of the expressed mental state, as the mental state [Y] has itself.”³⁴ However, as clear as this analysis appears to be at the outset, on closer examination it is quite confusing. It is clear that artistic actions seldom have the character of assertive, directive, commissive, or declarative acts. In any case, such acts are not typical of artistic action. (H1.1) must thus be interpreted in terms of expressive acts. Here it is initially questionable how P can come to believe that K is a means to express Y. In contrast to the four mentioned types of speech acts, this belief cannot take recourse in intentionality having a word-to-world or a world-to-word direction of fit, and thus in conditions of satisfaction that are intersubjectively observable. An expression differs from an utterance about an observable state of affairs, whose meaning intention can be related to observable conditions of satisfaction, that is, to the claim that it is true, which is satisfied if the belief that is expressed is fulfilled. For an expression, however, this possibility does not exist. In expressive acts, the belief that K is a means to express Y refers solely to the intention that K ought to be a means to express Y. Only the intention to express Y is necessary to determine what is considered an expression of Y. It follows that in the expressive speech act “to believe that K is a means to express Y” is the same as “to intend that K is a means to express Y.”³⁵ However, if intending that K is an expression of the Y-state is sufficient for K to be an expression of the Y-state, then the conditions of satisfaction of the expressive intention are self-fulfilling. As a consequence, such self-fulfilling intentions—which, in interaction with desires, are supposed to ensure that links can be made between the rationalization of action and the standards of rationality of the interpreters—can no longer play an informative or explanative role. For a reconstruction, this structure would revert to an intentional analysis in the following form:

(H1.2)

a. P desires to express Y.

b. P desires that K expresses Y.

c. Therefore, P produces K.

Schema (H1.2), however, now states nothing but that P produced K because P has desires that lead P to produce K. “Explanations” of action of this sort are so free from restrictions that, with their help, any behavior can be interpreted as being steered by desires: the cat is on the mat because it wants to sit on the mat or because it wants its sitting on the mat to express a protest about the absence of the person with whom it shares an apartment. In (H1.2) the constitutive function of the conditions of satisfaction can no longer be regarded as a screen against which the individuating of expressive intentional states occurs because, from the perspective of the interpreter, these conditions can be arbitrarily fulfilled. Because there are no restrictions on what can be inserted into (b), (H1.2) can provide arbitrary explanations, which above all share the following characteristics: they are neither informative nor do they clarify an action with a view to intersubjectively comprehensible standards of rationality. Even if Searle’s analysis of expressive action may be correct, which, in view of the above-noted consequences, is not easy to believe, his analysis does not provide the theoretical means needed in order to understand artistic action. If artistic activities beyond the parameters of (H1.2) are to be explained as actions, it can be assumed either that, with a view to the reception by third parties, beyond the bare meaning intention, the envisioned success of an expression can provide reasons for the choice of a certain means of articulation, or other relations, which limit choices between the content of the meaning intention and the means of the expression, can be found.

In the face of this shattering result, let us initially once again call to mind the problem: (H1.2) places an artistic activity in the context of a meaning intention (Y), which P connects with K, a work of art. Premise (a) indeed fulfills a basic prerequisite for securing artistic action the status of action—the activity is described as intentional. However, it remains questionable how premise (b) is more precisely to be analyzed; this premise is to ensure that an action is able to be related to reasons; further, the explicative character of the rationalization of the action is dependent on it. (H1.1)(b) confronts us with the difficulty of developing a precise view of the belief that an artwork is a means for expressing content; for if (H1.1) is supposed to take on the function of rationalizing action, it must be assumed that a robust connection can be made between the products (K) and the assumed goals of the articulation (Y). But what type of relation could that be?

If we assume a strong relation, we must demonstrate the validity of relations between the goals of an expression and the means of the expression, such as the paraphrase or the translation. First of all, however, such strong relations are problematic because it is maintained, for good reasons, that music—as a product of the action of composition—is “essentially untranslatable,”³⁶ and thus it is impossible that there could be a propositional equivalent of what the music intended.³⁷ If the thesis of the untranslatability of music is to be correct, then, on the basis of the symmetry of the translation relation, there cannot be music-extrinsic intentions that might be able to provide details sufficient to explain the structure of musical works. However, if the assumption that music is essentially untranslatable should turn out to be false, then there is a second question about the rationality of the action of the person who is expressing herself, namely, why P does not simply articulate Y (linguistically). The fact that P articulates K in the face of the possibility of linguistic equivalents could only be explained—absent motives such as that one simply has fun articulating K—if Y cannot be articulated by P except with the help of K. However, if that were the case, then in the rationalization of the action, Y could not be individuated independently of K and would have to be replaced by K. Doing this, however, the explanation of the action would become tautological.

If, in the face of the problem with strong relations, one suggests instead that the relationship between the goal of the expression and the means of the expression be determined with the help of soft relations (such as similarity or “kinship”), then, as a result of the ambiguity about what we might put in the place of Y, the explanatory character of such a rationalization of action disappears; for as a consequence of the ascription of soft relations, the definition of the possible goals of the expression would be subject to so few restrictions that a large number of possible applications of (H1.1) would be placed next to one another, lacking any criteria. In contrast to this, the plausibility of the rationalizations of action according to (H1) is not due to soft relations, but to the fact that for X, in (H1), only truth-conditional sentences, i.e., sentences that are open to intersubjective evaluation, can be employed; consequently, the link to the interpreter’s standards of rationality can be secured. However, if we now neither possess strong relations for the connection between K and Y nor are able to secure the link to the interpreter’s rationality standards with the aid of soft relations, then we must no longer rely on (H1.1) as the schema for understanding artistic actions or—and this appears to me to be a much higher price—we must no longer ascribe to artistic activity the status of action. For if characterizing an activity as an action requires that the beliefs that we ascribe to actors in potential rationalizations of action are truth conditional, or at least that they can be tested for appropriateness, then it is not clear how a modification of (H1) can lead to a rationalization of artistic action that can fulfill this condition.

In my view these deliberations bring us to the point that it is necessary to replace the action schema (H1) with a schema that allows artistic actions to be explained in a manner that need not assume (truth-conditional) propositional attitudes. Such a schema thus must systematically anticipate that an artistic intention can only be individuated in the medium in which it is articulated.

In principle, once again it is possible to conceive of two strategies that allow an artistic activity to be described intentionally. While Habermas, with all the well-known problems,³⁸ has gone the route of introducing a truth-analogous validity claim and characterized sincerity as a validity claim that is characteristic for artistic actions, I would like to try to show that artistic action—more fundamentally than in the perspective of a truth derivative—can be understood as a form of action that, by means of a medium, individuates constellations that can be understood as offering a possibility for individuating higher-level intentional states of the recipients. Thus, the view that I would like to bring into play is that by specifying the idea of an intrinsic connection between media and the goals of action that are achieved with their help—which Dewey had quite vaguely formulated³⁹—media can be interpreted as “instruments” for individuating higher-level intentional states; with the help of such instruments, these intentional states can also be articulated. However, that is initially a very abstract and provisional formulation, and it must be further developed. Here it is helpful, first, once again to affirm the presuppositions that are involved in rationalizing action with the help of propositional attitudes. Among these presuppositions are the interpreter’s knowledge of a language and the assumption that those being interpreted possess a language, possibly a different one, but one that can be translated into the language of the interpreter; with the help of this language, those being interpreted might individuate intentional states and order them into a network of beliefs and desires. In connection with Davidson, it must be assumed that principles of basal rationality span and organize this network.

As a consequence of the assumption that the identity of intentional states arises in a network that is fixed with the help of principles of basal rationality and that an elementary logic is a part of these principles, the entities that are organized in this network must be truth-apt; it is precisely this characteristic that cannot be transferred to artistic intention. It is also the lack of this characteristic that hinders the transfer of a theory of meaning based on conditions of satisfaction; for there is no analogue in artistic acts of articulation to the truth aptness of propositional attitudes, along with the individuation of the propositional attitudes, that secures the connection to a logic of duties which those that are being interpreted have to follow, at least in the perspective of the interpreter. Composing a bar of music—even in highly regimented music—does not place on the composer a responsibility similar to that placed on the speaker who makes an assertion.

In short, the theoretical situation in which we find ourselves after this look at the conceptual tools that the established theory of mind and theory of meaning provide us for reconstructing the understanding of nonlinguistic intentional expressions is as follows:

1. Within the framework of the standard analysis of action, the interpretation of nonlinguistic works of art as products of common action faces the following difficulty: in accord with this interpretation, one has to understand works of art and the intentional states that they express and that are the cause for their production to be linguistically individuated states. Doing this, however, makes the analysis irreconcilable with intuitions that we rightfully have about the understanding of works of art, namely:

a. Works of art are intentional products of their producers.

b. Works of art are not completely understood merely by identifying the intentions that an artist sought to realize in producing them.

c. Works of art articulate thoughts that cannot be grasped by a different means than the one in which the work of art articulates the thoughts—works of art are not translatable.

There is thus a need for a theory of intentional states that are not individuated (in a [self-]interpretation) by linguistic means.

2. In connection with Searle (and in connection with the functionalist conceptions of intentionality), a concept of nonlinguistic intentionality can be developed using concepts of a basal intrinsic intentionality. With the help of this concept it is indeed possible to overcome deficits in the developmental history of interpretationist theories of intentionality, but the concept of intrinsic intentionality is not suited to allow an intentional description of artistic actions and to open dimensions needed to understand it.

3. Thus: the type of intentionality that it is necessary to apprehend theoretically is a higher-level form of intentionality, which is at the same time not a linguistically individuated form of intentionality. Thus, in conformity with the intuition that artistic media are means of artistic thought, one should, in accord with those interpretationistic assumptions that remain fundamental for the reconstruction of higher-level forms of intentionality, attempt to develop a theory of nonlinguistic thoughts.

The following section is devoted to the attempt to meet these theoretical demands. In the first step, which is here to be made with a view to a theory of nonlinguistic thoughts, I start from a familiar scenario: two persons with common linguistic competencies discuss a nonlinguistic work of art.

3.2.1.3. After the Concert—An Alternative Analysis

Let us assume the following: after we have spent an evening at a concert with someone we hardly know, and after a long period at the bar in which the topic strangely did not come up, the conversation turns to the music that we heard just under two hours previously. My question is now the following: Under what conditions would we be ready to say that our companion (P₂) has understood the music that we heard together?

Now among the minimal conditions for confidence that P₂ has an understanding of this sort is surely that, during our conversation, P₂ can refer to the various works that we have heard. For example, if we are speaking about the second piece of the evening, then we must expect that P₂ can refer to the characteristics that make the second piece the specific work of art that it is. In short: the person must be able to refer to the work of art as a specific unity or, more precisely, she must be able to refer to the characteristics that are constitutive for the identity of the work of art, because no one can understand something that they cannot identify. This prerequisite need by no means be fulfilled using the language of music studies; perhaps it need not even rely on a language at all. If, for example, with the use of singing, gestures, or sketches, P₂ can refer to the characteristics of a piece of music that make that piece of music a piece of music (in the context of the history of music), then the person has fulfilled a central prerequisite for us to be confident that she has understood the piece of music.⁴⁰ In any case, we will have considerable doubt that P₂ understands a work of art if P₂’s identifying references to the work of art are so unspecific that P₂’s “descriptions” of the characteristics do not identify an individual work of art but a set of such works, and in our case, for example, she interprets the last movement of the first piece and the first one of the second as a two-movement work.

To guard against a misunderstanding: what we expect from a person who fulfills the minimal conditions for understanding a work of art is not only the ability to refer to a work of art as an object, as is possible with the help of proper names or identifying descriptions (“Intégrales” or “that music piece for small orchestra and drums that Edgard Varèse completed in 1924”). Rather, it is a matter of the ability to identify the work of art in relation to its qualities that can be experienced. But even the reference to the characteristics that can be experienced is not sufficient. For it is not about some identification method with the help of which the work of art can, without a doubt, be filtered from the mass of other works of art on the basis of its empirical characteristics—the way the criminal suspect, with the help of a fingerprint, a DNA sample, or a graphological analysis, can be separated from a set of all potential perpetrators. It is rather about identifying the work of art with reference to those characteristics that are relevant for it as this work of art. But which characteristics are those? They are the characteristics that we pay attention to if we, for example, compare the following identifying references to acoustic events:

a. “The piece that I mean is the only one that brings about a feeling in me of cold abandonment.”

b. “The piece that I mean consists of two-dimensional crescendo background noises, which sound like a distant, threatening razor, and of light recurring beats as if one were pounding on tin.”

c. “The piece that I mean begins ‘Da-da-daa-di da du di da.’”

d. “The piece that I mean begins [notated incipit not reproduced].”

While in (a) reference is made to a subjective association that is not accessible to intersubjective verification and thus hardly offers a suitable clue for the search for the piece, (b) makes reference to the sound structure of the piece, which is described with the aid of comprehensible music-extrinsic sound experiences. Reference (c) presents structural characteristics of the piece that place the listener of the sample in a position to identify the piece whose characteristics are exemplified, as long as the sample does not apply to numerous pieces, and (d) identifies Beethoven’s Piano Sonata, Op. 2, No. 1 in the form of the index of a volume of sheet music.

The thought that I would like to bring into play in order to more precisely describe the abilities that a person must have before we have confidence that she understands a work of art consists, in short, in connecting this ability to the fact that, in the act of identification, a set of characteristics must be referred to that were relevant for the production of the work of art and are relevant for most of its characteristics that can be experienced. The work of art should not be identified on the basis of contingent characteristics that are external to it, but on the basis of characteristics that it displays because it was made in a particular manner. If the identification is successful in this way, then a recipient refers to the identity of the work of art, which it has by virtue of the fact that it is a composed unity.⁴¹ For its part, this identity, which I would like to call the compositional identity, has its own presuppositions, which I would like to more precisely explain by once more turning to the continued discussion after the concert.

After a heated debate about the interpretation of one of the pieces performed, our concert companion notices with some indignation that we have spent the past few minutes discussing two different pieces of the evening: she in any case meant the piece “that confronted two themes with each other, where the first one at its core consisted of three motifs and began with a terse prelude. The motifs of the first theme in the course of the piece hardly varied, quite in contrast to the one motif from which almost all the material of the second theme was developed.” If, on the basis of this description, the misunderstanding can be overcome, then it is because our discussion partner can identify the piece of music that we are talking about with the help of attributes that it has against the background of possible musical attributes; further, she supposes that we are all familiar with these.

Implicitly, according to the following pattern, our discussion partner presents us with the characteristics of the piece of music against this shared background: the motif could consist of 32nds, 16ths, 8ths, etc.; it could be intended that these are played legato, nonlegato, staccato, etc.; the notes could progress diatonically or chromatically, etc.; and the motif that I am speaking about consists of diatonically progressing eighths that are intended to be played staccato, etc. The compositional identity of a work of art thus consists in the sum of the attributes that a work of art has against the background of an ordered set of attributes that it also could have had. Those characteristics that are relevant for the compositional identity of the piece, however, are not arbitrary characteristics but musical ones. Initially that sounds trivial. However, on closer examination, it is clear that recipients that are able to refer to such attributes cannot be relating to intrinsic attributes of objects or events that physically actualize a work of art (for example, sound experiences); rather, they refer to attributes that such events have when “described” in some way. As a listener that has been socialized in a musical practice of listening and/or of making music, a recipient identifies a piece on the basis of a presupposition that is so obvious, and for which there is subjectively so little alternative, that it easily escapes theoretical attention. She works with the presupposition that the acoustic experiences that were the cause of her perception during the concert were musical events. If such a listener identifies a piece of music, she does not speak of acoustic events (as acoustic events), but of (acoustic events as) musical events. In doing so, among other things, a listener like this makes use of the vocabulary on the right side of table 3.1.
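As a rough illustration only, the pattern just described (an attribute that a motif has, read against the alternatives it could have had) can be sketched in Python. The names BACKGROUND and compositional_identity, and the three dimensions with their values, are hypothetical and are taken solely from the example of the staccato eighth-note motif; the sketch makes no claim about how musical attributes are in fact ordered.

```python
# Hypothetical model: a shared background of possible musical attributes,
# given as dimensions, each listing the values a motif could have had.
BACKGROUND = {
    "note_value": ("32nd", "16th", "8th"),
    "articulation": ("legato", "nonlegato", "staccato"),
    "progression": ("diatonic", "chromatic"),
}


def compositional_identity(choices, background=BACKGROUND):
    """Pair each chosen attribute with the alternatives that were not chosen."""
    identity = {}
    for dimension, options in background.items():
        chosen = choices[dimension]
        if chosen not in options:
            # An attribute outside the background cannot be identified in it.
            raise ValueError(f"{chosen!r} is not a possibility in {dimension!r}")
        unchosen = tuple(o for o in options if o != chosen)
        identity[dimension] = (chosen, unchosen)
    return identity


# The motif from the example: diatonically progressing eighths, played staccato.
motif = {"note_value": "8th", "articulation": "staccato", "progression": "diatonic"}
identity = compositional_identity(motif)
```

In this sketch the identity of the motif is never a bare list of properties; each entry records what was chosen together with what was excluded, which is the sense in which the identity is relative to the background.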

Competent listeners of course use concepts, beyond the musical vocabulary mentioned, that refer to the organization of the musical material in a piece: they identify a piece by virtue of the fact that it organizes material that they regard as musical material in a certain way. In Davies’s words, “if music is organized sound, to hear music as music is to hear music as displaying organization. To hear music as such is to hear it in terms of the principles of order that give it its identity as the music it is.”⁴² In a rather technical formulation, we could say: The compositional conditions of identity for a work of art K are conditions that are fulfilled in an ordered set M by possible observer-relative attributes. We identify K by characterizing K as something that is generated by bringing about some of the possibilities mentioned in M in a specific way. K thus has its identity relative to the possibilities for differentiation that exist in M. In the face of the mentioned reflections on how works of art are identifiable with the help of attributes that are able to be experienced, we are approaching an answer to the question about which attributes assist us in identifying works of art as works of art. For we could say that these attributes are not intrinsic attributes of the events or the objects that works of art bring about, but are observer-relative attributes; here, though, they are not observer-relative attributes in general, but those observer-relative attributes that are amenable to intersubjective assessment. In the face of this clarification, my thesis is now:

Table 3.1

Physical properties	Musical properties
Frequency of the acoustic events	Pitch of sounds
Amplitude of the acoustic events	Volume, dynamics of sound gradients
Temporal sequence of the acoustic events	Tempo, meter, rhythm, agogics
Duration of the acoustic events	Relative duration
Frequency spectrum of the acoustic events	Sound, harmonics

(M2) If M is an ordered set of attributes, E_i, and it is the case

a. that there is a limited set of those action types for which every instance of the action type under consideration brings about intersubjectively observable elements of a class of events or states of affairs in the world, and

b. every one of these states of affairs in the world actualizes one of the attributes E_i, then M is a medium.

If the compositional identity of a piece of music thus can be provided by characterizing a medium and providing “principles” according to which it is organized, what should motivate us to take the ability to identify a piece of music as an occasion to develop a specific theory of nonlinguistic understanding or even of thinking? Initially, there is no reason not to presume that it is clearly possible to refer to the identity of a work of art by linguistic means, and there is perhaps also no reason not to presume that composers produce the identity of a piece of music by treating musical material in accord with certain principles. Doesn’t the discussion of nonintrinsic attributes that characterize the material, but even more, the discussion of the “principles,” suggest that composers carry out their work on the basis of a musical practice that is structured by linguistically articulated norms, that is, by rules? In short, isn’t the musical practice in which musical pieces are composed and received a thoroughgoing linguistically structured practice in which things or events are what they are because there are constitutive rules that declare them to be what they are within the framework of this practice?

We could say a musical practice functions like a game. Things or events with certain attributes serve as pieces that can be moved according to certain rules; observers of the activities that actualize (or carry out) the game understand a match if they are able to describe the progression of activities as an actualization (or realization) of a sequence of moves that aims to achieve a certain goal of the game. So what reasons are there not to say that composing a minuet is in principle nothing other than achieving a goal in the game of music? What reason is there not to say that Mozart, who composed the piece that our discussion partner described so indignantly, followed the goal of composing a piece in the sonata form in which a multiform, but essentially static, theme is contrasted with a mono-motif, highly malleable theme?⁴³ The portrayal thus far suggests that artists make plans that they individuate linguistically; here, they view artistic, including nonlinguistic, media as means to achieve these plans. But accepting this analysis brings us into conflict not only with the aforementioned intuition (I₂), but also with the facts of music history.

With a view to the Mozart piece under question, in alignment with the analysis just provided, it would seem obvious that we should identify the piece, inter alia, through its attribute of exemplifying the sonata form. And, given the knowledge of the person who composed it, it would seem obvious that we should ascribe to him the intention of composing a piece that complied with the sonata form. But, first, Mozart could not have had this intention; second, knowledge of the concept of the sonata form cannot be a condition for understanding a piece of music that complies with it if we want to avoid the consequence that Mozart did not understand his music, for Mozart was simply not capable of analyzing his music with concepts of the sonata form.

Moreover, it might be said, Mozart could not have analysed much of his music, because the theoretical description of sonata form, the structural type given life in his music, was offered by musicologists only after his death. If anyone understood music, surely it was Mozart? Yet his understanding was rather applied [than] bookish.⁴⁴

Mozart’s music looks like music that is generated by molding a certain material according to given rules or prescripts, but Mozart did not know these rules, so it seems nonsensical to ascribe to him the intention of complying with these rules. On the other hand, it is also clear that composing music, which implements complex organizational patterns, can hardly be explained in reference to dispositional concepts, and it requires a reconstruction in concepts of high-level intentionality and in concepts of orderly thinking. And because it does concern orderly thinking, this suggests that we bring the concept of the rule or the prescript into play as that element on which creative thinking can orient itself.

Precisely in the face of these reflections, one does not want, with no further ado, to accept the demand to set aside the concept of a rule as a theoretical instrument for analyzing aesthetic thinking; indeed, it is perhaps not fully explained exactly how the concept of a rule is to be analyzed in this context. For one could maintain that Mozart could not follow any linguistically formulated rules, but that he followed an implicit rule or prescript, a rule he could not have explained offhand but that he could act on. But then, what should an analysis of implicit rule following look like with the help of which it is possible to understand the relationship? If we first take up the reconstruction of the more clearly laid out case of explicitly following rules, then in essence, we encounter the following criteria:⁴⁵

(RF) It holds that a person P follows a Rule R in an action A if and only if

a. R is a normative sentence that says which attributes A should have;⁴⁶

b. P is acquainted with R, i.e., R was expressed to P, or P herself expressed R;

c. P understands R and can explain R, i.e.,

i. P understands the normative character of R (the world-to-word direction of fit of R), and

ii. P understands the content of R, i.e., P knows how A must be so that it fulfills R.

d. The content of R provides a reason (and a cause) for P to carry out A in the way demanded, i.e., P accepts R as a premise of her practical reflection.

e. It is possible for P not to comply with R; i.e., the fact that P complies with R results from the fact that P accepts and wants to comply with R.

f. P acts in compliance with R.

If one is not able to analyze Mozart’s composition of music in the sonata form as specified in (RF) and thus hopes that it is possible to show it to be comprehensible as a case of implicit rule following, then the following question arises: Which of the conditions mentioned in (RF) ought to be given up on or modified in sketching out the concept of implicit rule following?

Conditions (a) and (f) are, of course, indispensable, for the fact that a person acts in such a manner that her action complies with rule R is indeed the initial condition for assuming that a rule-governed behavior exists. Admittedly, condition (f) should be reformulated so as to express that the rule R is a sentence that is formulated by the observer of P’s behavior, since it is of course out of the question that the acting individual is a person who explicitly knows R. In a definition of implicit rule following, (f) must mean that P behaves such that an observer who knows rule R can interpret the behavior of P as complying with R. The real problem for sketching out implicit rule following consists in accommodating conditions (b)–(e).

If one accepts the common conceptions of implicit rule following, whose development is motivated, in particular, by the attempt to solve what is known as the rule-regress argument—which arises when an attempt is made to explain the difference between correctly and incorrectly following rules in reference to compliance with rule following—then the general idea used to characterize implicit rule following consists in bringing into play a cognitive implementation of those capabilities that make rule-conforming behavior possible.⁴⁷ On the one hand, the basic strategy here usually consists in drawing attention to the fact that we do many rule-conforming things without calling the rule to mind or explicitly drawing on the rule; in these cases we have to count routines as causes of rule-conforming behavior. On the other hand, it consists in pointing out that rule-conforming linguistic behavior, for example, cannot even be acquired by understanding and learning to follow rules, because this already presupposes that we speak a language; so that one has to expect a nonlinguistic, that is, a nonexplicating, mechanism (training) for generating rule-conforming behavior.

However, if we grant that there are sanctioning and nonexplicating mechanisms of training that lead individuals to develop the disposition to display rule-conforming behavior for an observer (under certain conditions), then the question naturally arises concerning how rule-conforming behavior that is so conditioned by regularities can be differentiated from rule-observant behavior. If one—in contrast to Wittgenstein—does not want to give up on this differentiation, then criteria must be presented that can be shown to be plausible with a view to the behavior of a person who is potentially implicitly following rules. To do this, it would be necessary to show that P’s rule-conforming behavior B is a rule-observant behavior and that P does not know the rule R that would make the observance of B rule conforming.

Of course there are numerous activities that we regularly engage in without having learned a rule for carrying out the behavior. But not all of these things are at all suited to serve as examples for the implicit rule following that is sought. For example, under certain conditions, to chew one’s nails in a certain way is nothing more than a habit, the expression of a psychic disposition; the activity that we are looking for, however, must be immune to a complete, mere dispositional, reconstruction. The “intelligent capacities” that Ryle, in his break with the “intellectualist legend,” calls on to demonstrate implicit rule following provide perhaps a more suitable example—capabilities that we acquire through formation; here Ryle assumes that we acquire the ability for rule-conforming action through practices. It is “schooled indeed by criticism and example, but often quite unaided by any lessons in the theory”;⁴⁸ at the same time, he emphasizes that these abilities can be differentiated from habits that we acquire by training.⁴⁹ But how can this distinction be secured? How can the regularist reduction of implicit rule following be prevented? And how can it be precluded that a person only has a chance disposition for rule-conforming behavior? Ryle brings the following criteria into play, which are also of interest for our problem:⁵⁰

1. P is provided with a corrective training.

2. P is able to criticize and correct the rule-deviating behavior of another, P_i, i.e., P is able to instruct P_i.

3. P is able to modify her learned behavior in innovative ways.

Naturally Ryle would like to say that P does not only behave in conformity with a rule, but also that her behavior offers strong evidence that she is rule guided, without having access to a formulation of the rule. However, if this case is examined in detail, then we are left only with vague evidence. The first criterion ensures that we, as observers of P, can draw on a history of her behavior so that when we know P’s training history, we can assume that P’s ability to behave in conformity with rules is owed precisely to this training. If we in this way assume that P’s capability for rule-conforming behavior is acquired, then everything is dependent on what Ryle means by “corrective.” For if the corrections come about like regular occurrences of nature, why should we speak of the training as procuring the ability to (implicitly) follow a rule? From the perspective of the observer of the training history, in any case, the question arises regarding which rule the trainer is following in his sanctions. In this way, however, we just displace the question of rule following from the student to the teacher.

Of course, much depends on how complex the capabilities are that we are assessing. Greeting others with a handshake is perhaps a habit that can be successfully taught with a sanctioning training, without the rule of the greeting ever becoming explicit. But how plausible is the talk of a habit or of nonlinguistic cognitive implementation in the case of complex competencies such as multiplication, which Ryle himself provides as an example? With a view to such capabilities, one rather wants to say that the fact that we perhaps also learn complex abilities from observing practices is no argument at all against the explicitness of rules, since we are, with a view to such abilities, almost forced to assume that P explains the rules that structure the more complex activities within the framework of her lessons in the form of hypotheses that explain the sanctioning behavior of the teacher. How else can we explain that P is able to develop a sanctioning practice herself that is suited to serve as a teaching practice for P₂ if the concern is lessons in multiplication? P could indeed have acquired the disposition to sanction under certain conditions (namely, those of the deficient behavior of her student), but only for those cases in which P herself was positively or negatively sanctioned. However, because the lessons in multiplication that P enjoyed could only have a limited number of multiplication exercises, as a teacher or critic of P₂’s multiplication exercises, P must be able to draw on a prescript that enables P to correctly sanction in cases that did not arise within the framework of her education. Even if this prescript was not pointed out to P, P can only work as a competent teacher of multiplication or critic if P has developed the prescript in the course of forming hypotheses that can be applied in a potentially unlimited number of cases.
Of course, such a prescript need not take the form of a mathematical formulation, but it should describe a method that, if followed, will allow P to reliably avoid the sanction of her teacher.

Ryle’s last criterion is not much better, for with a view to the creativity that is supposed to equip P to creatively accommodate learned behavior, from the interpreter’s perspective, the problem arises of how cases of creativity can be differentiated from cases of rule-breaking behavior. Relative to a cognitively implemented rule assumed by the interpreter, creative behavior can only be identified as such if the presumed rule is not fully followed. If it is completely followed, then the behavior is not creative. It would only be possible to identify with certainty that behavior is creative if P could name a modified rule. This, however, would, in turn, be explicit, and Ryle’s criterion of creativity would remain dependent on the “intellectualist legend” that he is struggling against.

Above all, in the face of the often-confirmed fact that we do not recite rules when following them and that we do much by routine, it may appear counterintuitive to link rule following to explicit rules. However, it remains questionable how the difference between rule-following action and regular behavior can be secured without relating in one way or another to explicit rules. In my view, there is only one way of speaking of implicit rule following that allows us to accommodate this differentiation; this can be done, namely, if we introduce a division of labor between those individuals who know rules explicitly and can teach rule-conforming action and those individuals whose forms of action are rule conforming as a result of a sanctioning “teaching” practice. The following reflections form the background for this assessment: If we assume that individuals who are said to implicitly follow rules perform rule-conforming action that contingently conforms to rules, for example, because of peculiarities of the brain physiology of these individuals, then it is questionable why we should here speak of rule-conforming behavior at all. The only thing that might motivate us to do so is the fact that we can explicate a rule that the behavior conforms with. Yet, in this case we could just as well provide a natural law to explain the behavior, for if we cannot observe behavior that can be interpreted as the expression of a rule, what evidence would we have for justifiably ascribing an implicit rule? If speaking of rules is supposed to make any sense at all in such cases, then it is only under the condition that we know that the fact that people behave in conformity with rules is connected causally with explicit rules, for example, in the following way. We know or we justifiably assume that there was a generation of teachers who knew rule R and were able to explain it. 
This generation of teachers organized a sanctioning practice and placed little value on the explicit transmission of rule R. Then we have the fact that the children of the first generation of students behave in conformity with the rule without any longer having heard the rule from their poorly taught parents at all, a fact that stands in a causal relationship with the explicit knowledge of rules of the first generation of teachers. If we follow this depiction, then, as it were, we shift the explicit formulation of a rule that we, as observers of a behavior that is not caused by the explicit knowledge of the rule, accomplish—whereby this behavior is made a rule-following behavior in the first place—to individuals whose explicit knowledge is the historical cause for the behavior that we view as rule conforming. Of course, the historical framework that I have introduced here can be reduced in scale so that we can also speak of implicit rule following in the sense here in cases in which a person who behaves in conformity with a rule can no longer remember which rule was conveyed in the lesson in which the ability to behave in conformity with rules was acquired.

If, in the face of regular behavior, we want to avoid the arbitrariness of questionable assumptions of implicit rule following, then we must turn to a situation in which an explicit rule was the cause of regular behavior that we can thus describe as rule conforming. If we would like to avoid collapsing the concept of implicit rule following into the concept of regular behavior, we need recourse to a (historical) situation in which the rule was explicit. However, then the concept of implicit rule following is a historical concept that remains dependent on the concept of explicit rule following. With a view to our problem of analyzing rule-conforming composition as rule following, the following is also clear:

1. Because there was no explicit formulation of the standards for the sonata form at the time of Mozart, Mozart could not have conformed to these prescripts in the sense meant by explicit rule following.

2. Because there was not a generation of teachers before Mozart’s time whose explicit knowledge was lost but which caused the transmitted sanctioning practice, Mozart could not have transcribed the prescripts in the sense meant by implicit rule following.

3. The alternative between implicit and explicit rule following is exhaustive.

4. Thus, because Mozart’s rule-conforming composing cannot be explained either as explicit or implicit rule following, it cannot be described as rule following at all.

In view of the previous analysis, the problem that we confront is that the established methods for describing activities as intentional fail: common rationalizations of behavior confront us with the problem of the translatability of artistic thoughts and the escape promised by descriptions of implicit rule following—which were supposed to undermine the linguistic character of intentional states—has proven itself to be erroneous. However, in the face of the manifest difficulties of providing a suitable intentional description of Mozart’s composing, how could we sensibly say that it is possible to understand the product of this composition? Of the resources that it is possible to draw on to salvage an intentional description of composing, only two have been partially drawn on thus far: the history of the acquisition of the capability that made it possible for Mozart to write music in the sonata form (to be taken up in more detail in section 1 below); and a systematic investigation of what it means to understand musical thoughts (to be taken up in section 3.2.2). In what follows it will become clear that these two perspectives are related.

1. DO IT LIKE THIS: . . . !

In Ryle’s reflections, he outlined a training that he thought could transmit the ability to follow a nonexplicit rule, but if we view the training with respect to the conveyance of nonexplicit, yet explicable prescripts, then an element that is fundamental for the conveyance of artistic abilities eludes our attention. In the context of reflections on transmitting competencies for performing musical works, Joseph Kerman has pointed out that the transmission of this knowledge takes on a specific form:

A music tradition [of performance] does not maintain its “life” of continuity by means of books and book-learning. It is transmitted at private lessons not so much by words as by body language, and not so much by precepts as by example. . . . The arcane sign-gesture-and-grunt system by which professionals communicate about interpretation at rehearsals is even less reducible to words or writing. It is not that there is any lack of thought about performance in the central tradition then. There is a great deal, but it is not the kind of thought that is readily articulated in words.⁵¹

Teaching with the help of examples is not an anomaly in philosophy seminars, but in music instruction it has a different status. For the example in this case does not serve to illustrate an abstract relationship or to depict an exemplary application, but to illustrate the correct articulation of a musical thought. Before Kerman, Kant had already emphasized that the specific function of examples in music lessons is a necessary particularity of artistic instruction; interestingly, Kant also thematized this characteristic in contrast to explicit prescripts or rules that, according to his analysis, are unsuited to define the particularities of artistic action. However, because in Kant’s view the artifact character of art is also dependent on rules, he had to postulate that these rules cannot have a conceptual character of the sort that allows them to function as premises of an aesthetic judgment. The rules of beautiful art can thus not be invented rules; rather, “Nature in the subject must . . . give the rule to Art, i.e. beautiful Art is only possible as a product of Genius.”⁵² Geniuses, however, are people whose talent is suited to produce “that for which no definite rule can be given,”⁵³ that is, individuals who are inventive; it is the work of inventive people that is suited to assume the role of paradigmatic examples in art lessons. However, if we accept the difficult thought that it is nature that provides beautiful art with the rule, then what type of rule are we speaking of?

It cannot be reduced to a formula and serve as a precept, for then the judgment upon the beautiful would be determinable according to concepts; but the rule must be abstracted from the fact, i.e. from the product on which others may try their own talent by using it as a model, not to be copied but to be imitated. How this is possible is hard to explain. The Ideas of the artist excite like Ideas in his pupils if nature has endowed them with a like proportion of their mental powers. Hence models of beautiful art are the only means of handing down these Ideas to posterity. This cannot be done by mere descriptions.⁵⁴

Kant’s difficulties in reconstructing art lessons are instructive. Even in cases where the students are able to abstract the rule from the product, they are not allowed to orient themselves on an explicit rule in their act of producing; in the final analysis, therefore, Kant can only describe the lessons with psychological concepts, for all that remains for him are effects of the paradigmatic work of art. Under the presupposition of the similar proportion of mental powers, these effects should generate similar ideas in the students. In other words, Kant’s explanatory problem arises because Kant does not have a concept of understanding that is suited to describe art lessons as a process of understanding and that allows him to reconstruct this, without fissure, in an intentionalist vocabulary.

Kant’s discussion of nature as a power that provides art with the rules has broad-reaching consequences for the reconstruction of the inner experience of the artist from an intentionalist perspective. For a (brilliant) artist need not

1. be able to describe how she generated a work of art;

2. know the way that ideas are related to a work of art;

3. be capable of artistic productivity according to a plan;

4. be able to convey to others, in the form of precepts, how one creates a work of art.⁵⁵

If, however, all of this knowledge cannot be expected from an artist, then it is clear that an artist has no other means of teaching than her own works (or those of others), and it is also clear that, according to Kant, there are limits to the ability to teach artistic competencies. Indeed, it is possible to learn everything that Newton presented in his principles of the philosophy of nature, “but we cannot learn to write spirited poetry.”⁵⁶ Regardless of whether one accepts Kant’s characterization of an artist as “him whom nature has gifted,”⁵⁷ it is very true that “no master has fallen from the skies”; we do assume, for example, that even Mozart could not have composed his pieces without having learned something. Yet, what did Mozart learn from his applied lessons? If one views music lessons with a temporal standard that we also have in mind with a view to language acquisition, then we can certainly say that Mozart, like most schoolchildren today, learned to sing scales, to keep time, to beat simple rhythms, etc.; indeed, he learned this on the basis of the examples of someone who performed these acts. If it turns out that a student has difficulties with the transmission of one of these basic musical competencies, then, by exaggerated singing or performance, the teacher can normally indicate the attributes that were lacking in the student’s rendition. In doing this, the teacher hopes, by the exaggeration, to draw the student’s attention to precisely those attributes, for she normally could not point to a rule or a precise prescript that can help the student (“Sing the third tone 80 percent higher!”) nor could she count on achieving success with such a prescript.

If the student manages to internalize the elementary ability to produce tones at a relatively constant pitch and at relatively constant intervals, the teacher indeed goes on to focus on conveying more complex abilities, but the principle of the lessons remains the same. This principle is: “Do it like this . . . !” And to define “this,” the teacher has at her disposal the technique of exaggerated singing or performance and a repertoire of facial expressions, gestures, and nearly dancelike movements, and now and then also the suggestive power of linguistic comparisons (“Do it like you were an old tired horse that is pulling a heavy cart!”).⁵⁸ The objective of these techniques is to structure the playing of the student and to affect or anchor her listening, and indeed to do so by means of a demonstrative structuring by example or through the appeal to structuring abilities that the student, on the basis of her own experiences, has already acquired.

An elementary music lesson is thus largely free of the transmission of explicit rules and essentially consists in the conveyance of abilities to generate musical events in accord with exemplary patterns and to hear acoustic events as musical events. The latter consists, above all, in the ability to organize the flow of acoustic events according to types of activity that one learned oneself in music lessons. It means that one can sing along with a musical performance. If, for the sake of clarity, we accept that the teacher has limited herself merely to familiarizing the student with an extremely reduced form of music (4/4 time, two octaves of the C major scale, two tone durations, namely the half and the quarter notes, and two dynamic levels, namely loud and soft), then a student of such elementary lessons has had successful lessons if she is able to perform within the framework of the possibilities that emerge from these combinations. In the case of this paltry music, with the help of twenty elementary types of activity, such a student would be able to perform a total of sixty types of activities that generate musical events.⁵⁹ A student who has completed this paltry class has not learned any rules, but has acquired know-how that enables her to perform a set of standardized actions; she has acquired a disposition to hear music as a performance of those types of action, a disposition that can rightly be understood as a part of one’s second nature; from now on it will be nearly impossible for the student to hear music as a mere acoustic event.

Now let us suppose that this analysis (irrespective of the paltry nature of the lessons) is somewhat instructive. What have we achieved that surpasses Ryle’s examples? In contrast to Ryle’s examples of a game of chess or of multiplication, we can well imagine that not only the teaching but also the musical practice can manage without rules, even if rules play a large role in our musical tradition. While chess games and multiplication, even if they are taught without reference to rules, are activities that could not arise without linguistic rules, and in the best case, these can be analyzed according to the model of implicit rule following (described above), there is no reason not to imagine that people transmit a music culture without rules guiding it.

Doing this, however, only indicates the first aspect of the difference between chess matches or multiplication and making music. Let us compare the following two cases. Even if we doubt for good reason that there are chess players who would be “instructed” according to a Rylean lesson plan, and in this process would develop a disposition that makes them capable of playing chess, we could imagine, for the sake of argument, a chess player of this sort and assume that this dispositional chess player makes a mistake in the course of a match. In the second case we ought to imagine a dispositional singer, that is, someone who has completed our elementary lessons, making a mistake in the course of a melody by singing a note that isn’t in the scale of her music tradition. If we compare these cases with respect to the possibilities that witnesses to the events have available to criticize these mistakes, then we find an interesting difference.

The witness of the chess game might say: “The knight can only be moved so and so, but not the way you have moved it. Your move isn’t allowed!” And the player who initially took back his move might again make a false move and look to his critic with a nod. The critic then could insist on his criticism and finally tell the player what he doesn’t know: “The knight can only be moved from the field in which it is placed to another field that it can reach if the following rule is followed: ‘Move one field straight and one diagonally and increase at every step the distance from the position you started in.’” And he might add this: “The color of the field occupied by the knight changes after every move.” What the critic does if the mere reference to the false move is insufficient is nothing other than explicitly state the rule that applies to the figure in question. A critic of someone who multiplied falsely would hardly proceed differently; for if a dispositional multiplier makes a mistake, then the remark “No, three times five is not sixteen, but fifteen!” will hardly achieve anything. Her statement would counter the statement of the one doing the multiplication. But how does the critic know that she is right, and how can she clearly demonstrate to the person doing the multiplication that she is correct? She will name a prescript that the person doing the multiplication does not know, but that leads to results that conform with those of the person doing the multiplication fairly often. She will say, “If you want to multiply two numbers, then note in a row the number of lines corresponding to the first number. The second number then indicates how many rows you have to fill with this number of lines. . . . When you are finished doing that, count all the lines. 
The number of lines of all the rows is the correct result of the multiplication problem.”⁶⁰ Knowledge of such a prescript assures the critic that her criticism was justified, and it forms the basis of her criticism, for with its help the critic can ensure that her own result is correct.
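The critic's tally prescript can be written out as a short procedure. The following is only an illustrative sketch in Python; the function names `multiply_by_tally` and `check_claim` are hypothetical and serve merely to show how a prescript of this kind lets the critic verify a result in cases that never arose in her own lessons.

```python
def multiply_by_tally(first: int, second: int) -> int:
    """Multiply two counting numbers by drawing rows of lines and counting them."""
    if first < 0 or second < 0:
        raise ValueError("the tally prescript is only described for counting numbers")
    # One row holds as many lines as the first number indicates.
    row = "|" * first
    # The second number indicates how many rows are filled with that many lines.
    rows = [row for _ in range(second)]
    # Counting all the lines of all the rows yields the result.
    return sum(len(r) for r in rows)


def check_claim(first: int, second: int, claimed: int) -> bool:
    """Return True if a claimed product agrees with the tally count."""
    return multiply_by_tally(first, second) == claimed
```

With this sketch, `check_claim(3, 5, 16)` is false and `check_claim(3, 5, 15)` is true, which corresponds to the critic's correction in the example.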

But what rule or prescript does the critic of the false singer draw on? In most cases, on none at all! Let us assume that the tone that the singer is supposed to sing in order to remain within the parameters of the key that has been transmitted is a fifth higher than the directly preceding note: what prescript could the critic state to assure herself, but also to provide a basis for her criticism? She could sing the tone correctly, but as a rule she would not be able to state a prescript that she has followed in doing so. In musical cultures like ours, which are accompanied by a long history of the scientific study of its physical foundations, the critic could repeat the Pythagorean view of the tetrachord and point to the pitch relationships of strings. She could say that a tone stands in the relationship of a fifth to another one if the lengths of the strings that produce the tones are related to each other like two to three. But the ability to use this procedure is a particularity of our music culture. In cultures that make music without recourse, or the possibility of recourse, to a music theory, the critic has no choice but to say: “The way that I am doing this is right!” Why? “Because that’s how we sing!” However, this means that the critic cannot legitimate or base her criticism on a prescript; rather, just like the teacher, she can only draw on an act of deixis. The final authority for her criticism consists in the actions of those in her culture who are thought to know how to do it. She can only point to how they do it. Beyond the existence of correct performances, which form the elements of a system of deictic references, in music cultures there is no authority for rightness.
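For contrast with the deictic case, the string-length prescript that is available in a culture with a music theory can be stated exactly. The sketch below is illustrative only: the names are hypothetical, and it assumes idealized strings of equal tension and thickness, for which the frequency is inversely proportional to the length.

```python
from fractions import Fraction

# Length of the string sounding the upper tone relative to the lower one.
FIFTH_LENGTH_RATIO = Fraction(2, 3)


def string_length_for_fifth_above(length: Fraction) -> Fraction:
    """Length of the string that sounds a fifth above a string of the given length."""
    return length * FIFTH_LENGTH_RATIO


def is_fifth(lower_length, upper_length) -> bool:
    """Check whether two string lengths stand in the relationship of two to three."""
    return Fraction(upper_length) / Fraction(lower_length) == FIFTH_LENGTH_RATIO


def frequency_ratio(length_ratio: Fraction) -> Fraction:
    """For idealized strings the frequency ratio is the inverse of the length ratio."""
    return 1 / length_ratio
```

Under these assumptions a string of length 90 has its fifth at length 60, and the corresponding frequency ratio is three to two. A critic without such a theory has no comparable check and can only sing the tone.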

What makes the case of the elementary music lesson theoretically interesting is thus the particularity that making music can be a practice in which there is right and wrong, but the right (and wrong) can only be demonstrated in deictic actions. However, this also means that the articulation of what is right remains dependent on its performance; and for this type of rightness, a description of rightness with the help of prescripts is not constitutive. We thus should distinguish between two forms of rightness:

(R1) If you (in the context of practice X) want to do it right, then do it like this . . . [an instantiation of the type of activity follows]!

(R2) If you (in the context of practice X) want to do it right, then do it in such a way that you conform to prescript P!

For my reflections, what is most important here is that the articulation of what is right itself need not assume a linguistic form. (R1) of course makes use of linguistic means of reference and the mention of what is right, but what makes what is right right is not characterized linguistically, but is itself demonstrated by means of a nonlinguistic medium. Here it is important to see that formulations of the type (R1) are not common rules. For, by virtue of their deictic components, sentences in the form of (R1) are always connected to contexts in which a right action is performed; although they apply for certain contexts of application, they lack the independence from contexts of expression that is characteristic of rules. Nevertheless, the deictic illustration of what is right can be understood as a form for making rightness explicit, albeit with the peculiarity that language here only has the function of ensuring the deictic relationship and not of describing what is right or of explaining it.⁶¹ If we assume a deictic form of explicitness in which the performances of the actions demonstrate rightness, we can adhere to the view, from the idea represented above, that rules are of necessity explicit (or in cases of historically implicit rules, must have been explicit); now, however, the analysis of practices in which there is indeed a “right” and a “wrong,” but there are no explicit prescripts, no longer presents a theoretical challenge that makes the opaque concept of implicit rule-following attractive. Before I, in what follows, examine the consequences of these insights for the understanding of nonlinguistic works of art, I would like to draw on the returns of the previous reflections for the further development of our still quite rudimentary media-theoretic vocabulary.

First, the vague criterion (M1)⁶² can be made more precise. For one, in the course of the reflections it has become clear that the behavioral possibilities are types of activities; further, it has become clear that they are activities that can be learned. (M1) thus should be replaced by (M3).

(M3) Every medium includes a limited number of elementary types of activity, understood as in (M2), which can be learned.

In the case of our elementary musical medium, there would be precisely sixty such types of activity; as long as there is, for example, no possibility to bring about a musical event that has a certain length and a certain dynamic value but that lacks pitch (as in the case of many percussive tones), only combinations of dynamic, duration, and pitch—attributes that can be isolated from the interpreter’s perspective—will bring about elementary musical events. In order to make an independent term available for such events, I would like to propose the following definition:

(M3.1) States of affairs in the world or events, understood in the sense of (M2), which arise or are generated by the performance of types of elementary medial activity, are called media elements.

In the reduced form of music of the fictive elementary music lessons, every musical event is an implementation of a media element, and each of these media elements can be identified by three attributes, which are attributes from three discontinuous attributive sets (fifteen pitches, two dynamic levels, and two tone durations). In any halfway developed music—especially, however, in other forms of art like painting or sculpture—there are events or states whose attributes cannot be chosen from discontinuous attributive sets, but must come from a continuum of possible attributes. In contrast to the medial elements that can be chosen from discontinuous attributive sets, the difference between two medial elements in regard to such an attribute can be arbitrarily small so that, between two arbitrary elements, a third can always be found. So, for example, between two tones whose dynamic attributes might be selected from a continuum, it would be possible to bring about a third, which is louder than the quieter one and quieter than the louder one. In other words: if x and z are arbitrary tones, in a musical medium in which the dynamic (D) is a continuous attribute: ∧x ∧z([D(x) > D(z)] ⊃ ∨y[D(x) > D(y) > D(z)]).⁶³ In contrast to our elementary musical medium, media in which medial elements can have attributes that are instantiations of possibilities in an attributive continuum allow the formation of an unlimited number of types of medial elements.
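The count of sixty and the contrast with a continuous attribute can be checked in a few lines. This is a minimal sketch under stated assumptions: the pitch labels (C4 up to C6) and all names are hypothetical, and exact fractions stand in for a continuous dynamic scale, which suffices to show that a third value always lies between two given ones.

```python
from fractions import Fraction
from itertools import product

# Assumed labels: two octaves of the C major scale, from C4 up to C6.
PITCHES = [f"{name}{octave}" for octave in (4, 5) for name in "CDEFGAB"] + ["C6"]
DYNAMICS = ["loud", "soft"]
DURATIONS = ["half", "quarter"]

# Every media element is one combination of pitch, dynamic level, and duration.
MEDIA_ELEMENTS = list(product(PITCHES, DYNAMICS, DURATIONS))


def dynamic_between(d_x: Fraction, d_z: Fraction) -> Fraction:
    """Given D(x) > D(z), return a value D(y) with D(x) > D(y) > D(z)."""
    if not d_x > d_z:
        raise ValueError("expected D(x) > D(z)")
    # The midpoint is one possible witness for the tone y in the formula.
    return (d_x + d_z) / 2
```

The three discontinuous sets yield 15 × 2 × 2 = 60 elements, whereas `dynamic_between` can be applied again to its own result without end, so no finite list of dynamic values exhausts the possibilities.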

The definition (M3.1) primarily has an economic value, for it allows a somewhat less cumbersome manner of speaking; this facilitates the use of it in reference to the products that implement the types of activity, which in many cases interest us, especially as recipients, more than the productive activities. However, (M3.1) also underlines the dependency of the states or events on the implementation of types of activity; this serves to prevent one from mistakenly thinking that these states or events can be identified as such medial events or states independently of their genetic relationship to the practice in which they are generated. In this way, the medial elements stand in a dual relationship to social practices, for, from the perspective of an interpreter or recipient, they are identified with a view to certain nonnatural, nonintrinsic, or observer-relative attributes (see above, pitch versus frequency) and with a view to a certain production history, which expresses the artifact character of the media elements. This precludes a “C” note produced in the desert by the wind blowing over an empty bottle from being a medial element.

With definitions (M3) and (M3.1), however, we still do not have the conceptual means that would allow us to describe music making as medial action, for it is of course clear that bringing about an occurrence of a type of musical activity (singing a quiet G with a half note duration) or bringing about a medial element is not yet making music. However, that someone articulates herself in a medium can be expressed under the presupposition of (M3) as follows:

(M4) To articulate oneself (or something) in a medium means that one brings about performance sequences of the type of activity that is specific to this medium.

Someone who has learned to move within the parameters of the inventory of the activity types thereby acquires the possibility of combining these types of activity; although this person may not possess a linguistic individuation mechanism for these types of activity, it is clear that the person’s disposition to sing the tone at a certain (right) pitch is the prerequisite for the tone being among the materials that the person has to choose from. Further, it would be useful to have a non-media-specific concept for the product that is brought about by a sequence that instantiates medial possibilities.

(M5) The product of the performance of a sequence of instantiations of media activity types is called a medial constellation.

(M5) ought to accommodate the fact that while all medial products are brought about by sequences of actions, these products themselves do not necessarily have a sequential character. For while in the context of music (or dance), an arrangement of medial elements always has a temporal and thus sequential character, in painting and sculpture this is not the case. With recourse to (M3.1), we can also say:

(M5.1) Every arrangement of medial elements forms a medial constellation.

In a provisional last step in the development of a media-theoretic vocabulary, we can now introduce an expression that allows reference to the possibilities that are provided by a medium for bringing about a medial constellation.

(M6) The set of all possible constellations in a medium M is called the scope of possibilities for M.

The size of this set is, of course, only different from infinite if there are, for example, in our case of elementary music, constraints on the length of the songs that can be sung in the framework. The function of the concept of the scope of possibilities, however, is not to organize comparisons of the size of the media; rather, it is to provide a concept for what is allowed in a medium, which, as the previous reflections have shown, does not require explicitly formulated criteria.
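How quickly the scope of possibilities grows, and why it is finite only under a length constraint, can be made concrete with a small calculation. The sketch assumes, purely for illustration, that a constellation in the elementary musical medium is a nonempty sequence of media elements and that every such sequence counts as allowed; the function name `scope_size` is hypothetical.

```python
def scope_size(element_types: int, max_length: int) -> int:
    """Count the nonempty sequences of at most max_length media elements."""
    if element_types < 1 or max_length < 1:
        raise ValueError("expected at least one element type and one position")
    # There are element_types ** k different sequences of exactly k elements.
    return sum(element_types ** k for k in range(1, max_length + 1))
```

For the sixty elements of the elementary medium, songs of at most three tones already give 60 + 3,600 + 216,000 = 219,660 constellations, and the count grows without bound as the permitted length increases.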

What have we now gained from the analysis of the elementary music lessons and the specifications of the terminology? Are we nearer to the goal of explaining what it means to have, or to understand, a musical thought? And above all, does this analysis provide us the theoretical means to explain which thoughts Mozart articulated when writing a sonata phrase? Let us thus turn back to the problems that arose in the talk after the concert. In the framework of this discussion, we came up against the problem of explaining the presuppositions that a recipient must fulfill so that we are assured she understands a piece of music.

With the help of the terminology that has been introduced, the criterion—namely, that the recipient must be capable of relating to the identity of the piece of music—can now be specified since we can now say that a recipient must be able to relate to the piece of music that was listened to so that she heard it as a medial constellation, in other words, that she heard the performance as an arrangement of medial elements. Beyond this, the recipient has to hear the performance not only as just any medial constellation whatsoever, but as a specific one: that is, she must not only have a more or less well-articulated hypothesis about the scope of possibilities against which the piece was carried out, but she must also have a hypothesis about the specifics of the sequence of choices that the medial constellation can be depicted as being the product of. If a recipient manages to relate more or less to the position that a work assumes as a medial constellation in a medial scope of possibilities, then she succeeds in referring to the compositional identity of the work. The idea of explaining the conditions necessary in order to understand works of art with the help of a concept of compositional identity can then provisionally be specified by means of media theory as follows:

(CI₀) It is possible to relate to the compositional identity of a medial expression by linguistic means by describing a work as a specific medial constellation within the scope of possibilities of a medium.

But to what degree do the preceding reflections contribute to reconstructing artistic activity as actions and to understanding nonlinguistic expressions? In short: to what degree do our reflections do more than merely redescribe the problems in media-theoretic terms?

1. If media provide sets of basic activity types, then the sequential performance of such activities has a different status than the performance of mere dispositional activities. This status allows these activities to be closely related to actions; however, in the process of providing explanations, we need not appeal to reasons in order for these to count as actions. That these activities have the character of actions is not dependent on the common intentional vocabulary. This is for the following reasons:

a. One can succeed or fail to bring about a type of activity. Consequently, rightly bringing about some type of activity is subject to specific normative ideas (Vorstellungen), which need not be linguistically articulated, but which can be made explicit through acts of deixis. Although the normativity that is in play here does not manifest itself in explicit linguistic prescripts, it is not reducible to the functional level. For we do not explain what correctly carrying out a type of activity is—as in the case of types of behavior that have arisen in the course of evolution—with a view to a function that this type of activity commonly has. A factual but contingent practice constitutes the basis of the normativity of the practice, not its function. (Do this the way that we do!)

b. Bringing about a sequence of types of activities that we have learned is different from performing dispositional activities under certain environmental conditions, because each time such an activity is carried out, it can be described as a choice against the background of alternative activities. The act of carrying out this activity, however, does not appear to be a choice for just the media-theoretic interpreters of the action, but also for the persons who carry it out; they themselves perform the activity with knowledge of alternatives, with knowledge of its arbitrariness, even if, under certain circumstances, they cannot provide a reason for their choice. With the help of this attribute, the performance of medial activities can be interpreted as action if we understand action, in a basic sense, as chosen activity. This, however, provides a fundamental concept of action that can form the foundation for an action-theoretic reconstruction of artistic activity insofar as it does not bring any mandatory propositional implications into play.

2. With the still vague concept of compositional identity, means emerge for an identifying reference to nonlinguistic expressions; with the help of this, it is at least possible to articulate necessary conditions for understanding.

Hereby, of course, only the first steps are taken in reconstructing artistic action and in reconstructing what it means to understand nonlinguistic expressions. However, the central question still remains open: namely, what might it mean to think or understand a nonlinguistic thought? It is indeed clear that in the case of understanding linguistic expressions, too, we must pay attention to the structuring and the analysis of expressions. However, our concern is not the grammar of the speaker’s language; our concern is to understand her thoughts. Herewith I come again to the above-posed question (see p. 155): What might it mean to understand a piece of music? In the following section I will first attempt to create a place in the philosophy of mind for the idea that there are thoughts whose content and identity are not dependent on their position in an inferentially organized network of propositions.

3.2.2. Revisions in the Philosophy of Mind

3.2.2.1. Can Mind Think Without Language?

Davidson has claimed that we should only ascribe thoughts to beings that we can ascribe beliefs to; beyond that, he has attempted to show it to be plausible that the only beings that can have beliefs are those beings that can refer to states of being convinced of something. This was understood to mean that they are able to view the propositions that indicate their respective beliefs as true or false. Only if a being is able, with the help of such second-order beliefs, to draw a distinction between (subjective) beliefs and objective truths does it make sense to ascribe to this being the ability to have beliefs. However, within the framework of an interpretationist theory of mind, this ability should only be ascribed to beings if they possess a sufficiently complex language, which they use to contrast their subjective beliefs with objective facts within an intersubjective linguistic practice, and in this way refer to an intersubjective truth. It is clear that, in these reflections, Davidson interprets the criteria for thoughts very narrowly, too narrowly in the view of most naturalists, who do not want the discussion of contentful states to be reserved for thoughts as Davidson understands them. However, even in the face of this well-motivated reservation, which my concept of A-intentional states in part attempts to accommodate, I do not want to maintain that Davidson underestimates the importance of such states for the philosophy of mind; rather, I want to argue that more types of mental states are contentful for the beings that have them than his theory allows.

With the help of the previous discussion of higher-level intentional states (thoughts), I have attempted to show that even if Davidson’s analysis about those intentional states that are beliefs is correct, this by no means precludes other intentional states from having the status of thoughts, which while not being propositional attitudes are still correlated, like propositional attitudes, to an expressive activity that is sufficiently complex (for thoughts). As already noted, thoughts are individuated by beings who have them; they are not simply discovered or produced by a representational machinery. However, if we expect that beings refer to their thoughts by individuating them, then we obviously require a second-order relation, since a being’s reference to its thoughts can only be explained by indicating that the being draws on intentional states with the help of other intentional states. Put differently, a being can only have (nonlinguistic) thoughts if it has thoughts about thoughts. If this is correct, then how is it possible to explain the second-order relation—which Davidson, in reference to beliefs, explains with the help of the truth predicate—with a view to nonlinguistic thoughts?

If we once again call to mind the chief attraction of Davidson’s analysis, then it becomes clear that Davidson can manage both the problem of the individuation of thoughts and the problem of second-order reference in one theoretical vocabulary. For, with the help of the truth predicate, he can explain both the identity of a proposition as well as the ability of a being to behave with regard to this proposition. Because Davidson does not reckon with nonlinguistic thoughts and thus is able to deal with thoughts analogously to propositions, he entrusts the individuation of thoughts—parallel to the individuation of propositions—to their position in a net that stretches through logical (i.e., truth-functional) relationships:

Thoughts, like propositions, have logical relations. Since the identity of a thought cannot be divorced from its place in the logical network of other thoughts, it cannot be relocated in the network without becoming a different thought.⁶⁴

Beyond this, however, Davidson can also make a being’s ability to have a thought dependent on whether this being can distinguish between two scenarios: the scenario of possessing a logically individuated propositional attitude and the scenario of knowing whether this attitude is true or false. If higher-level, nonlinguistic intentional states cannot be introduced as truth-apt states, the thought of relational identity, which integrates Davidson’s conception, has to be modified so that the individuating power of the relations is made independent of the concept of truth. With the concept of compositional identity I have thus far only introduced a working title for an identity principle with the help of which it should be possible to explain the individuation of nonlinguistic thoughts; however, insofar as the debate about the problem of second-order relations is dependent, from the view of theory formation, on a solution to the problem of individuation, then, first of all, a robust view of the individuation of nonlinguistic thoughts must be developed.

3.2.2.2. An Identity Principle for Nonlinguistic Thoughts

If one attempts, with the help of the principle of relational identity, to check whether two linguistic thoughts, T₁ and T₂, are identical, then the principle of relational identity can be formulated as follows:

(RIT₁) A propositional state or a propositional thought T₁ is identical with a state T₂ if and only if T₁ and T₂ have the same position in the network of propositional thoughts of a person P₁.

Among other things, that means that T₁ and T₂ have the same truth conditions, but generally that T₁ and T₂ can be substituted for one another without the identity of the thoughts T_n and T_m changing, with a view to which T₁ and T₂ have been individuated. Here what is interesting about (RIT₁) is that the identity of a thought obviously cannot be determined without there being other thoughts whose identities, for their part, refer to the identity of other thoughts.⁶⁵

Because, from the interpretationist perspective, the identification of a position in the network of propositional states is the result of an interpretation—and self-interpretation has no primacy in this schema, but in contrast, self-interpretation is explained according to the model of interpreting others—it must be possible to reformulate (RIT₁) so that (RIT₁) provides the conditions for identifying two different persons’ thoughts, the person being interpreted and the person who is interpreting. For the interpretation of others, the relational identity criterion assumes the following form:

(RIT₂) A propositional state or a propositional thought T₁ of a person P₁ is relationally identical with a state T₂ of P₂, which interprets an expression E₁ from P₁ if and only if T₂ has approximately the same role in the network of propositional thoughts from P₂ as T₁ has in the network for propositional states from P₁ if P₂ thinks that T₂ is true.⁶⁶

With a view to interpreting others, above all, it is clear that the access to a thought is only possible by means of expressions; however, because of the primacy of the interpretation of others, this also applies to self-interpretation. An interpreter correlates an expression of the one being interpreted with a metalinguistic specification of its truth conditions; hereby she integrates it into the network of her beliefs, which, however, the interpreter in essence must at the same time read into the one being interpreted. What both of the provided versions of the relational identity criterion formulate for thoughts must thus have an equivalent at the level of the expression.

(RIE₁) A propositional expression E₁ from P₁ is relationally identical with an expression E₂ from P₂ if and only if E₁ and E₂ can be substituted for one another when expressed under sufficiently similar circumstances, thus that E₁ and E₂ can play the same role in processes of understanding (in similar situations).

Two persons who think relationally identical propositional thoughts have the disposition, when making utterances under sufficiently similar circumstances, to articulate relationally identical expressions. Here the identification of type-identical expressions can initially be ceded to an ability to understand types that, for its part, introduces no further burdensome presuppositions into the concept of relational identity.⁶⁷

In the face of these criteria, what view now emerges about medial expressions and the mental states that are correlated with them? What must an analogous catalogue of criteria for nonlinguistic thoughts look like if it is to provide content to the provisional discussion of compositional identity? In order to answer these questions, it makes sense to develop the criteria, beginning with the expressions, because at the level of the expressions no ideas about the constitutive attributes of nonlinguistic thoughts need yet be brought into play; rather, restrictions for the characterization of these attributes can be obtained. An obvious reformulation of (RIE₁) for nonlinguistic expressions is:

(CIE₁) A (nonlinguistic) expression E₁ from P₁ is compositionally identical with a (nonlinguistic) expression E₂ from P₂ if and only if E₁ and E₂ can be substituted for one another when the conditions under which the expressions are made are sufficiently similar.

The substitution criterion that can, with a view to linguistic expressions, take recourse in a set of criteria, including truth invariance, requires different proof criteria for nonlinguistic expressions that, however, must transcend the presupposed type identification by perception that is active in both cases. In order to develop an independent concept of compositional identity and to specify the type of identity that might exist between the expressions on this side of the relational identity, it makes sense, first of all, to compare linguistic expressions that have identical attributes in certain respects, but that do not fulfill the established criterion of relational identity.

1. Someone from England says, “You owe me a billion pesetas.”

2. Someone from America says, “You owe me a billion pesetas.”

The expression “billion” is produced in both cases in the same way, but with a view to the relational position of the sentence in the context of British English or American English, there is a difference of at least 999 thousand million—or billion—pesetas. Thus, the two expressions have quite different truth conditions. Both sentences sound the same, but the identification by means of sense perception is not sufficient for compositional identity, as a look at the expression of the philosophically indispensable parrot shows:

3. The parrot Alex says: “Alexcanspeak!”

4. Tim, Alex’s caretaker, says: “Alex can speak!”

The fact that we perceive Alex’s expression and the expression of Tim as type identical does not imply their compositional identity, for we cannot presume that Alex synthesizes his expression from syllables that, as linguistic unities, are at the basis of Tim’s expressions. Perhaps the difference comes out more dramatically in the following example:

5. The English Garden in Munich.

6. An extraction from nature that by chance looks exactly like the English Garden in Munich.

A comparison of the three pairs of examples shows that only the first case has compositional identity; only in the first case can we presume that the type identity of the expressions is a result of the type-identical production of the expressions. The speakers draw from an established inventory of sounds, from which they form their expressions.⁶⁸ From the perspective of the recipient, the compositional identity is not based solely on identification by perception; it is based on identification of something as something. The conditions for the possibility of such a something-as-something identification can now be theoretically used to specify the conditions under which nonlinguistic expressions can be substituted for one another. However, here we must dispense with inferential relations because, given the lack of the truth aptness of the products of nonlinguistic articulation, truth-functional relations are not available to determine identity. However, if we return to the media concept that has been introduced, we find the theoretical resources with the help of which it is possible to provide conditions for substitution. For with the aid of the media concept, a scope for the possible types of action can be fixed, formed by varying and combining elementary types of action. In this scope of action a nonlinguistic expression has an identity:

(CIE₂) A (nonlinguistic) expression E₁ is compositionally identical with a (nonlinguistic) expression E₂ if and only if E₁ and E₂ can be described such that two persons P₁ and P₂ have generated the expressions by following the same sequence of choices (from a hypothetical inventory of possible choices).
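The contrast between identification by perception and compositional identity can be made concrete in a small sketch. It is only a toy model under stated assumptions and not part of the argument itself: the class `Expression`, the two inventories, the functions `perceptually_alike` and `compositionally_identical`, and the second speaker "Ann" are all hypothetical, and a sequence of choices is simply encoded as a tuple of elements drawn from an inventory.

```python
from dataclasses import dataclass
from typing import Optional, Tuple

# Hypothetical inventories (assumptions for illustration only).
WORDS = frozenset({"Alex", "can", "speak", "sing"})   # a speaker's units
PARROT_CALLS = frozenset({"Alexcanspeak", "Hello"})   # unanalyzed calls


@dataclass(frozen=True)
class Expression:
    producer: str
    # Sequence of choices the expression can be described as resulting from;
    # None when no such description is available (e.g. a chance product).
    choices: Optional[Tuple[str, ...]]
    # Inventory against which each choice was made, if any.
    inventory: Optional[frozenset]
    # What a recipient perceives.
    sound: str


def perceptually_alike(e1: Expression, e2: Expression) -> bool:
    """Type identity by perception alone."""
    return e1.sound == e2.sound


def compositionally_identical(e1: Expression, e2: Expression) -> bool:
    """Toy reading of (CIE2): same sequence of choices, same inventory."""
    if e1.choices is None or e2.choices is None:
        return False
    return e1.inventory == e2.inventory and e1.choices == e2.choices


tim = Expression("Tim", ("Alex", "can", "speak"), WORDS, "alexcanspeak")
ann = Expression("Ann", ("Alex", "can", "speak"), WORDS, "alexcanspeak")
alex = Expression("Alex", ("Alexcanspeak",), PARROT_CALLS, "alexcanspeak")
chance = Expression("nature", None, None, "alexcanspeak")

print(perceptually_alike(tim, alex), compositionally_identical(tim, alex))
print(perceptually_alike(tim, ann), compositionally_identical(tim, ann))
print(compositionally_identical(tim, chance))
```

On this encoding, Tim's and the parrot's expressions are perceptually alike but not compositionally identical, because they cannot be described as the product of the same choices from the same inventory; the chance product has no such description at all.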

If the compositional identity of two nonlinguistic expressions legitimizes their reciprocal substitution insofar as it formulates necessary and sufficient conditions for an acceptable substitution, then, with the concept of compositional identity, a concept is also available with the help of which it is possible to determine the identity of those mental states that are causes for the articulation of nonlinguistic expressions, and indeed in the following way: In compliance with an interpretationist perspective, the explanation of the criteria of relational identity leads, first of all, from the level of thoughts to the explanatorily more fundamental level of articulation. If (CIE₂) is an acceptable criterion for the compositional identity of an expression, then in reversing this in a second step, it is possible to move on from the level of nonlinguistic articulation to the formulation of a criterion for nonlinguistic thoughts. Such a criterion comes into purview if we assume that nonlinguistic thoughts indicate dispositions to bring about nonlinguistic expressions.

(CIT₁) A (nonlinguistic) thought T₁ is compositionally identical with a (nonlinguistic) thought T₂ if and only if T₁ and T₂ dispose a person P₁ (or a number of persons) to effectuate compositionally identical expressions.

If the compositional identity of the expressions can be explained in reference to the idea of choice from among a scope of possibilities, then the identity of the thoughts can be explained by employing the idea of options; from an interpretationist perspective, the compositional identity of the observable results of action is the only possible basis for the compositional identity of the dispositional causes and expressions that are being ascribed. If we assume that the alternative forms of articulation are kept in existence by social practices, a criterion for the identity of nonlinguistic thoughts can be formulated as follows:

(CIT₂) A (nonlinguistic) thought T₁ is compositionally identical with a (nonlinguistic) thought T₂ if and only if T₁ and T₂ occupy the same position within the scope of possibilities of the medium.

On the basis of this quite abstract discussion of the identity of nonlinguistic thoughts, how does one now explain what it means to think a nonlinguistic thought? Or, with a view to the example above: what does it mean to think a musical thought? An answer to this question, for its part, can be given in an analogy to linguistic thoughts. If we ask ourselves what characterizes that state of thinking a linguistic thought, we can say that thinking a thought disposes a thinking person to act in a certain way, which is connected to the content of the thought. Someone, for example, who is convinced that the water in his glass is poisoned will not drink it if he does not want to die, and so on. However, besides such consequentialist implications of the belief, which can be expressed in numerous ways, the person certainly has a disposition to express a sentence that provides the content of his belief. Someone who has a thought has this thought by virtue of expressing it internally. In short, linguistic thinking is inhibited speaking. To think a musical thought means, in strict analogy to these reflections, to be disposed toward a musical expression, which means to imagine or to inhibit a musical expression. In contrast to a linguistic expression, which can be translated into a language different from the speaker’s, a work of art, which articulates a musical or more generally a nonlinguistic thought, can only be reformulated with the help of compositionally identical expressions. Nothing beyond the compositional identity of such a thought is translatable. We only have access to its content by thinking it ourselves by “quietly (i.e., silently) singing” it.

If we call to mind how conductors or members of small musical ensembles attempt to familiarize other musicians with a musical thought, then we see that competing interpretations of a work lead to different musical thoughts, and their best expression is, in each individual case, apparently of the following sort: “Play daa-dad-daaa-da-ratatataaaa-dam.”

Before I attempt to develop a clear, more graphic view of the content of nonlinguistic thoughts, I would like to generally characterize the type of intentional state that we are dealing with if my analysis of nonlinguistic thoughts is correct, and I would like to suggest a more complete integrative depiction of the mental.

In the examination of Searle and Davidson, I earlier⁶⁹ suggested characterizing a sort of intentional state (A-intentional states). While these indeed have content, their individuation remains dependent on current or represented conditions of satisfaction; in addition, they have no content for the beings that find themselves in such states (as long as these beings only have A-intentional states). I contrasted these intentional states with higher-level intentional states (thoughts), the most prominent representatives of which are, without a doubt, linguistic thoughts, thus propositional attitudes. Under the presupposition of the reflections on the identity of nonlinguistic thought, however, the theoretical means are also now available that allow us to show that there is a reason to include nonlinguistic thoughts in the class of higher-level intentional states. Here, the class of higher-level intentional states must be internally so differentiated that its internal structure can sufficiently accommodate the differences between nonlinguistic and linguistic thoughts and their commonalities.

In order to guard against the suspicion that classifying nonlinguistic intentional states as thoughts is primarily due to terminological sophistication, there is, however, a need to systematically characterize higher-level intentional states. In drawing this out, it is useful, first of all, to consider more precisely than I have so far what A-intentional states are. Above in (A1)⁷⁰ I noted that mental states are those states that dispose beings to differentially respond to their environment and to adjust in a way that precludes being steered by the being. In addition, these states have an identity (that can be articulated by the interpreter of B’s behavior), which can be explained in reference to ideas of conditions of satisfaction. As a model for A-intentional states, which I hope is intuitively illuminating, I have introduced a state that, from the perspective of the interpreter of the behavior of a small child that has not yet come to speak a language, might be called a “nonlinguistic desire.”⁷¹

If we characterize the systematic particularities of such states in Searle’s terminology, then we can say that such a state is intrinsically intentional, which ought to mean that it has an implicit relationship to its conditions of satisfaction. A being that finds itself in such a state thus eo ipso has a “consciousness” (avouched for by representational mechanisms) of the conditions of satisfaction specific to this state. This “consciousness” of the conditions of satisfaction is of course not such that a being with A-intentional states could make these conditions of satisfaction explicit. But beyond the act toward which a state disposes the being, such an A-intentional state must be displayed by a fulfilling or frustrating or surprising behavior on the part of the being that is specific to the state. And this behavior must be a result of the being’s ability to differentiate whether the conditions of satisfaction that are specific to its A-intentional state are fulfilled or not. However, it can have the ability to make this distinction only on the basis of internal representational competencies; for, because of their world-to-mind direction of fit, in the case of desires, the only possible criterion for this competence for making this distinction lies in the being in question itself.⁷² In the context of the reflections here, the form this representation of conditions of satisfaction assumes in the experience of this being can remain open; that is, it can remain open whether it assumes sensory-motor, pictorial, or another form of representation. However, we must then say, regarding the conditions for the individuation of A-intentional states, that these conditions are fulfilled by a sufficiently complex neuronal machinery that human beings contingently happen to have. The interpreter is indeed needed to describe them, but not to establish them; for a (self-)interpretation is not constitutive for the existence of A-intentional states.

To find oneself in an A-intentional state thus means to find oneself in a state that represents conditions of satisfaction. As Searle rightly noted, we can thus not sensibly speak of the meaning of such states because we cannot sensibly distinguish between the state and its content. Searle, however, assumes that this is not a particularity of a certain class of intentional states, but that it applies to intentional states in general. In Searle’s words, “Meaning exists only where there is a distinction between Intentional content and the form of its externalization, and to ask for the meaning is to ask for an Intentional content that goes with the form of externalization . . . but it makes no sense to ask for the meaning of the belief.”⁷³

If we now ask how the content of an intentional state can become the content for a being that has this state (thus how a being can reach the level of higher intentional states), then something must come into play that, for its part, is not a representation of conditions of satisfaction, that is, something that is not intrinsically intentional; for every further A-intentional state would be no more than a further representation of conditions of satisfaction and would thus not be available for an arbitrary and potentially interpretive relation to another A-intentional state. The problem consists, namely, in the fact that it is impossible to imagine that there is an A-intentional state I_a1 that typifies the conditions of satisfaction of an A-intentional state I_a2. There would simply be no criterion by which to distinguish these two A-intentional states. In other words, A-intentional states that have the same conditions of satisfaction are the same states.⁷⁴

Here we have reached a theoretical point, however, from which it is no longer possible to follow Searle’s theory of intentionality because it does not offer the theoretical resources with the help of which we can reconstruct the level of higher intentional states, and this is for the following reason: If intentional states are introduced by the concept of the conditions of satisfaction, and their identity is dependent solely on, or typified by, this concept,⁷⁵ then we cannot see how a being that only has intentional states at its disposal that are characterized in reference to their conditions of satisfaction is able to have intentional states that refer to other intentional states; every state that potentially refers to another intentional state would be a state that could manage this solely by means of its conditions of satisfaction. Every successful reference to a state I_a1, however, would have to be able to identify its conditions of satisfaction, and how would that be possible unless the reference-taking state I_a2 had the same conditions of satisfaction? Then, however (see above), I_a2 would be identical with the state I_a1. This relation can also be clarified as follows: because an A-intentional state is a representation of conditions of satisfaction R(E_i, . . . , E_j), for a representation of this state, R*(R(E_i, . . . , E_j)), it must be accepted that R* has conditions of satisfaction E_i, . . . , E_j; for the (A-intentional) identification of an A-intentional state represents its conditions of satisfaction. To counter this analysis, one might object that a suitable reconstruction of the intentional references to R(E_i, . . . , E_j) does not have the form provided above, but the following form: R*(E(R(E_i, . . . , E_j))). The conditions of satisfaction for R* would be those conditions that are fulfilled by the existence of the intentional state R(E_i, . . . , E_j), but these too are the conditions that R represents. In short, if one takes Searle’s fundamental analysis of intentionality as a basis, the attempt to reconstruct higher-level forms of intentionality leads either to an infinite regress, or a duplication of mind, or it collapses because of the concurrence of the referring intentional state and the one being addressed.⁷⁶
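The collapse of the referring state into the state referred to can be pictured with a deliberately crude sketch. It assumes, purely for illustration, that an A-intentional state is nothing over and above a set of conditions of satisfaction; the names `AState` and `represent` are hypothetical and belong neither to Searle's vocabulary nor to mine.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class AState:
    """A state individuated solely by its conditions of satisfaction."""
    conditions: frozenset


def represent(target: AState) -> AState:
    """Model of R*(R(E_i, ..., E_j)).

    Assumption: a state can pick out `target` only by way of conditions of
    satisfaction, so the only conditions it can carry are the target's own.
    """
    return AState(conditions=target.conditions)


desire = AState(frozenset({"E_i", "E_j"}))
about_desire = represent(desire)

# Equality of frozen dataclasses compares fields, i.e. the conditions only.
collapsed = about_desire == desire
print(collapsed)
```

Because the only available criterion of identity is the set of conditions, the supposed second-order state cannot be distinguished from the first-order one. Nothing in the sketch settles whether a richer vocabulary could avoid this; it merely restates the premise under which the collapse follows.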

If this analysis is correct, then it is not at all clear how Searle, with the help of the representational vocabulary, is to reconstruct those relations between intentional states that are presupposed for higher-order intentional states and thus the basis for any holistic network that is also the foundation for his view of mind.⁷⁷ In other words, a mind whose intentional states consist solely in representations of conditions of satisfaction cannot cite itself. If that is true, however, then it is also questionable how the point of Searle’s semantics can be salvaged; for how is mind supposed to be able to consciously link the conditions of satisfaction of an intentional state with those of an expression⁷⁸ if this mind is not able to refer to the conditions of satisfaction of the intentional state? With an expression an entity does indeed come into play that, for its part, is not intrinsically intentional; however, the intentionally produced relation between an expression and an intentional state presumes a second mind, which can observe the states of the first one, as the conditions of satisfaction for these states would coincide in one mind. If one would like to avoid a mind in the mind of this sort and thus an infinite regress, then it is necessary to accept the idea of a linguistic division of labor, which externalizes the second mind in the form of an interpreter. In short, in order to reconstruct higher-level intentionality, it is necessary to accept an interpretationist strategy; if we want to get a theoretical grasp of higher-level intentional states (thoughts), we must leave the biological theory of intentionality behind⁷⁹ and bring something into play that enables a being to behave toward its own intentional states in such a way that it is able to become a self-interpreter, and this can only be something that is itself not intrinsically intentional, that is perceptible, and that thus can be correlated with intentional states by external interpreters.

This, however, is initially only an outline of a theoretical strategic perspective, and it is by no means clear how one can in detail convey an understanding of the transition from a level of intentionality that can be reconstructed in biological terms to higher-level intentional states. In order to ensure that the task confronting us, namely, the task of reconstructing this transition, remains manageable, we can first of all limit ourselves to an explanation that is tailored to the case of a being B with A-intentional states that finds other beings in its environment that are potential interpreters of B’s behavior that, for their part, have higher-level intentional states. What we are looking for in such an explanation is an interactionist theory of the emergence of higher intentionality and of consciousness; on the basis of the genetic character of this theory, we can simultaneously formulate a suitability condition for this to the effect that this theory must be compatible with empirical findings about the mental development of children.⁸⁰ When I, in the following, reconstruct the process of the development of higher intentionality in four phases, I am not doing so in order to offer an empirically appropriate psychological explanation of this development; rather, I am attempting to show the plausibility of a theoretical model of this development that does not start from empirical findings but from the conceptual problems related to designing a process of development of higher intentionality. Within the framework of this model, it is sufficient to allow psychological processes to be identified that could serve as the empirical implementations of developmental steps that are postulated conceptually. So, even though the following reflections integrate more empirical material than other parts of the book, they remain committed to the attempt to solve conceptual problems. Here, besides the question of how the development of higher intentionality can be explained, a further question is brought into play: Beyond the admittedly quite formal definition of the identity of nonlinguistic intentional states, how can we explain that such states have content?

3.2.2.3. Back to the Roots

If we aim at an explanation of the transition from basic to higher intentionality for a being B—for example, a small child—under the premise that it interacts with interpreters who already have fully developed intentionality at their disposal, and under these presuppositions, a form of the social “division of labor” is made the basis of higher-level forms of intentionality, then we can initially assume that there is a B-extrinsic mind that can draw on the basic intentional states of B. We then are confronted with the task of explaining how B is able to internalize these references and to achieve them herself independently of the interpreter I.

However, because an interpreter only has reference to the A-intentional states of B via the behavior of B, the center of gravity of such an explanation must be behaviors or (proto-)actions.⁸¹ For an interpreter, only such observable activities can constitute an interface for interpretations that employ the vocabulary of folk psychology that the interpreter, because of her higher-level intentionality, has at her disposal. If we start with this intuition, then the following (not particularly dramatic) assumptions are at the basis of our explanation of the development of higher-level intentionality:

(A)

a. A being B, which in the framework of a social interaction process is supposed to be able to establish a higher-level form of intentionality, must be able to perceive its environment with differentiation, to remember, and to learn.

b. At the beginning of this process, B must have (in the last analysis biologically explainable) A-intentional states at its disposal.

c. A-intentional states dispose a being that has them to activities that can be observed by others (proto-action).

d. If the interpreting observers of the proto-action already have higher-level intentionality at their disposal, they can provide intentionalistic explanations of the proto-action of B. (To speak with Dennett, they assume an intentional stance with a view to B.)

Having presumed this, three different theoretical options are available for the construction of the first phase of development, depending on which sort of mental states we take as the starting point for the interpretation process and depending on which sort of mental states we allow to have a paradigmatic character within the framework of the explanation.

The first option, which perhaps I have appeared to favor thus far because of my insistence on the existence of A-intentional states, consists in making, primarily, the ascription of voluntary states the starting point of the analysis; these indeed have paradigmatic status for the characterization of this basic level of intentionality. According to this perspective, in the view of the interpreter, the being B is primarily a being that has (proto-)desires and (proto-)intentions, whereby the interpreter views the proto-actions of B (for example, her pointing action) as a means that serves to bring about voluntary A-intentional states. If we accept this option, the basis of our explanation becomes those interactions in which the interpreter interprets the behavior of B primarily according to the following pattern: B wantsᵢ X.⁸² Or: B wishesᵢ that X were the case.

A second option arises if the activities caused by A-intentional states are not understood primarily as instruments of desire fulfillment, but as means for representing matters of fact. The interpreter understands A-intentional states in this case as epistemic states, as protobeliefs. If we accept this option, then we make those interactions the basis of our explanation in which the interpreter primarily interprets B’s behavior, understood as demonstrative, in accord with the following pattern: B believesᵢ that X is there. Or: B knowsᵢ that this is X.

The third and final option provides the possibility of assuming, at or beside the level of A-intentional states, that B primarily expresses itself expressively, and its activities are to be understood as an expression of basic emotional states, that is, as based on affects. In this case, the interpreter of B’s behavior would primarily interpret it in accord with the following pattern: B feels X. Or: B feels X-like.

One could, of course, object to this division, noting that there is no reason at all not to presume that for the real interpretation of B’s behavior all of these options are employed. From a theoretical perspective, however, the concern is to distinguish which option is suited to serve as an adequate starting point for a theory of the development of higher intentionality; with a view to this question, it appears clear to me that most of the established theories here in fact make a decision that is closely related to their respective conceptions of the intentional. With a view to the earlier discussed models of Searle and Davidson, it is clear that Searle’s model suggests an orientation toward voluntary states, and the orientation of Davidson’s theory is based on epistemic states. In Searle, the intention of the mind to place intrinsically nonintentional states or activities under the conditions of satisfaction of intrinsically intentional states serves as the basic model for the development of inferred (and thus higher) intentionality. For Davidson, in contrast, the development of higher intentionality ultimately aims at truth.⁸³

In light of these options, let us once again call to mind the task that we are faced with: in the explanation of concern here, it is first of all important to show how mental states can acquire content; second, however, we are concerned with how the content can become content for those beings that have these states. Searle’s reflections provided theoretical means with the help of which the first part of the task can be handled insofar as A-intentional states can be analyzed in reference to their conditions of satisfaction, but his view lacks the theoretical resources to handle the second part of the task.

If we attempt to identify the theoretical means with the help of which Davidson attempts to manage the second part of the task, then we must assume that Davidson views the model of triangulation as the model situation that should be able to yield an explanation of higher-level intentionality. For with a view to the issue of how a being can behave toward the content of its mental states, Davidson claims this is only explainable if it is possible for a being, in communication with an interpreter and in the presence of objective perceptible states of affairs in the world, to generate correlations between (linguistic) reactions and externally caused stimuli. Because “a creature cannot have thoughts unless it is an interpreter of the speech of another,”⁸⁴ a situation is needed in which this being can correlate three classes of states: if a thought is a mental state “with a specifiable content,”⁸⁵ then in order to identify the content, a second being is needed that has the faculty of perception and that, by innate similarity responses to a class of stimulus patterns, contributes to the identification of the cause of a thought by reacting to this cause in the same way as the being that is to be interpreted. Triangulation is supposed to secure the fulfillment of those conditions that enable the identification of the normal cause of a (truth-apt) thought. However, a problem with triangulation then appears to be that the ability to think is linked to the ability to understand oneself as the apex of a triangle that includes the classes of similarity responses of two beings and an observable stimulus pattern. This, however, raises the question of whether, with a view to the second part of our task, anything was gained at all; for it is questionable whether the ability to be an interpreter of another being does not presume precisely the higher intentionality that we want to explain.

If the triangular situation is to be an explanatory model for how a being can become an interpreter of the behavior of another being, then it is not clear how a being can understand itself as the apex of a triangle without already having a complex self-description at its disposal. And even if this consequence is too strong, the problem remains that the child in the triangulation must assume that the (linguistic) reaction of the parents to that event in which the two private perspectives converge articulates thoughts. Then, however, the intentionalist vocabulary is already in play; far beyond the assumption that human beings, as members of a species, share perceptual and basic classification patterns, we must assume that children are equipped with competencies of mind reading.⁸⁶ However, on the other hand, without these assumptions, which are hardly able to be reconciled with Davidson’s philosophy of mind, I do not see how the theory of triangulation can explain how children acquire those competencies that allow them to become interpreters of the behavior of adults since the first concept that the children would have to have at their disposal would be that of a thought.

In contrast to Davidson’s externalism, I thus would now like to suggest an externalism that does not emphasize the objectivity of the cause of the stimulus that is perceived by the child and the teacher, and, thus accompanying this, the epistemic character of those intentional states that play a paradigmatic role for the development of higher intentionality; this externalism instead argues that the situation of triangulation has assumptions that can be analyzed in a dyadic relation in which, first of all, emotional and then voluntative states acquire paradigmatic character.⁸⁷ I would like to develop a multitiered externalism of this sort in the form of a four-phase reconstruction; it sets out in agreement with some results of modern infant research—that emotional states are the starting points of affect-centered communication—and it only introduces voluntative and epistemological states in connection with the structures developed in this first phase. They can, in a manner of speaking, only become the material for the development of higher intentionality if these structures are presumed.⁸⁸ Although the theory suggested here is inspired by the results of empirical research, it is important to me to explicitly emphasize that the following reflections are indebted to the development of a reconstructive model and do not claim to be empirically adequate in detail. As a whole, the phase model is a strongly schematized suggestion, which does not claim to do justice to the complexity of the development of thought. An important goal would be reached if, with the help of the sequence of phases, it might be comprehensible how it is possible, under certain social conditions, that beings that only have a basic form of intentionality at their disposal develop into beings that can think.

PHASE 1: AFFECT-CENTERED COMMUNICATION

In the first phase I assume that small children have the fixed genetic disposition to express their basic and inborn⁸⁹ affects with a mimetic behavior that is solidly connected to these affects. With a view to the assumptions mentioned under (A), this means that (A) assumes the following broadened form:

(A′) (a)–(d) hold; and

e. B possesses inborn mechanisms that dispose B to react to situations that exhibit certain characteristics with basic emotional reactions, i.e., with affects and specific forms of behavior.⁹⁰

The fact that a child finds itself in an emotional state does not mean—not even under the assumption that it is disposed to an expressive behavior—that it has access to this state in some way that exceeds its mere phenomenological experience. Under this assumption, the child has at its disposal neither the ability to draw on the emotional state with the help of mental states nor the ability to change its states by internal means. Starting from this stage, my thesis now is that a child only gains access to its emotional states by means of interpreters, mostly initially women, who react to its emotional expressive behavior with mimetic responses. These responses first enable the child to correlate its experience of the affect with something externally perceivable; what the child perceives is not just any social reaction, but a reaction that expresses precisely what the child emotionally senses. In short, by means of her mimetic response, the mother mirrors⁹¹ the child’s emotional state.

Admittedly, it must be presumed that these mirroring mimetic responses are not mimetic articulations that the child misunderstandsᵢ as expressions of the emotional situation of its counterpart. This is especially necessary since the “regulation of the childish homeostasis and brain processes [is] extremely environmentally and object dependent, and indeed already at a pre-representational ‘pure’ physiological niveau,”⁹² and children, because of this openness, would be brought into the emotional states that they are confronted with in the mimetic expression. If the mirroring expression of emotions was concerned with reproducing the mimetic expression of the child as precisely as possible, then the affects articulated by the child would be reinforced anyway, also in the case of negative affects. It is thus important that the expression “mirroring” is not to be understood all too literally, and that the process it refers to differs from a copying function in the following respects:

1. The mimetic responsive behavior of the reference person does not occur in every case in which an emotional expression is articulated, but only (sufficiently) often.

2. The mimetic responsive behavior of the reference person does not copy the expressive behavior of the child exactly; rather, it is characterized by an exaggerated gesture, which marks the responsive behavior of the reference person and allows it to be differentiated from her own emotional situation.

According to Watson and Gergely, the two characteristics—that the occurrences are frequent but not lawlike, and that the responsive behavior is marked by exaggeration—now play a central role for the differentiation of the internal states of the child; the regular occurrence of the external indicators makes it possible to identify, with some differentiation, internal states that accompany the occurrence of the affect. Drawing an analogy to the technique of biofeedback—a procedure by which people learn to influence bodily states that usually evade voluntary control (e.g., blood pressure and skin resistance) by making these states perceptible to them through a sensory apparatus such as auxiliary tones—Watson has suggested a fundamental learning mechanism that enables children, on the basis of the above-described social feedback mechanism, to perceive their inner states. At the center of this learning model is a mechanism for detecting conditionality, which is triggered especially by frequently, but not always, occurring event–event relations—thus, for example, relations like those we can expect between a child and its reference person. Watson postulates now that this mechanism is oriented to search for those states or activities of the child that maximize the likelihood of a given external event.⁹³

Under this presupposition, in the phase of affect-centered communication, a first fundamental step of this internal differentiation occurs; this happens as sufficiently frequent incidents of externally marked, mimetic events enable the mechanism for analyzing conditionality to identify, with sufficient reliability, precisely those internal states that allow the external mimetic incidents to occur. Here, with a view to the inner states, this identification is carried out completely at the level of perceptions; for the occurrences of the marked facial expressions of the parents that correlate to the mimetic expression of the child are just as perceptible by the senses as the inner states and processes that precede the parental facial expressions and can be proprioceptively perceived. In short, sufficiently reliably occurring social responses to the inborn expression of emotional states enable the child to identify those proprioceptively perceptible internal states and processes that are specific to the emotional state. At the same time, however, due to the marking, the “mirroring” acquires an affect-regulating function insofar as—after the analysis of the conditionality—it provides the child, inter alia, with the possibility of gaining a certain control over the facial expression of the reference person, a control that is judged positively and that all too happily leads to a reciprocating smile.⁹⁴

The starting point of this first phase is, first of all, a sufficiently stable correlation between (predominantly mimetic) forms of behavior that both the child and the adult are able and (partially) disposed to engage in.⁹⁵ However, in connection with the sensory identification of inner states, this correlation is expanded for the child so that it learns, when the inner states belonging to the affective states occur, to expect the corresponding parental facial expression. For if we can assume that the child has a basic capacity to learn, we can also assume that, when the correlations between its experience and the marked parental responsive behaviors stabilize, the child comes to expect the responsive behavior when it experiences the affects. The way in which these expectations are cognitively achieved—i.e., whether the child forms motoric or visual representations of the responsive behavior—can remain an open question; for the competence to form expectations, we do not presuppose a capability beyond what we expect for higher animals and that we could also describe with the concepts of operant conditioning or within the framework of connectionist models. However, the two elements introduced—i.e., of expectations and representations of the inner causes of the external responsive behavior—are of central importance for further developments. For, on the one hand, these elements can be understood as models for states that can be analyzed in terms of conditions of satisfaction, that is, as models for A-intentional states (see phase 2). On the other hand, under the assumptions that the responsive behavior is engaged in by both communication partners and that the communication can be analyzed in terms of effects, it may be possible to expand the means of communication in line with these effects.

In contrast to most externalist theories of mind, the view that is portrayed in the first phase of the four-phase model does not presume that the original material with the help of which the reference of mind to its states is to be organized is made available to the child by linguistically competent adults or by training; rather, I assume that the child itself brings the initial material to the interaction situation, that is, the types of behavior that serve as means of communication. Here, the connection between the expressive behavior of the child and what the behavior expresses is generated not by an interpretation, but by a genetically anchored disposition. However, with the help of external social responses, the child first learns to perceive what its expressive behavior articulates, which the child is disposed to effectuate; in the course of forming expectations, the internalization of this makes a rudimentary form of self-reference possible, which would be much more fragile if the noted constraints did not underlie the expressive means. As limited as these means are, they form the basis for an increasingly plastic and increasingly rich inventory of means of communication.

Figure 3.1 The development of intentionality: Phase 1

Before I go into these reflections further in the next phase, the scheme of the relations of phase 1 in figure 3.1 should make them more vivid.

PHASE 2: INSTRUMENTAL COMMUNICATION

In the second phase we can broaden the relations that were established in phase 1 such that A-intentional states are integrated into communication. Here we have, in turn, two options. We can assume (like Searle, Dretske, and Millikan) that A-intentional states are biological phenomena, which can be assumed to exist among higher forms of life and can be explained. If we do this, we must then only concern ourselves with integrating them into the already established communicative structures. The second option, in contrast, would consist in placing the existence of A-intentional states in a genetic relationship with the affect-centered communication of phase 1. However, the model that I would like to introduce here is tolerant toward both of these options, and it thus need not make decisions in advance about whether the basic communication of phase 1 is a prerequisite for A-intentionality or not. For there is no reason not to give A-intentional states within the model the same position that the basic affects have in phase 1, as causes of certain forms of behavior of the child. Within the framework of the phase model, we can assume here that the pattern of generating expectations that is established in phase 1 makes it significantly easier to integrate A-intentional states into communication.

If we take the above-provided characteristics of A-intentional states as a basis,⁹⁶ then it is clear that we must assume, as in the case of affects, that these states dispose the child to certain forms of behavior. In the case of protodesires, that would be toward forms of behavior that try to bring about states of affairs in the world that satisfy the protodesire. Here we do not assume that the child has access to the content of this protodesire that goes beyond a specific evaluation pattern positively characterizing that state of affairs in the world that constitutes the conditions of satisfaction of the protodesire and negatively assessing a state of affairs in the world that deviates from it. Because, by nature, the means available to small children to fulfill their protodesires are quite limited—the desired things are out of their reach, etc.—there will be numerous activities that fail to fulfill the desires and that will be emotionally assessed correspondingly. If adult reference persons interpret complexes of instrumental activities and emotional evaluations not as resulting solely from their own emotional sympathy, but—because of their folk-psychological competencies—view them as attempts of the child to fulfill a (proto-)desire and to articulate when it has failed to do so, then they can try to bring about the state of affairs that they think is the condition of satisfaction for the nonlinguistic desire of the child. If adults do this correctly often enough, then the activity of the child that is steered by A-intentionality becomes, in the view of the adults, a reliable indicator of the A-intentional state.
However, if, in connection with a child’s behavior, an adult frequently enough establishes the conditions of satisfaction for the supposed desire, then the reaction pattern of the adult provides empirical material for the child’s analysis of conditionality, which is related to the material produced by the parents’ mirroring behavior insofar as it allows the child to isolate sufficient quasi-symbolic forms of behavior. Let us assume, for example, that a child wants to be held, that is, it has a protodesire that cannot be fulfilled without the active help of the adult. Then an early form of behavior that is caused by this nonlinguistic desire might consist in an attempt to climb onto the adult, and the articulation of not having achieved this goal might consist in whining. If the adult interprets this behavior to be the result of the aforementioned desire, then with sufficient frequency (though naturally not always), she will do precisely what the child, according to her interpretation, wants. Assuming this, the child’s analysis of the conditionality will be able to identify a leaner form of behavior that is sufficient for the fulfillment of the desire, for example, making noise and stretching out its arms toward the adult. If now this behavior leads the reference person to do what the child wants with sufficient frequency, then the mental representation of this observable implementing action (a mental representation that is anticipated in the course of expectation) indicates precisely the content of the A-intentional state that the child finds itself in. In other words, the mental anticipation of the implementing action represents the conditions of satisfaction of the A-intentional state.⁹⁷

This, however, does not at all completely describe the specifics of phase 2; for, besides the integration of A-intentional states, a further productive problem arises in our model. In contrast to phase 1—in which essentially only the five basic affects serve as causes of the childlike forms of behavior and, via sympathy, the parental reactions—with A-intentional states diverse causes for childlike behavior are added. For the A-intentional states are at least as multifaceted as the things that the child would like to have and the states or events that it would like to bring about. However, this diversity is what makes it necessary that the expressive behavior of a child should be correspondingly sophisticated if the chances for satisfying the A-intentional states are to keep step with the differentiations. However, in the example above, the strategy with the help of which it becomes possible to articulate more differentiated A-intentional states is succinctly sketched out. If we assume that there are numerous situations in which the child remains dependent on the implementing action of adults, then the child’s analysis in cases like wanting to have x, wanting to have y, etc., leads to a reduction of the variety of forms of behavior. There ends up being only one, which has the same outcome in all of these cases: it consists, namely, in arousing the attention of the adult and pointing to x or y.

With pointing behavior, the child has found a form of articulation that stands at the threshold of medial forms of articulation; the practice of pointing can be described as the use of a medium whose medial configurations are composed of a type of arm/hand movement and directions; here, what a token of such a protomedial form of behavior refers to covaries depending on the direction in which the child points. In this way, the articulation is differentiated at the same time that the application context is universalized.

If the mechanisms of phase 2 are established, interpreters can understand the behavior of children relative to their A-intentional states as choices of expressive means, which if performed under the given social conditions, makes it more probable that these states will be fulfilled. But the children do not yet choose these means on the basis of reflection about adequacy, which would imply a second-order relation to the A-intentional states: they have rather a disposition to use an expressive resource, which they have learned to partially control. Now, how can the decisive transition to a higher level be reconstructed? Thus far we have only prepared one element of this higher level, namely, that of socially conveyed representational access to one’s own intentional states. A further attribute of the higher level—namely, the ability to autonomously individuate intentional states—marks the attainment of higher intentionality; this will be introduced in the next phase. Figure 3.2 makes the essential structures of the (earlier) phase 2 somewhat more vivid.

Figure 3.2 The development of intentionality: Phase 2

PHASE 3: MEDIAL COMMUNICATION OR THE DEVELOPMENT OF B-INTENTIONALITY

In the phase of instrumental communication in essence the expressive behavior of children is oriented toward achieving states of affairs in the world that, from the perspective of A-intentional states, have the status of conditions of satisfaction; in contrast, a basic characteristic of phase 3 consists, among other things, in the fact that the communicative acts have conditions of satisfaction that, for their part, are communicative acts. Merely in order to have a name, I initially call this type of intentional state “B-intentional.” What is specific for these states ought to become clear as we pass through phase 3. From the perspective of communication acts that are oriented toward communicative responses, those addressed by such medial acts are not those who implement the states of affairs in the world that A-intentional states are oriented to bringing about; rather, they are beings whose expressive (medial) reactions constitute the conditions of satisfaction for B-intentional states. In a certain respect, phase 3 is concerned with a communication for communication’s sake; here elements of phase 1 will be further developed and elements of phase 4 will be prepared, which, for their part, can be understood as further developments of pointing. However, before this will be comprehensible, an analysis, carried out in several small steps, needs to show how this is possible.

In the first phase, with the help of affect “mirroring,” an initially external representation, and later an internal representation (via expectations), of internal states is developed with the help of which a first step beyond a mere phenomenal experience is possible. In contrast, in the reconstruction of phase 2, we have laid open a self-controlled form of articulation, even if the states that are articulated are not under the control of the child. Under these assumptions, the characteristic of phase 3 can initially be described as a new combination of these elements; here the ability to control expressive resources is not in the service of A-intentional states, but in the service of articulating emotional states, an articulation that is successful if the expression is shared by the addressee. While the basis of phase 2 is constituted by the correlation between childlike expressions and parental reactions that are actions for implementing A-intentional states, the B-intentional states of phase 3 are oriented to those reactions of the parents that, in phase 2, appear as epiphenomena of the interaction. On the one hand, of course, these reactions are linguistic (see phase 4); on the other hand, however, they are also mimetic, gestural, prosodic, onomatopoeic, etc., and they accompany many of the occurring or absent implementing actions. Unlike the reactions that are linguistic in a narrower sense, these articulations primarily aim to produce effects in the children. 
They are not supposed to be understood like linguistic expressions, but should structure and modify the experiences of children by accompanying aspects of certain situations with medial expressions—for example, by accompanying the lifting of the child with the expression “uha!” or “uupsa!” or by noting when the baby is tired, “ve-e-e-ry ti-i-i-red.” What is interesting about these expressions is not that they—like the last one—can have a linguistic meaning; it is rather the effect of the expression on the child.⁹⁸ With such expressions the parents in a certain respect interpret the experiences of their children by producing medial expressions whose effects on themselves fit the experience imputed to the children. If we assume that parents perform these expressions—which are formed with the help of (ad hoc) media—with sufficient frequency in the appropriate situations, then we can assume that the children associate types of experiences with types of expressions, that is, they connect the medial expressions with those experiences that have occurred in the contexts in which the medial expressions have been introduced.

A further decisive step for the development of B-intentional states can be made plausible if we assume that children have the natural disposition to imitate the behavior of adults and thereby especially attempt to imitate their medial expressions. If this is the case, then children and their parents exchange the roles of producers and recipients, and the children can expect their performative acts to elicit the symptoms of those effects from their recipients that they normally experience if they perceive the corresponding expressions.

If one takes into consideration the structures of the communication situation that are possible under the mentioned assumptions, then it is striking that the proposed reconstruction exhibits a series of basic parallels to Dewey’s reconstruction of aesthetic communication.⁹⁹ As noted, Dewey characterized the specificities of aesthetic communication situations by, on the one hand, labeling the aesthetic production a form of articulation whereby producers try to anticipate its effects on the recipients; on the other hand, he described the reception of a (nonlinguistic) articulation against the background of communicative intentions attributed to the producer. Above all, Dewey’s analysis of the producer’s perspective can be of use to us in reference to a nonlinguistic communication situation between adults and children who do not yet have thoughts; for our scenario indeed allows that the adults play the role of producers with fully developed intentionality. In conformity with the above assumptions, let us suppose the following situation. In the eyes of the adults, the child is making an uneasy, frightened impression. In the face of this, the adult sings the child something calming. Here the adult anticipates the effects that singing in one way or another (quiet, soft articulation, slow tempo, repetitive structure, somewhat ritardando) will (hopefully) have on the child, and she anticipates this reaction by viewing her own standard reaction to singing in these forms as the model. The adult thus uses means that she has available because of her socialization in a musical culture in the attempt to produce the effects in her recipient that these means normally produce in herself.

In a further, but not necessarily temporally later, step, from the perspective of aesthetic production, we can now view the child as the producer: here we of course are not allowed to assume the higher-level intentionality that is to be developed. Under this constraint, there is no problem, however, in assuming that children observe the effects their imitating or spontaneous nonlinguistic articulations have on the adults and correlate those effects—in the case of imitation—with the effects that the parental articulations commonly have on themselves or—in the case of spontaneous articulations—with the effects that articulations have on them.

Even if such interactions may initially be fragile, mechanisms for stabilizing communication in this phase, in which communication is centered on expected effects, can be introduced in the form of two interaction patterns that suggest themselves and that only slightly add to the assumptions required; here, however, they are neutral with a view to the mental competencies that we assume the child has. For, first, we can assume that the adults maintain relatively stable reactions to the children’s articulations. Here we can either assume that the reactions of the adults are relatively stable because of their own socialization history or we can assume that the adults artificially stabilize their reactions in order to improve the chances that the child will be able to form expectations about the reactions.¹⁰⁰ Irrespective of how the relative constancy of the reaction comes about, it forms the basis on which the children can establish stable expectations—connected with internal perceptions—regarding the effects of performances on others. Second, however, we can also expect that over time the adults increasingly link their reactions¹⁰¹ to conditions for conformity such that their reactions are reinforced if the children make use of means of articulation that the adults (with a lot of goodwill) are able to identify as instantiations of sequences of medial elements of those media that they also make use of in their articulating behavior. If one assumes that children have an interest_i in these reactions (insofar as they confirm their expectations), then we can expect that they optimize their articulations with a view to the benefits of the reactions and in doing so increasingly fulfill the standards that regulate the articulation practices of the adults.¹⁰²

The basic idea of the third phase—with the help of which it is to be explained how children develop the ability to consciously individuate intentional states and in doing so reach a first level of higher intentionality—is to enable them, within a responsive social framework, to become producers of the kind of medial expressions that are directed toward them. Of course, to explain the development of those competencies that children must have available to them as medial producers, we cannot take recourse in the explicit forms by which such knowledge is conveyed; this would assume that the children would already have to have competencies of mature interpreters at their disposal. If we want to avoid this assumption, then we have to assume that children, on the one hand, imitate the medial expressions of their parents, but that, on the other hand, they bring about spontaneous variations in the process of doing so. If, for example, the repetition of the childlike performance by the adult, or other observable “interpretive” reactions such as gestures, is connected with such expressions under the described social conditions, but as regular reactions, then the children can link the performance of types of articulating acts with the expectation that the types of articulating acts with which the parents react will occur.

Following the A-intentional states, B-intentional states are oriented toward external reactions, but they do not aim at those reactions that lead to changes in the world. Rather—following affect-centered communication—they are aimed at changes related to the expressive behavior of the parents. The conditions of satisfaction of B-intentional states are not states of affairs in the world in general; rather, they consist initially in the expressive behavior of the recipient, caused by the expression. With this unspectacular step a new level of development is reached insofar as, above all, expressions now relate to expressions. In this type of communication situation, a relation is thereby achieved at the level of expressions that, at the level of mental states, constitutes a necessary characteristic of our concept of a higher-order mental state: namely, the relationship of one mental state to another mental state. With a view to the existence of thoughts, it is of decisive importance that this relationship exists within one mind. But in any case, herewith and with a division of labor, we have achieved by means of the proposed reconstruction a form of the relation of expressions to expressions and indeed of expressions that are caused by mental states.

In two concluding steps, I would now like to investigate how these constraints can be sublated in phase 3 so that it is possible for the child to relate a mental state that is the cause of an expression to another mental state, which anticipates—as an expectation—the effect of the expression. To do this, on the one hand, the competence of the child to produce medial expressions, not only spontaneously or in imitation, but also “systematically,” must be developed (this will be discussed in [a] below); on the other hand, so must the ability to inhibit the performance of the expressions (see [b]).

a. In regard to an increasingly orderly production of medial expressions, we can assume that, at the beginning, the parents secure a correlation between spontaneous childlike expressions and their own expressions by imitating the expressions of the child. Little by little, by imitation, the child takes elements of the parental expressions into its repertoire so that at a certain point it approximates the medial expressions. If the parents now make their own responsive practices increasingly dependent on the child expressing itself in a way that conforms to the internal structures of the parental media, and here increasingly also make use of responses that conform to their own media standards, then we can expect the mechanism of conditionality analysis to isolate elements of the childlike articulation practice that more frequently lead to a responsive behavior.

We can thus assume that the ability of the child to construct expressions by combining elements that result in rewarding reactions from the parents is conveyed in the course of analyzing the practical consequences of communicative action until the competencies of reproducing the internal structure have been passed over to the child. With these reflections we have gained a further element for the reconstruction of (nonlinguistic) higher intentionality, namely, stabilized relations between standardized types of articulation and reactions, which form the basis for expectations that connect with inner states and acts of articulation. These expectations are, first of all, primarily related to the behavior of communication partners and not to third parties outside of this dyadic relation. Relations of this sort form the basis for the development of higher intentionality insofar as, in the process of forming expectations, the expressive tokens obtain satisfying conditions, which they acquire solely through and in a communicative process. Unlike Searle, we thus do not need to assume that mind “intentionally” assigns nonintentional phenomena, such as expressions, conditions of satisfaction of intrinsic-intentional states; rather, we can assume that expressions obtain conditions of satisfaction in the form of expected reactions.

If a child can, with variation, actualize combinations of elements of a medium and can form expectations about the reactions that occur among the recipients, it is able to individuate simple B-intentional states. These states have an identity insofar as there are alternatives to them that can be generated by varying medial elements or parameters; they therefore have a compositional identity. From here, it is only a further step to inhibiting the performance, and in doing so, to having an imaginative anticipation of the effects at one’s disposal so as to think a nonlinguistic thought.

b. How it is possible that children learn to inhibit expressions and to perform them in foro interno is a question that cannot even begin to be adequately addressed here; it is in the last analysis an empirical question. In the context of this model, I thus will limit myself to showing that it is plausible that an explanation of this ability need not bring intentional states into play that go beyond the level of B-intentional states. In the end, I thus only attempt to show the plausibility that thoughts, which solely exhibit compositional identity, are not dependent on propositional states to be actualized, and consequently that already at the level of B-intentional states all of the general characteristics of thoughts can be satisfied. The explanation that we are seeking must provide the transition, for example, from the ability to sing something to the ability to merely imagine singing it. If we observe how children learn to read silently, we can assume that this ability is simply acquired as the steering of an activity that was initially only able to be carried out publicly becomes more encompassing in the sense that control over the execution of the activity includes, little by little, the repression of the motoric expression. However, the increasing independence of the child from external addressees is more important than this increase in motoric control: while at the beginning of phase 3 we assumed that, by means of the reactions of the recipients of its expression, the child develops the ability to relate to the effects that the expression has on it, we can now assume that, in the process of forming expectations, step by step the child internalizes the views of external recipients and in this way can correlate medial expressions with the effects they have on it, effects it learned to isolate with the help of the recipients’ reactions.

To what degree does the ability to think nonlinguistic thoughts indicate a form of higher-level intentionality? An initial indication that nonlinguistically individuated thoughts possess a new status is the fact that we cannot characterize these mental states as intrinsically intentional; they do not have their content as intrinsic conditions of satisfaction, but because they create a disposition to behavior that, under certain social conditions, can structure the perception and the experience of the recipients. They do not acquire intentionality because mind transfers the conditions of satisfaction of an A-intentional state to them; instead, they have content that is dependent on a social context because specific medial configurations trigger certain partially observable effects in the recipients (including the producers), effects that, in the process of forming expectations, are linked to the performance of the configurations. To the degree that the child, with the help of reactions to its spontaneous or imitating articulations, manages—with conditionality analysis—to isolate those internal states that are sufficient or necessary conditions for the occurrence of these reactions, its articulations acquire content that is initially determined largely in dependence on the reaction of the recipients. And to the degree to which the child can anticipate this reaction, and thereby can internalize the external observer, the content of the B-intentional state is determined by the experience of the performance of that expression toward which the state is disposed. This content, however, is content for the child; it is accessible to the child insofar as it is able to arbitrarily bring about medial expressions. 
It can bring about actions that can be generated on the basis of clear, easy-to-remember alternatives and that can be described as productive selective actions insofar as the expressive actions synthesize something whose content cannot be individuated by other means than the identification of a medial configuration. As a medial expression, this provides an offer for structuring the perception of the recipients and replicating it (internally); the recipient is in a position to do these things because of her own medial competence. Herewith it is possible to communicate medially individuated configurations: one and the same B-intentional thought can be thought by different persons just as two persons can have one and the same belief. Unlike the content of a belief, however, its content is as individual as the experience of the medial expression.

The fundamental pattern that is exposed in phase 3 also remains basic for advanced forms of medial communication, which exist in their most developed form as art, even if a part of the sophistication of media and of medial strategies is not conceivable without the occurrence of language. For even the developed medial communication remains related to a level of the differentiation and the structuring of perception and experience, which is developed in phase 3 for the first time. As states with a compositional identity within the scope of possibilities of a medium, B-intentional states are states that could not be individuated by the beings that have them without the appropriate media, nor could those beings understand them within the framework of the reception of medial expressions—that is, replicate them internally—without the appropriate media (see figure 3.3).¹⁰³

Figure 3.3 The development of intentionality: Phase 3

PHASE 4: LINGUISTIC COMMUNICATION OR THE DEVELOPMENT OF C-INTENTIONALITY

With phase 4 we finally reach the level of intentional states whose content has a propositional form; that is, we reach the level of those states whose identity and content can be provided by means of the standard interpretationist theory with recourse to their position in an inferentially structured network. Here it is clear that inferential networks are constituted by certain normative relations, which, in the final analysis, are based on the truth-aptness of many elements of such networks. In the context of these reflections I will not go further into whether the specifics of these C-intentional states ought rather to be described with a Davidsonian or a Brandomian vocabulary. Within the framework of the four-phase model, I can limit myself to explaining how the level of C-intentionality behaves in a genetic respect to the preceding forms of intentionality.

If we explain the specifics of B-intentional states at a fundamental level by maintaining that they are states that have a compositional identity, then C-intentional states can be understood as a level at which medial expressions (or B-intentional states) are linked in a network of normative relations, and indeed the type of normative relations that a radical interpreter assumes if she correlates expressions of a being with states of affairs and other expressions of this being. However, herewith something comes into play at the level of propositional attitudes that opens the primarily dyadic communication of level 3 to the world and thus to objectivity. Relations among medial expressions and experiences of the producers and recipients of the expressions are no longer in the foreground; instead, relations between medial expressions and states of affairs in the world are. While I could take recourse in the communicative structures of affect-centered communication in order to introduce the B-intentional states, the development of propositional states can be explained, on the one hand, with recourse to structures of instrumental communication, on the other hand, however, also with recourse to the mechanisms of medial communication. For basic forms of linguistic communication can be described against the background of the preceding phases as forms of communication in which the reference to the world—which comes to expression at the level of instrumental communication through pointing—is differentiated and explicitly articulated with the medial means developed in phase 3.

The basic idea for the characterization of linguistic communication thus ought to be that some of the introduced medial expressions acquire new roles that are increasingly subject to normative restrictions in the course of the development. Hereby the normatively determined roles acquire a dominant status; this is expressed in the fact that the compositional identity is subordinated to the interpretability of linguistic expressions. Insofar, namely, as an interpreter of linguistic expressions must assume that the speaker fulfills the demands of minimal rationality, she places the expressions in a normative context in which they can play the role of declarations, desires, commands, etc. While the compositional identity of nonlinguistic expressions remains related to experience, in linguistic utterances it comes to the service of the content-avouching power of normative relations that must therefore also come to the fore in the explanation.

If we first of all begin by viewing language as a medium in the sense of phase 3 and, accordingly, viewing linguistic communication as a form of medial communication, then the difference between B- and C-intentional states can be made clear—or each of them, with their related type of expression, can be made clear—if we call to mind a communicative use of language that is caused by B-intentional states. Hereby we would, so to say, assume that, in this linguistic communication, people speak in the same way that they make music in musical communication. A use of language of this sort (which to our ears sounds “inauthentic”) can be clarified in an exemplary way in reference to onomatopoeic linguistic usages, which appear to be enclaves of medial communication from phase 3. For these usages are oriented on experience that is generated by the pronunciation or hearing of onomatopoeic statements by a speaker or hearer, an experience that exhibits structural perceptual similarities with the experience to which the expression is a reaction.¹⁰⁴ However, as a rule precisely these relations between attributes of linguistic expressions that are amenable to sense experience and those of expression situations (Äußerungssituationen) are of subordinate importance for the specific possibilities of language; they serve—as in the example of prosody—rather to lessen the ambiguity of meanings that are already in play independently of these relations.¹⁰⁵

Unlike the contents of B-intentional states and of the expressions caused by them, linguistic utterances have content that is prototypically not determined in the relation between the expression and the sense experience of the expression (or the sense experience of the social consequences of the expression), but in other relations. The central characteristic of linguistic utterances, the fact that they can play a role in the game of giving and asking for reasons, is based rather on a relation that is accessible to all participants of linguistic communication, namely a relation between expressive actions (Äußerungshandlungen) and intersubjectively accessible expressive circumstances (Äußerungsumständen). To the degree that the expressive actions are assumed to meet conditions of appropriateness, whose fulfillment can be intersubjectively examined, expressions are subject to normative demands; this is fundamental for their ability to articulate reasons.

If we place these reflections in the context of a genetic perspective, then we can proceed by linking the reference to states of affairs in the world, which already exists at the level of A-intentional states, with medial expressions so that following up the pointing actions of phase 2, the medial expressions are linked with social fulfillment circumstances (soziale Erfüllungsumstände). If we thus assume a child that can perform the pointing actions of phase 2 and the medial expressions of phase 3, and further assume that among its A-intentional states there are certainly some, perhaps many, the conditions of satisfaction of which the child itself cannot actualize, then we obtain an interface for expanding the inventory of pointing actions if we assume that the social environment of the child treats some of its medial expressions as substitutes for pointing actions, for example, according to the following pattern: on the one hand, the parental interpreters establish stable relations between types of expressive actions and actions that achieve the putative conditions of satisfaction of the child’s A-intentional states; on the other hand, they establish relations between types of expressive actions and actions that do not achieve the putative conditions of satisfaction of the child’s A-intentional states. Here the types of expressive action employed by the parents are constituted such that the child can instantiate these types of action by itself. 
If these prerequisites are fulfilled, the child will develop the disposition to perform an expressive action that instantiates the type of expression that correlates with the implementing action if the child finds itself in an A-intentional state with conditions of satisfaction that it cannot fulfill by itself, but that it can imagine in the form of the expected implementing action.¹⁰⁶ With the parental expressive actions, however, at the same time the pool of expectable reactive actions (Reaktionshandlungen) is broadened so that types of parental expressive action can come to indicate types of parental implementing action. To the degree that children manage to perform tokens of parental expressive actions, they manage to articulate the content of the A-intentional states for which the expressive actions indicate conditions of satisfaction. Admittedly—and this is the contribution of the development in phase 3—the types of expressions do not enter into a relation with the implementing actions as monolithic unities, but as medially structured so that modifications of the internal structure can covary with the modifications of the connecting actions (Anschlusshandlungen).

In the face of this analysis of Searle’s claim—that an expression obtains meaning when mind arbitrarily confers to it those conditions of satisfaction that a mental state has—we can modify this manner of speaking in order to avoid the difficulties exposed above: we can say that an expression has content by virtue of the fact that the recipient of the expression confers content on it by reliably reacting to this expression in a certain manner. Here the reliable reactions, for their part, are the basis on which the one expressing herself forms expectations. In contrast to the process of establishing reaction expectations in the context of phase 3, in cases of linguistic utterances, we assume that the expectations are related to reactions that bring about observable states of affairs in the world, or that assume the existence of such states, that is, states that A-intentional states are typically oriented toward. In contrast to the nonlinguistic medial expressions, the fulfillment of the conditions of satisfaction for linguistic expressions is accessible to intersubjective examination. For unlike the medial responsive behavior of the adults in phase 3, which of course also brings about events in the world, the states of affairs in the world that are the conditions of satisfaction for linguistic utterances are prototypically states that can be observed by both communication partners in the same way. Because, on the basis of A-intentionality and the means of medial communication, something comes about that we could call a triangulation, and it is thus possible to connect the use of linguistic expressions that intrinsically lack content to the existence of conditions of satisfaction that can be intersubjectively examined, it is possible to individuate content with normative relations.

The four-phase model that is suggested here assumes that inhibited or implemented medial articulations must have a compositional identity before they can obtain an inferential content. It thus appears that this analysis competes with the Fregean (and Davidsonian) conception of the compositionality of language. For while Frege deduces the structure of language from the contribution of sentence components to the truth functionality of the sentence, here the thesis is that the compositional identity of expressions is more basic than the contribution of the sentence components for the truth functionality of an utterance. It is, however, important to see that, on the one hand, this is concerned with a genetic primacy that is still reconcilable with the theoretical primacy of the sentence, and that, on the other, as a theoretical primacy, remains limited to the context of nonlinguistic communication.¹⁰⁷ For the framework of the four-phase model, one specificity of linguistic utterances is that normative relations steer the medial possibilities by subjecting the scope of possibilities of a medium to inferential relations.

The result of phase 4 is a form of communication in which the medial expressions take on roles that are lent to them because they obtain the status of claims, promises, and orders in the context of social interactions. Those medial expressions that have truth conditions or conditions of satisfaction are linguistic utterances, and those states that can be described as inhibiting medial expressions with conditions of satisfaction are linguistic thoughts.

Of course, the model that is proposed here does not claim to be able to explain a linear history of the development of higher intentionality. It is especially important that the impression not arise that every higher level presupposes the completion of the one preceding it; for a realistic view of the development would certainly have to expect that the developments of phase-specific competencies are also interlocked with one another. This model would fulfill its goal if it could provide an answer to the question of how it is possible that (biologically sufficiently complex) beings develop a mind under certain social conditions without thereby having to presuppose capabilities at the beginning of the development whose development is precisely what is to be explained and without thereby reducing mind to a phenomenon that is completely able to be causally or functionally described. The model attempts to balance two strategies that Brandom respectively calls “assimilationist” and “exceptionalist.”¹⁰⁸ For while, from the bottom up, a continuity with the representationalist vocabulary is sought (the level of A-intentionality), for the two stages of higher intentionality the exceptionalist motif that emphasizes the specific difference between thoughts and intrinsic-intentional states is fundamental. Here the content of nonlinguistic and linguistic thoughts is owed to social practices that are erected on natural foundations, but that become unfixed from these to the degree that communicative acts are structured by typified performances that are intrinsically free of content.

3.2.2.4. A More Comprehensive View of Intentionality

One goal of these overflowing reflections has been to develop a view of intentionality and thinking that provides a place for the possibility of understanding thinking as something that is not thoroughly linguistically constituted, without getting entangled in the shallowness of phenomenological introspectionism and without losing sight of the specificities of mind by following a radical naturalization strategy. I have suggested dividing the set of intentional phenomena into the class of functionalistically determinable intrinsic-intentional states and the class of higher intentional states that I also call thoughts. I have related the existence of thoughts, regardless of whether these are linguistic or nonlinguistic, to three conditions that the beings that have thoughts can only acquire socially. The ability of a being to think thoughts is presented, in this perspective, as a capability to internalize processes that are initially actualized as social processes. Here two developmental lines run parallel in which performative competencies and interpretive competencies are developed by internalizing relations between the performances and (interpretive) forms of reaction. In the course of this process, beings that are socialized in a medial practice acquire

1. the ability to individuate performances consisting of medial elements by synthesis;

2. the ability to anticipate the reaction of a recipient to their medial performances; and

3. the ability to inhibit a performance.

While the first condition guarantees that the performance of a medial constellation has the status of an action insofar as, in each performance within a medium, the being also has had socially conveyed alternatives available that could have been chosen from, the second condition ensures that the content-lending correlations of the medial constellation are connected to sufficiently stable social reactions that are, however, in principle malleable and covary with the compositional identity of the constellation. The specificities of the correlations cannot be comprehended in the concepts of natural law or functionalism. Finally, the third condition can only be fulfilled if a being is practiced in medial performance practices so that routines emerge for performing medial elements (and constellations). The following conditions that come about in the process of socialization form the background of this reconstruction: the standardization of expressive behaviors, the standardization of reactions to norm-conforming expressive behaviors, as well as the varying and consequently individualizing of the expressive behaviors within a norm-conforming framework.¹⁰⁹ If we assume the existence of these capabilities, then the following basic definition of thinking can be provided, which is not limited to linguistic thoughts:

(ND) Thinking is the individuating of an inhibited, medial performance within the scope of possibilities of a medium, whose effects on potential recipients of the performance are anticipated.

Every mental state that is individuated in a manner sufficient to (ND) is—unlike A-intentional states—a higher mental state that has content and an identity for the being that individuates such a state. Such a state has an identity for this being insofar as it must be able to bring about this state using a mental operation that can be described as an inhibited medial performance. For that being, each of these states is thus a state for which there are alternatives relative to which the existing state has an identity. Such a state has content insofar as the being that individuates it assumes an interpretive relation toward it which prototypically is an anticipation of the (medial) reaction(s) of recipients of the medial constellation, but it can later also assume the form of the reception by the producer.¹¹⁰ If this analysis is plausible, then we can say that besides linguistic thoughts, nonlinguistic medial mental states are also one kind of higher-level intentional state. For both types of states can only be individuated with the help of means that, for their part, are not already intrinsic-intentional and that must be kept in existence¹¹¹ in a social interpretation practice.

Before I finally go into the question of whether nonlinguistic thoughts can be understood in this sense as second-order intentional states, and thus fulfill Davidson’s second-order criterion, I will provide a summary of the types of intentional states that I have established in the preceding reflections as theoretical entities of an expanded interpretationism. I propose differentiating between three types of mental states with content:

(AI) A-intentional (intrinsic-intentional) states

a. The ascription of A-intentional states is legitimate from the perspective of the interpreter if the interpreter’s explanation of a particular behavior must assume that the being that is to be interpreted (B) has at its disposal nonlinguistic representations not only of existing states of affairs of the world, but also of nonexisting states of affairs, and these representations are the causes for B’s behavior. A-intentional states dispose the beings that have them to differentially respond to their environment. Here the differentiating behavior must be such that it orients itself in reference to internal criteria_i so that these criteria_i provide an explanation of the behavior in the sense that they identify a state of affairs in the world and the behavior can be understood as bringing about exactly that state.

b. The A-intentional states befall the beings that have them. A being whose most developed mental states are A-intentional states does not have the possibility to refer to these states. In other words, the content of A-intentional states is never another intentional state. If a being finds itself in an A-intentional state I_a₁, there is no reason for a change to a state I_a₂; there are only causes.

c. A-intentional states are intrinsic-intentional, that is, they are not intentional by virtue of a (self-)interpretation. They have content that an interpreter can rightly ascribe to them on the basis of the behavior of B, but they have no content for the being that has the ascribed A-intentional states.

d. A-intentional states occupy a position between those mental states that we call perceptions and those mental states that we call thoughts. They share with perceptions the intrinsic relationship to a content that we can imagine is conveyed through functional mechanisms; they share with thoughts (B- and C-intentional states) the individuation in relations that are characterized by rightness or appropriateness, but that in the case of A-intentional states can be completely naturalized.¹¹²

(BI) B-intentional (“medial intentional”) states, in contrast, are states that beings impute content to when, in performing the acts that these states dispose them toward, they assume an interpretive relation toward them.

a. B-intentional states assume that behavioral alternatives are available; here, these behavioral alternatives are not thought to be intrinsic-intentional.

b. A B-intentional state I_b is individuated in such a way that this individuation can be steered by B insofar as B is able to implement constellations of intrinsically empty behavioral alternatives (and to inhibit the performance of these).

c. I_b has a compositional identity that provides its position in the scope of possibilities of a medium.

d. I_b has content for B because B correlates I_b with another mental state, prototypically a cognitively implemented expectation with regard to the behavior of a recipient (which can be B itself) or an experience.

(CI) C-intentional (propositional-intentional) states are those B-intentional states whose contents are truth-apt for the being who has these states.

A map that I would recommend for orientation in the zoo of mental states would thus look schematically something like figure 3.4.

With the preceding “revisions” in the philosophy of mind, it should be shown to be plausible that we can expect intentional states that are thoughts insofar as they have content and an identity for the being that individuates these states. In a concluding reflection I would now like to examine whether these nonlinguistically individuated B-intentional states fulfill the second-order criterion that Davidson views as foundational for the existence of thoughts. Let us recall: Davidson claimed that only those beings can have thoughts that have the concept of thought at their disposal. In doing this, Davidson fixed on a concept of thought that implies a second-order relation such that the being that has a mental state has a thought if and only if it has thoughts about this state—if it, for example, believes that this mental state is true or false.

Davidson’s reconstruction of thought as a second-order intentional phenomenon has a double function within his theory. On the one hand, the second-order relation guarantees that the contents of thoughts are the products of (self-)interpretation of mental states. On the other hand, this move vouches for the possibility of assuming an antirealistic position about the existence of thoughts insofar as the second-order relation ensures that thoughts can be apprehended as theoretical entities of (self-)interpretation. Even if one can take a relaxed stance about the ontological question,¹¹³ a position that understands itself as expanding interpretationism must clarify whether there is a higher-order relation that can articulate the content of B-intentional states without thereby having recourse to the interpretive resources of a (meta-)language.

Figure 3.4 Types of mental states

In principle, two perspectives are available to us to investigate this question. For one, we can question whether it is possible, by means of a nonlinguistic medium M₁, to refer to a B-intentional state that has been individuated in M₁, and indeed in the way that we can refer to a linguistically individuated mental state by means of language. For another, however, we can also examine whether it is possible to articulate the content of an M₁-individuated B-intentional thought by means of a nonlinguistic medium M₂. In both cases we would be able to examine whether nonlinguistic media provide resources with the help of which it is possible to refer to B-intentional states in such a way that their content is articulated.

Because the only medium needed to specify the meaning of a linguistic utterance is the language in which the utterance was generated, and we can thus use any natural language as a metalanguage, the possibility, by means of language, of providing the content of expressions in this language appears precisely to be the formal attribute that makes language a model case of a medium that allows second-order relations to be implemented. If we accept that metalinguistic ability is the standard for the possibility to develop second-order mental phenomena—thus thoughts—then we can only justify the view that nonlinguistic mental states are coequal to these if we can show that nonlinguistic media can also be applied meta-medially. Put concretely, can we, for example, refer to music by musical means or refer to painting by the means of painting and in doing this interpret the constellations to which we are referring?

Initially there appears to be no special difficulty connected with the idea that it is possible, for example, to refer to music by musical means; for imitation, citation, and satirizing are established aspects of musical practice. If, for example, we accept Goodman’s analysis, we can see that we by no means need exaggerated analogies in order to explain the possibility of citing with the help of nonlinguistic media. From the analysis of linguistic citations, Goodman gains, first of all, two necessary presuppositions for the possibility of citation. Citing must primarily solve two problems: it must include what is cited (in the form of a paraphrase), and it must ensure the reference to what is cited either by naming it or by denoting it through predication.¹¹⁴ Criteria for direct and indirect citations can now be provided by modifying these necessary conditions: while the direct citation must denominate and incorporate the cited material, indirect citations can also secure the denotation not by explicit denomination, but by predication, and can guarantee the incorporation of what is cited by (nonidentical) paraphrasing.¹¹⁵ For music and painting, which Goodman discusses as examples of nonlinguistic media, it is indeed necessary that constraints for the possibility of citation are accepted, but the principal possibility of citing is not called into question by these constraints. So in both cases, analogues to the quotation marks of language can be found or thought of.¹¹⁶ Here, in the case of painting, one must account for the fact that there can be no replications in the sense in which different inscriptions of a word are replications of the word, because works of painting are always unique.¹¹⁷ As a result of the lack of an analogue to the alphabet and the lack of criteria for determining that the citation and what is cited are the same, in painting there is no exact analogue to the direct linguistic citation.
A theory of musical citation, in contrast, must account for the fact that music usually denotes nothing at all. For this reason, in the framework of music the problem is not with direct citation, but with indirect citation insofar as it is difficult to provide criteria for a musical paraphrase. But the problems of musical citation are restricted to difficulties that can be limited to a certain way of incorporating what is cited into a citation, and these problems do not obviate the possibility of direct citing. Considering these constraints, there is no reason not to assume that in nonlinguistic media it is also possible to refer, by means of a medium, to entities and to those thoughts that are generated by their help.

With a view to the problem of nonlinguistic second-order relations, Goodman’s reflections throw light on the fact that it is possible, with the help of nonlinguistic media, to refer to nonlinguistic thoughts insofar as a reference can be organized by means of these media that can be reconstructed as an exemplification of the attributes of a medial constellation. It remains questionable, however, whether this reference plays a constitutive role for the existence of a B-intentional state. For the fact that this reference is possible does not mean that it is necessary for the existence of a B-intentional thought in the same way as having a belief about a propositional state is necessary for the existence of a thought. Davidson apparently links the existence of thought to the following criterion:¹¹⁸

(T) P has a thought T if and only if

1. P believes (desires, etc.) S [formal: T(P, S)] and

2. P believes that P believes (desires, etc.) S [formal: T(P, G(P, S))].

Because P, however, can only fulfill the condition (T)(2) if P has the concept of belief at her disposal, only those beings can have thoughts that have a language that provides the conceptual means for reference to mental states. If one now confronts B-intentional states with the criterion (T), then the following options are available:

1. We reject criterion (T) or a reformulated variation of (T) for nonlinguistic thoughts.

2. We admit that B-intentional states are only thoughts if a person who thinks them has linguistic beliefs about these states.

3. We attempt to develop an equivalent to criterion (T) that can be actualized with the help of nonlinguistic media:

a. by making medial forms of reference to B-intentional states the presupposition for the existence of nonlinguistic thoughts; or

b. by identifying a general attribute of B-intentional states that implies a certain (perhaps weaker) form of a second-order relation.

What is clear at the outset is that the first option has very high costs, for in opposition to the claim that B-intentional states are mental states for the being that has them, we must clearly show how this is possible without criterion (T) or a variation of it being fulfilled in her mind. In a certain way, (T) includes the nucleus of an interpretationist theory of mind, that is, a theory that explains the being-for-a-being of a mental state using concepts of self-interpretation or self-ascription. The first option is thus simply incompatible with the interpretationist foundation of the theory of conscious medial thoughts proposed here. The second option appears to be too defensive insofar as it falls back behind the level of the reflection on the content of nonlinguistic thoughts; after all, these reflections have shown that the content of a medial constellation in the form of an experience that is connected to carrying it out is a content for the being that carries it out. However, we can view the second option as a fallback option for the case in which it can be shown that every form of self-interpretation of mental states is connected with the availability of concepts. We must then accept the idea that B-intentional states are indeed intentional states that a being itself can practically individuate, but whose identity (and content) can only become conscious if identity and content become the object of a propositional attitude. However, a fallback option should only be taken up if it is clear that the attempt to develop an equivalent for the linguistic form of reference to mental states entails insurmountable problems. Let us then first of all consider the third option in its two variants.

In the context of its genetic reconstruction, I said that medial expressions (and their B-intentional causes) have content insofar as they are connected with experiences that accompany the performance of medial constellations. If we compare this constitution of content (which is a content for the performing being insofar as it is this being that has the performance-dependent experiences) with the relation that forms the basis of (T), then the following difference stands out: while, according to Davidson’s criterion, consciousness is connected with the fact that a being refers to a propositional state by means of concepts, the content in the context of introducing B-intentional states is ensured by something that indeed is caused by a medial performance (namely, the experience), but not by the fact that a medial performance is interpreted. Thus the question of concern here is not really whether B-intentional states have content, but whether beings can articulate this content by nonlinguistic means so that the content in the perspective of the self-interpretation is content for the being that finds itself in a B-intentional state. According to interpretationist premises, nonlinguistic thoughts would have content in a formidable sense for the being that has them only as interpreted intentional states, and it is clear that linguistic interpretations are model cases of such self-interpretations. If then there is to be an interpretation of (T) that is befitting to nonlinguistic thoughts, this criterion (NT) must connect the existence of a nonlinguistic thought to the existence of a nonlinguistic interpretation, and in fact, more precisely, to the existence of a nonlinguistic interpretation of a B-intentional state.

What reasons are there not to view the correlation of two nonlinguistic thoughts as an interpretive relation? What reason is there, for example, not to say that a red-chalk drawing interprets a watercolor or that a dance interprets the content of the piece of music to which it is performed? But it remains questionable—even if this is admitted—whether the possibility of the existence of an interpretive relation between two nonlinguistic thoughts can provide a sufficient basis for the reformulation of (T). This would, namely, take something like the following, unsatisfying form.

(NT₁) P has a nonlinguistic thought M₁ if and only if

1. P thinks M₁; and

2. P thinks M₂; and

3. P thinks that M₂ articulates the content of M₁.

Apart from the fact that a propositional state appears in (NT₁)(3), and (NT₁) could thus not be fulfilled by beings who do not have a language at their disposal, the criterion does not accomplish what (T) does. For (T) not only prescribes that a being must be able to articulate the content of a mental state if a thought is to exist; rather, it prescribes that it also must be able to ascribe this mental state to itself. However, if B-intentional states have no predicative structure, this means that, with the help of a nonlinguistic state, no predication can be carried out in which a being ascribes to itself a mental state. If what is fundamental in (T) is not the interpretive relation but a self-ascribing relation, then it is clear that this relation cannot be produced by nonlinguistic means. And then it is also clear that Davidson’s criterion (T) in essence is a much stronger reformulation of the Kantian view that “the I think must be capable of accompanying all my presentations.”¹¹⁹ In contrast to Kant, Davidson would claim that a mental state M is a thought if and only if M is accompanied by the “I think.”¹²⁰ If we assume this interpretation, then the consequence is obvious: it must be accepted that there is no direct nonlinguistic equivalent for the “I think”; insofar as this is the case, B-intentional states are only thoughts if a being ascribes to itself a B-intentional state with the help of the idea “I think,” which can only occur with language. In contrast to this, Kant’s more liberal formulation would make it possible for B-intentional states to be thoughts, because they can become objects of self-ascription with the help of the “I think,” regardless of whether they are this for contingent reasons or not.¹²¹ There is no reason not to say that thoughts like the following are possible:

Regardless of whether one assumes the liberal interpretation (with which I sympathize) or the stricter one, the preceding reflections appear to show that B-intentional states are thoughts, understood as self-ascribed—and thus conscious—thoughts, if the being that has B-intentional states refers to these—or can refer to these—with the help of a self-ascription that can only take place with the help of language. Here, however, it should not be forgotten that the “I think” only articulates a difference that this being can make, also without language, namely, the difference between its own states and the states of other beings.

In conclusion, let us examine the option of formulating (3b) as an equivalent for the criterion (T) with the help of general attributes of medially individuated intentional states. An attribute that all B-intentional states share is that—in contrast to A-intentional states—they are states that a being can actively individuate. However, such a being must have a consciousness of the optionality of every B-intentional state. A being that does not know that there are alternatives to every medial performance M does not have a B-intentional thought that is, at the same time, the cause for the performance of M. What reason is there not to view this consciousness of the optionality of B-intentional states as the general attribute, which fulfills the second-order relations in the required manner and then correspondingly to advance the following criterion:

(NT₂) P has a nonlinguistic thought M if and only if

1. P thinks M; and

2. P knows that there are alternatives to M that P could think.

As an objection to (NT₂) one might point to the fact that demand (2) brings a form of knowledge into play that is propositionally differentiated so that (NT₂) increases the dependency of having a thought on having a linguistic thought. Because it is difficult to see how consciousness of the optionality might be implemented in P independently of linguistically articulated knowledge, to retain the idea of (NT₂), only an external ascribing perspective remains available to us:

(NT₃) P has a nonlinguistic thought M if and only if

1. P performs a medial constellation K.

2. Interpreters of the medial performance of P can only understandably explain the performative practices of P by assuming that P individuates her B-intentional states by actualizing constellations within the scope of possibilities of a medium (and in doing so partially exploits these possibilities).

3. The performance of K by P is caused by a B-intentional state to which P had alternatives, as her performance shows.¹²²

The following results from our reflections regarding the ability to satisfy the second-order criterion, understood as a self-ascription: self-ascriptions are connected to media in which one can say “I” and thus can articulate a self-reference that is indeed possible at the level of prelinguistic and nonlinguistic thought, but at that level it cannot be the object of a belief or of a thought. We can indeed describe the identity and the content of a nonlinguistic thought as an identity and a content¹²³ for the being that has this thought, but we can only articulate the belief that it has this thought by employing linguistic means. Anyway, this at least is the preliminary conclusion: nonlinguistic thoughts can become the object of a self-ascription so that, if one follows Kant, the status of thoughts cannot be denied to them as long as we do not identify consciousness with self-ascriptions that are carried out.

3.2.2.5. Back to the Concert

Based on the difficulties of developing an adequate understanding of artistic action by means of the established theory of action, I have taken to developing a more comprehensive view of mind, employing the means of media theory; this is a view in which nonlinguistic thoughts also have a place. Now it is time to examine whether, by expanding the theory of mind with the help of media theory, a conception for understanding artistic action can be developed that is reconcilable with two basic intuitions, namely, that works of art are not translatable and that they are understandable. To do this, it is necessary to address the question of the adequate form for rationalizing artistic actions as well as the question of the content of nonlinguistic artistic thoughts: for, on the one hand, the gap that was left by the rejection of the common rationalization schemata (H1) must be closed (this is to be taken up in [b] below); on the other hand, it is necessary to examine whether the analysis of nonlinguistic thought employing the concepts of compositional identity does more than provide a technically bogged-down theory ([a] below).

a. The fact that the identity of a B-intentional state can be explained by media-theoretic means can indeed be viewed as an explicative advance; for in any case we have an identity criterion for nonlinguistic thoughts available. On the other hand, the comparison with propositional attitudes immediately shows that an identity criterion alone is hardly able to account for our expectation that a thought, be it linguistic or not, has content; for even if the identity of a propositional attitude can only be explained in reference to ideas of its inferential position, the entire network of inferential relationships cannot be created without the existence of mental states that have a noninferential content.¹²⁴ If the concept of nonlinguistic thought is not to remain pallid, the analysis cannot merely consist in providing identity criteria. Rather, precisely in the context of understanding artistic action, it must be made clear that these thoughts are about something. If the explanation of the structural attributes of works of art did nothing more than analyze their medial identity, it could hardly be plausibly explained why works of art can move us in such specific ways. So, what are B-intentional states concerned with?

If we call to mind the position that B-intentional states assume within the framework of the genetic reconstruction of the four-phase schema, we begin to see the contours that an answer to this question will be able to take. For in a certain respect, medial expressions, which are caused by B-intentional states, take on the position of those communicative, expressive behaviors that articulate affects at the level of affectual communication. However, while the communicative role of the emotional expressive action is ensured by a genetically fixed relation between affects and expressive behavior, the possibility for medial articulation opens up a space for expressing something that surpasses the experience of basic affects, but that is nonetheless an experiencing or perceiving. To determine this experience two perspectives are available: for, on the one hand, it is possible that a person finds herself in a state that disposes her to an (inhibited) performance of certain medial configurations; on the other hand, however, the (inhibited) performance itself can occur as an object of an experience that can be communicated with the help of a medial configuration.

The basic idea for determining the content of a medial constellation, or for determining the corresponding B-intentional state, claims, in agreement with the hypothetical history of its origins, that the function of medial communication consists in the fact that the producer of a medial constellation makes an object of her own experience accessible to the addressee; in this way, it becomes an object of the experience of the recipient. Here, two situations can be distinguished. The producer has an arbitrary experience E₁ and tries to generate a medial constellation that causes an experience E₂ in the receiver, namely, an experience that is sufficiently similar to E₁. If this is successful, then the producer can view the medial constellation as a means of sharing the experience E₁. Another possibility is for the artist to use the medial possibilities of a medium, while in the course of experientially varying and recombining them, to create an object of experience that is of interest, independently of its specific preceding experiences. In both cases, it holds that

(M7) Media are means with the help of which experiences can be communicated.

Unlike the linguistic descriptions of experiences, in the course of individuating the objects of experience, media allow experiences to be communicated. If this reflection is used to determine the content of a medial constellation, two preliminary definitions result: one is connected to the experience that produces it, and one is more general, giving up the connection to the producer.

(M8) The content of a medial constellation is the experience that the producer connects to its realization.

(M9) The content of a medial constellation K is the experience that the reception of K brings about in its recipient.

If we, in turn, use these determinations in order to define the content of a medial-intentional state, we can establish it as follows:

(CB) The content of a B-intentional state consists of the experiences that cause the inhibited performance of the medial constellation that the B-intentional state disposes us to perform.

Against the background of these definitions, the portrait of medial communication that we are sketching out describes media as means of communication that open the possibility to overcome the exclusive privacy of experience by structuring these objects on the basis of intersubjectively established alternatives so that the experience can obtain the function of a nonlinguistic interpretation on the basis of those competencies that allow the interpreter to structure the experiences that she has by perceiving medial constellations in line with those intersubjectively shared medial competencies; this is done by reproducing its medial structure. Here it is of decisive importance that this object of experience exhibits a medial structure—i.e., it has a synthetic character—that shows that this structure can covary with the experience and is not an opaque monolithic object. In short, medial communication is the attempt to avert the impossibility of conveying phenomenal experience; this is done by generating constellations—with the help of media—that, as objects of the addressees’ experience, should enable the addressees to have an experience that is similar to that of the producer in relative ways if the recipient structures the object in the same way as the producer.¹²⁵

b. Applying this to the problem of the rationalization of artistic actions, the preceding reflections allow a schema to be formulated that, on the one hand, makes it possible to describe artistic activities with the help of a practical syllogism as activities caused by reasons; on the other hand, however, it does not prescribe that each of these reasons be interpreted as an intentional state that is, eo ipso, a propositional attitude. If we can expect that B-intentional states are able to occur as parts of the rationalizations of action, then rationalizations of artistic actions can assume the following form:

(H2)

a. With the help of the possibilities of a medium M₁, P₁ individuates a B-intentional state I_b(K);

b. With the help of the possibilities of a medium M_n, P₁ generates a product whose identity in M_n is subject to the same (or sufficiently similar) conditions of identity as I_b(K) in M₁ (n ≥ 1);

c. P₁ wants to express I_b(K);

d. Thus: P₁ makes K accessible for another P_n.

Before I evaluate (H2) with a view to the problem of understanding artistic action, I would like to quickly comment on the premises of the syllogism.¹²⁶ Premise (a) states essentially that “P₁ thinks a nonlinguistic thought by anticipating and inhibiting the performance of a medial action.” At the same time, premise (a) ensures that (H2) is connected to the intuition that works of art are not translatable; for although in (a) only the individuation of a nonlinguistic thought is discussed, the genetic context for introducing B-intentional states places the inhibited individuated medial constellation in the context of those effects that a perceptible actualization of this constellation could have on recipients. However, for the implemented medial configuration that is externally perceptible, it holds, then, that it can have effects that are in an elementary sense connected to the implementation of the configuration in a specific medium, and indeed because these effects do not occur in the recipients’ perception in another medium. Insofar as the effects on the recipients are inextricably connected with the means by which they are generated, all “translations” of the work of art that do not bring about these effects fail. Premise (b) should clearly show that P₁ does not stop with the thinking of a nonlinguistic thought, but precisely implements the action that would be inhibited in (a). In this, premise (b) anticipates two possible cases: P₁ could implement those medial actions whose inhibited execution identifies the nonlinguistic thoughts in (a) so that the individuating medium and the implementation medium are identical, or another (nonlinguistic) medium could be used whose scope of possibilities allows the implementation of a compositionally sufficiently similar medial configuration. Premise (c) functions to connect the nonlinguistic thoughts with a desire or an intention that is necessary for the practical syllogism.
Thus (c) contains a reference to the nonlinguistic thought I_b(K). However, this reference does not imply the linguistic reconstruction of the identity conditions of the thought I_b(K); rather, it can be analyzed as a deictic form of reference or as a sufficient characterization.

According to (H2), the action character of an artistic activity now no longer depends on ascribing to persons intentional states that can in principle be stated linguistically; rather, we can now expect that it is possible for people to have intentional states individuated by means of nonlinguistic media. On the basis of their identity in their medium, we can refer to them by means of another medium (often only with constraints). The process of individuating works of art can now be described as a production process. Within this process, the product emerges through steps in which choices are made from an inventory of intersubjectively accessible possibilities. Because the intention is able to be individuated, even for the producer, only in dialogue with the specific possibilities of the medium being used, language does not necessarily come into play in the individuation of the intention of the work of art, as long as the work of art in question is not itself linguistic. In contrast to (H1.1), (H2) only contains the formulations of propositional attitudes in premises (b, c). In premise (H2)(c), reference is indeed made to I_b(K), but this reference can only be guaranteed after I_b(K) is already individuated in an artistic medium. Besides this reference to the B-intentional state, which is determined in the process of producing a work of art, the premise, however, also ensures the ability to connect to the folk-psychological understanding of action explanations, which remain obtuse if they fail to refer to desires or intentions.

In an important respect, (H2) can be read as emphasizing the communicative dimension of artistic action without thereby conflicting with the intuition that works of art are untranslatable. Beyond that, however, (H2) also allows us to specify Dewey’s reconstruction of artistic communication. Dewey interpreted the specifics of aesthetic communication as follows: P₁ produces a perceptible phenomenon K with a view to the reception of K by other persons P_n (n ≥ 1), and a person P₂ receives K with a view to its having been intentionally produced by P₁.¹²⁷ Now it is clear why Dewey came to his view that there is a double determination of the intrinsic relationship¹²⁸ between the means and the ends of expressive actions. For media are of necessity means for individuating B-intentional states insofar as their compositional identity is connected with the respective media; on the other hand, media are means that stand in an intrinsic relationship to goals insofar as the product consists of these means, because, as a product, it is the implementation of a medial constellation. Under the premise that the medial communication serves the communication of experience, it is now, however, also clear that we are not dealing with two forms of intrinsic relationships here, but only with one: for as a means for individuating a nonlinguistic thought, which is the inhibited form of a medial performance, media are, at the same time, the means that structure the performance that can be experienced.

In addition, however, with the help of these reflections—extending beyond Dewey’s accomplishments—we can plausibly demonstrate why a form of aesthetic communication that is implemented by medial means shows any promise of success whatsoever and why (H2) is a rationalization of action insofar as an artist can expect to be able to achieve her communicative goal by publishing or performing the work of art. For if we can assume that the recipients share the medial competencies of the artists to the degree that they are able to medially structure the produced object and thus are able to replicate it in accord with medial differentiations, then the probability increases that the replicating structuring of this produces an inner experience that is similar to that of the artist. I will return to these problems in chapter 4.

3.2.3. Problems of Field Research

In the previous section I have attempted to clearly show how the media theory proposed here can be applied to reconstruct processes of understanding nonlinguistic expressions (in reference to the example of artistic expressions). Here I have assumed that both the interpreter and the producer have a completely developed interpretation language and socially developed medial competencies at their disposal. In the second step, which now follows, I would like to investigate what media theory might add in the situation of radical interpretation, thus in the “prototypical situation of contemporary analytic philosophy.”¹²⁹ The situation of radical interpretation here provides us particularly with the chance to examine whether the concept of compositional identity is an independent concept, which can be explained without reference to the Fregean perspective of truth functionality; for Frege, and following him, Davidson, have claimed that the syntactic structuring of the object language is precisely a product of the analysis of the input that linguistic elements provide in determining the truth function of the sentence that is to be interpreted.

By entering into a situation of (sufficiently) radical interpretation, however, we enter a situation in which it is not clear whether those being interpreted speak a language; we nevertheless hold firmly that the interpreter is in full possession of an interpretation language. Let us, for example, observe the situation of an ethnolinguist who, within the framework of a prestigious research project, has to investigate an ethnicity that was, until recently, undiscovered. According to the working hypothesis of the ethnolinguist, the members of the Mulang speak a language in which, apparently, numerous important differentiations are made by articulating the few syllables of the language at various pitches. This fact does not irritate the researcher much initially, for he also knows such techniques from Chinese. However, because the number of syllables is quite small, quite a lot is dependent on the reception of pitch. In the hope of better establishing his hypothesis, for which he initially has formulated a few empirical indices in the form of T-theorems, the field researcher requests musicological help from his home university. The ethnomusicologist who was hurriedly flown in, however, realizes after only a few days of intensive observation of the Mulang that her colleague has been mistaken. In truth, what the ethnolinguist understood to be a language is nothing other than singing. Shocked by this diagnosis, and mindful of the potential damage to his reputation as a researcher, the linguist first assumes that his colleague is inexperienced and mistaken. The musicologist, however, confronts him with the following finding. If one analyzes the pitch sequences exactly, the following somewhat surprising view emerges: the sequences are organized in quasi-eight-bar phrases, which are organized cyclically; here a correspondence between the first and the fifth “bar” is quite frequent. 
Beyond this, over a long period of time, systems of ordering such sequences can be found. With their help, the sequences are often organized into complex rondo-like schemata. All of this escaped the colleague’s notice, above all, because the Mulang use a quite small-step tone system (with intervals that are considerably smaller than our semitone steps), which easily escapes our notice.

However, the ethnolinguist, who thinks he can see what this is all boiling down to, now waves it aside. What the colleague is presenting in this analysis does indeed exhibit a certain similarity with the structuring of an expression in the object language through a metalanguage, but beyond identifying syntactic relations, this structuring also semantically identifies the elements of the object-language expression that are integrated by the syntactic relations. And, according to the ethnolinguist, there simply is no equivalent to this semantic identification in the ethnomusicologist’s analysis, which by the way is completely arbitrary because it imposes the concepts of the European music tradition on the alleged singing of the Mulang.

The musicologist admits that the attributes of the Mulang’s singing, which she has mentioned in her analysis, make this objection obvious, but she thinks the irritation can be cleared up in the following way: the elements of which the singing of the Mulang is composed can be impeccably identified in our staves, which must take on the function of a meta-“linguistic” translation medium; thus far she has found no token that could not be notated, at least with sufficiently good ears, with the help of our staff system. These comments do not really improve the atmosphere; the remark about imposing the concepts of the European music tradition has particularly agitated the musicologist. After all, in her view there is in principle no difference between her procedure and the procedure of the ethnolinguist; for just as the linguist reads his rationality standards and beliefs into the expressions of the Mulang, she hears the familiar musical structures there.

For an outsider it is perhaps easier to see that the musicologist is right about this; for there really is an interesting correspondence between the procedures of the researchers. Both researchers use a medium for their interpretations with the help of which they structure the articulation sequences of the Mulang.

Fully indebted to his Davidsonian education, the linguist attempts

1. to identify linguistic unities, which he holds to be truth-apt, as well as their components;

2. to correlate expressions that can be formed from these elements with metalinguistically expressed truth conditions, i.e., to formulate T-theorems for the types of expressions;

3. to collect evidence for the T-theorems by searching for expression situations in which the hypothetical truth conditions are fulfilled;

4. to discover covariations between the compositional structure of the linguistic unities and their truth conditions.

The musicologist, in contrast, attempts

1. to identify medial constellations and their elements;

2. to correlate the expression of those constellations with expression situations that she views from the perspective of her own experience of these situations;

3. to find evidence for her correlation hypotheses that indicates that certain songs are only performed under certain hypothetical, individual psychic, social, or natural conditions.

But the parallels between the researchers are more far-reaching, for they employ a medium in their interpretations of the Mulang, with the help of which they ascribe intentional states to them. In complete conformity to the above analysis, to do this they use the possibilities for differentiation that are made available to them by an already known medium, which, in the framework of their interpretations, plays the role of a “meta-”language. For there is no dissent between the two colleagues about the basic status of the behavior that is to be interpreted insofar as they both describe the behavior in question as action, which is intimately related (e.g., instrumentally) to the intentional states that each of them ascribes to the Mulang.

The principal divergences exist in these procedures primarily with a view to their starting hypotheses. As a charitable interpreter, the ethnologist must assume that the ascribed intentional states, more precisely the propositional attitudes (A → S), can only be individuated in a network of further propositional attitudes, which comply with certain demands for coherence and correspondence, which, for their part, can only be fulfilled under the premises of the truth aptness of the propositional contents and a truth-functional integration of the contents. Against this background, he assumes that a set of expressions can be identified that are made by their speakers in agreement with the interpreter’s assumption that they are held to be true, and he explains successful coordination processes among the Mulang on the basis of the (causally embedded) shared view that certain things hold true.

It is these truth-related assumptions that the musicologist neither needs nor is directly suited to accept as starting premises. For the musicologist need not assume that the Mulang make music for the purpose of truth-conditional representation in order to be able to ascribe to them shared intentional states. However, she does want to maintain the intentional ascriptions. She categorically rejects a description of the behavior of the Mulang that views the singing as an epiphenomenal sound, comparable to the humming of bees. She rather claims that the Mulang could also not sing; that their extraordinarily differentiated musical practice is based on a sensibly ordered system; and that a conscious, subtle varying practice can be detected in the songs of the individual singers. The set of the differentiations that the musicologist needs in order to interpret the musical practice of the Mulang as a practice that consists of musical actions must describe the scope that the Mulang draw on when singing. In the interpretation of the musicologist, this set of differentiations forms a hypothetical medium, and indeed one that would lose its hypothetical status particularly if, on the basis of this assumption, she could herself carry out actions that the Mulang accept as singing and be integrated into their practice of singing.

The researcher underlines this last mentioned thought—that her thesis has a “pragmatic verification”—with an excursus about rules, since for her analysis she is by no means, as her colleague believes, dependent on the assumption that the Mulang orient themselves in their musical practice on explicitly, linguistically agreed-on rules. The singing is indeed an action, which assumes that the singers are familiar with the musical entities, but this could be explained by the participation in a social practice that includes correcting nonlinguistic sanctions without the systems of differentiations that it is based on ever becoming explicit. On the whole, the comparison with a game is more suggestive; here the participants learn to play by practically being integrated into the practice. An important aspect of such games, without agreed-on rules, is the nonverbal action of competent players, which novices would find to be sanctioning behavior; although the introduction of such a game is not dissimilar from training, the model of conditioning misses the main point. For in contrast to a conditioning, which anchors a causally describable routine in a being that limits its scope of behavior under certain basic conditions, for the young Mulang who are introduced to the practice of singing, it opens up a scope of action that provides numerous alternatives. It is thus by no means impossible that those playing music would have intentional states and that their activities in the framework of games are activities, even without linguistic rules. In short, one can assume that the Mulang first intentionally sing and, in doing so, draw on a set of possible actions that are generated with the help of a nonlinguistic practice of drawing distinctions; to do this, however, it is not necessary to assume, second, that the mental states that the Mulang articulate with the help of singing are linguistically structured. 
In contrast, precisely with the concept of nonlinguistic mental states, one could accommodate the intuition that artistic intentions cannot be translated into linguistic intentions.

The ethnologist admits that the musicologist’s thesis offers an interesting theory about the phoneme practice of the Mulang, but he is critical of the fact that the musicologist effectively provides no theory about the reason that the Mulang make music, and especially of the fact that she cannot explain why individual Mulang in certain situations sing one rather than another sequence. What the musicologist lacks is an explanation of the behavior of the Mulang that makes use of reasons or even of thoughts in an essential sense. In order to deal with this objection—which has a certain weight since it questions in a certain sense whether the analysis of the musicologist is in fact concerned with a form of understanding—the musicologist sketches out two strategies:

1. Within the framework of the first strategy, she accepts that the ascription of reasons always presupposes a language, but she questions that ascribing reasons or having reasons is constitutive for action. It is sufficient for actions that an activity is chosen from among possible forms of behavior.¹³⁰ If one accepts this strategy, then it is necessary, in conformity with the musicologist, simply to insist that a behavior that is carried out against the background of similarly alternative choices has another status than a behavior for which there are no such alternatives for the actor.

The first objection of the ethnologist to this strategy is obvious. With smug undertones, he asks: “For which form of animal behavior, for example, is there no alternative for those animals? Can’t we always describe the behavior of reasonably complex animals as the realization of behavioral options that these animals have? Doesn’t a bee have the option of flying onto one or another flower? And doesn’t a bird have the option to sing this or that melody?”

The musicologist, who has been shaking her head while listening, does not even think about dropping these ideas. First of all, it is questionable what it is supposed to mean that a bee chooses among different flowers, for the flight of the bee over a meadow can be reconstructed completely in a causal—if necessary, however, in a functionalist—vocabulary. There is no problem imagining that a causal mechanism is implemented in a bee that leads it to fly onto all flowers that would be classified in a certain way, according to its perception, as long as its body exists in certain states. In any case, to our knowledge, the order in which bees fly onto flowers does not demonstrate a certain pattern; nor, in any case, she believes—should such a pattern exist—can intelligible variations of such patterns be made out.

At the same time, she specifies her criteria by pointing out that the concept of intelligible variation does not presuppose a mere change in activities, but a variation that is based on discrete, identifiable elements, thus a type of variation that assumes the type-token differentiation.

The musicologist characterizes the case of the bird as admittedly interesting and challenging, for the singing of the bird satisfies a number of criteria that her theory of actions, as a chosen activity, is based on. So it can be said that birds in a certain sense learn to sing, that birds vary their singing, that a refined varying practice is “honored” in social (i.e., sexual) relations, and that the influence of external causal factors is hardly suited to explain the “creative” aspect of singing. On the other hand, however, it is questionable whether the singing of the bird cannot be completely explained with a functionalist vocabulary. For the new work on birdsongs shows that the singing serves both to court and to acoustically mark a territory for a specific singer. If one would like to ascribe mental states to birds on the basis of this analysis, one is hardly required to assume more than the disposition to signal something like, “This territory is taken by me, singer X,” and “I am an attractive male.” Beyond that, however, for the case of bird singing, one simply cannot make out a social practice that can be sensibly interpreted as a teaching practice.¹³¹

Of course, it is difficult to prove this difference in individual cases; it is, however, possible to draw on further criteria. For example, the following addendum: behavioral alternatives can be predicted in line with certain parameters; they result from an arbitrary pattern of alternatives that exhibits an intelligible order and whose differentiating attributes cannot be reduced physicalistically. Above all, this last point, the criterion of immunity to physicalist reduction, is important and must be better worked out in a developed version of the theory. However, it would be possible to maintain the following.

The attributes of the (alleged) musical actions of the Mulang that the musicologist has used for support in her analysis of the behavior of the Mulang are not physicalist attributes, but “medial” attributes, so she claims. Here, this means the following: her description of the music of the Mulang does not refer to physicalist parameters such as frequency, amplitude, duration, or the frequency spectrum, but to parameters such as the key, volume, dynamic, tempo, rhythm, and agogics, as well as tone and harmony. One could indeed imagine an agreement between the vocabularies such that every musical parameter X, with the value a, can be assigned to a physical parameter Y, with the value b, but these relations are not reversible. As an example, the colleague calls to mind the history of standard pitch a, which has remained constant as a musical entity since it was introduced into European music but, viewed physically, moved from 415.5 hertz in Bach’s time to 457 hertz in 1880, after all a difference, musically speaking, of more than a half tone. Accordingly, described physically, the action of playing a C note at Bach’s time was a different choice than it was at Mahler’s time, while musically they have remained the same. The theory of action that views it as chosen activity is thus by no means unrefined; for in any case, the following criteria must be fulfilled so that an activity (X₁) can be described as an action.

1. X₁ is an activity that can be carried out at least by some members of a group of beings.

2. There are alternatives (X_n (n ≠ 1)) to X₁ for which it holds that

a. X_n share a series of attributes with X₁;

b. X_n distinguish among themselves with respect to at least one attribute; and

c. some of the attributes with the help of which X_n can be distinguished are socially conferred attributes (so that there is a history of these attributes).

3. The alternatives can be allocated to various forms of social reactions insofar as they often lead the observers of the action, which implements one of the alternatives, to different connecting actions.
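The standard-pitch example that precedes these criteria can be checked with a short calculation. The following is a minimal sketch, assuming the usual equal-tempered convention (not stated in the text) that one half tone equals 100 cents and that an interval in cents is 1200 times the base-2 logarithm of the frequency ratio. The two frequencies are the ones quoted above; the function and constant names are illustrative.

```python
import math


def interval_in_cents(f_low_hz: float, f_high_hz: float) -> float:
    """Size of the interval between two frequencies, in cents.

    Assumption: equal-tempered convention, 100 cents per half tone.
    """
    return 1200.0 * math.log2(f_high_hz / f_low_hz)


# Frequencies quoted in the text for standard pitch a.
A_BACH_HZ = 415.5
A_1880_HZ = 457.0

cents = interval_in_cents(A_BACH_HZ, A_1880_HZ)
half_tones = cents / 100.0

# The musical entity "a" is the same in both periods, while its
# physical value has drifted by more than one half tone.
print(f"drift: {cents:.1f} cents ({half_tones:.2f} half tones)")
```

Under these assumptions the drift comes out at roughly 165 cents, which agrees with the claim of "more than a half tone" and illustrates why the mapping from musical to physical parameters cannot simply be reversed: one musical value corresponds to different physical values at different times.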

2. The second strategy of the musicologist accepts the argument that reasons are essential for actions, but attempts to defend an alternative analysis of reasons, which does without the concept of truth. From her research journal, it can be seen that, in the conflict with the colleague, she pursued this strategy with the following arguments. First she claimed, certainly to the surprise of her colleague, that the procedure of ascribing reasons for the behavior (which by virtue of having reason ascribed to it is action) of persons who express themselves linguistically would not differ in principle from the procedure that she would follow to explain the singing, which is based on reasons: so, as the linguist counts the expression situation as one of the causal effects on the speaker, it would certainly still be possible for her to take internal factors (for example, inner perceptions) to be causes for the behavior of musical articulation. The disadvantage of such a construction is indeed obvious insofar as it is necessary to forgo the (alleged) public character of external, causal-effecting occurrences, understood as causes,¹³² but that by no means implies a principal lack of intersubjectivity.

The singers could initially assume that softer relationships exist between the choice (or structure) of the singing and external situations, that are also externally perceptible or sensible by others, which could be comprehended by potential interpreters. These relations are softer than causal relationships above all because the perceptions cannot simply be interpreted as effects brought about by expression situations; rather, one must assume the singers’ experience of the situation, which is itself evaluative. There is, however, no problem imagining that expression situations, which, for example, are influenced by the weather, trigger sufficiently similar perceptions in the singers and the listeners, so that the song can exist in an articulatory (and in a certain sense interpretive) relation to these perceptions. If, however, the singers can assume that the relation by which their singing is related to their inner states (perceptions, feelings, etc.) could be understood by others insofar as others could interpret the perceptions that they link with the singing to be precepts for identification or an expression of an inner state in which they could find themselves, then what reason is there not to say that the singers had reasons for their action? These are indeed not reasons that we can only individuate if we attribute truth-apt beliefs to the singers, but they are nevertheless reasons.

The response of the linguist to the first strategy is not known. Above all, he was demonstrably irritated by the second strategy. He objected that if we ascribe a singer P a rationalization for her singing action S, this in essence assumes the following form (which complies with [H1]):

1. P thinks S expresses feeling F.

2. P would like to express F.

3. P does S.

And it is clear that P holds the sentence “S expresses F” to be true. To what degree then is the discussion here of reasons in a sense that is not dependent on the concept of truth? The musicologist responds with a counterquestion: What does it mean to speak of “truth” in connection with a subjective belief that is not amenable to examination? However, the following is more important than an answer to that question.

If the Mulang hear a song, then they have an experience that, among other things, leads to the emergence of feelings. We expect that the Mulang have a disposition to react to various events with feelings, and it is, first of all, this disposition that is responsible for the emergence of feelings. Then we could assume, under the assumption that there is a practice that introduces variances into the singing, that the Mulang could make discoveries of the sort that certain songs bring about certain feelings. They perceive the song and they perceive effects of the song. (There could also be other nonlinguistic reactions to these feelings, for example, gestures or grimacing, so that social repercussions of a song could be correlated to activities to which the Mulang emotionally react.) What reason is there not to assume that the Mulang vary their singing with a view to the feelings it produces and to maintain that the song comes to symbolize the feeling that it generates? And why shouldn’t we be allowed to assume that, with a differentiation of the song, there is a differentiation of the feelings, and thus of the mental states of the Mulang? In contrast, there is no reason not to maintain that the musical forms of expression make available the only suitable means for identifying the feelings that are expressed or generated with their help. Herewith, however, we would come to the heart of a thesis that is stronger than the thesis that the activities that can be depicted as resulting from choices are actions. We would arrive at the thesis that, besides language, there are other means for individuating mental states or thoughts, and these nonlinguistic means for individuation make it possible for beings to refer to these states.

But what have we gained with a view to the problem of the structuring of the expressions that are to be interpreted by the interpreters? More precisely, on the basis of the reflections, can it be plausibly shown that it is possible to structure the expressions in a way that is not dependent on the interpreter taking the “syntax” of the expressions from the perspective of the truth functionality of the hypothetical components of the expression? Can it thus be shown that the structuring of a medial expression in fact is independent of a concept of compositional identity, which in the late Fregean manner is explained employing the concepts of truth functionality?

The look at the problems of field research should have made it clear that it is possible to structure a medial expression using media-theoretic means, which is independent of the concept of truth. Admittedly, to do this, the design of the situation of radical interpretation must be broadened. In contrast to Davidson’s construction of the situation in which it is initially sufficient that the interpreter correlates utterances of an individual speaker with utterance conditions, the situation in a media-theoretic perspective must be designed so that the interpreters can simultaneously observe expressions of a being that articulates itself and the effects of these expressions in a social context. The media-theoretic interpreter is dependent on the observation and analysis of the social practices of producing and receiving medial expressions. If this presupposition is fulfilled—a presupposition that remains compatible with the idea of radical interpretation—then the empirical data that must be analyzed in order to make it possible to identify elements of the medial expressions is available to the interpreter; such an identification is possible insofar as the interpreter can simultaneously observe that the reactions of the respective recipients covary with the structure of the medial expression and not with accidental attributes that appear when an expression is made.
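The identification criterion stated at the end of this paragraph can be pictured with a toy check. The following is only a sketch under strong simplifying assumptions: observations are reduced to small records, "covaries" is taken to mean that each value of an attribute is always followed by the same reaction while different values are followed by different reactions, and all field names and data are hypothetical illustrations rather than anything reported of the Mulang.

```python
from collections import defaultdict


def covaries(observations, attribute):
    """Return True if the recipients' reaction covaries with `attribute`.

    Simplifying assumption: covariation means (1) each attribute value is
    always followed by one and the same reaction, and (2) at least two
    different reactions occur across the attribute values.
    """
    reactions_by_value = defaultdict(set)
    for obs in observations:
        reactions_by_value[obs[attribute]].add(obs["reaction"])
    each_value_fixed = all(len(r) == 1 for r in reactions_by_value.values())
    distinct_reactions = set().union(*reactions_by_value.values())
    return each_value_fixed and len(distinct_reactions) > 1


# Hypothetical field notes: the structure of a sung sequence, an
# accidental attribute of the performance, and the listeners' reaction.
observations = [
    {"structure": "ABA", "time_of_day": "morning", "reaction": "join_in"},
    {"structure": "ABA", "time_of_day": "evening", "reaction": "join_in"},
    {"structure": "ABC", "time_of_day": "morning", "reaction": "fall_silent"},
    {"structure": "ABC", "time_of_day": "evening", "reaction": "fall_silent"},
]

structure_matters = covaries(observations, "structure")
time_matters = covaries(observations, "time_of_day")
print(structure_matters, time_matters)
```

In this invented data set the reactions track the structure of the expression and not the time of day, which is the kind of evidence the media-theoretic interpreter would need before treating the structural differences as medial elements.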

3.2.3.1. Summary

If we put ourselves in the position of an interpreter of odd beings B, who are physically rather like us, but do not engage in activities that are sufficiently similar to our speaking, we must accept an especially prejudice-free type of observation in order to determine whether these beings have something that we should call media. Under these conditions, when would it be justified to say that these beings have media at their disposal? Now, this interpreter must (of course with the help of his language)

1. identify types of performances

a. that can be performed by more than one individual;

b. that, on the basis of bodily and the ascribed mental capabilities, could be performed by most B_i;

c. that will lead to regularly observable changes in behavior for those B_i who witness a performance P_x; here some of these behavioral changes are implementations of P_x-similar expressions; and

2. classify some of these types of performances into sets (because of 1[b]) so that the performances can be described within each such set as performances that share certain perceivable attributes and that differ with respect to other perceivable attributes. (These types of performances are sufficiently similar to one another and are distinguished on the basis of changes in the attributes that are responsible for their similarity.)

At an elementary level, from the perspective of the interpreter, media are thus initially sets of transindividual types of activity, which the interpreter views as belonging together in varying ways. In order to have an example, let us suppose some B_i perform quite a number of bodily movements, without any of them making a sound. The interpreter proceeds as follows:

1. On the one hand, the interpreter classifies the movements that she identifies as “walking,” “breaking something off,” etc.; on the other hand, she classifies types of movements that are displayed by some B_i and that lead, with a reasonable regularity, to changes in the behavior of those who perceive these latter movements.

2. The interpreter sorts out those behavior-changing types of movement

a. that are only visually perceptible, only pertain to the hands, and can be described as consisting in a variation of the position of the fingers (crossing the index and the middle finger, spreading out all the fingers, etc.), and those

b. that are only visually perceptible and that lead to an optical change in the cliffs (color one place on the cliff with red color, another with white color, one covered with moist lime, etc.).

In the case of the alleged finger-positioning medium, technically this means something like the following: the angle of the fingers and their distance from one another form the elementary parameters of the medium, and every type of finger positioning is a medial element or a medial constellation. With a view to the last distinction, the interpreter can only be sure if she has good reasons to think that individual finger-positioning performances do not lead to behavioral changes in the recipients, but that sequences of such performances do. If the sequences bring about behavioral changes, then the types of position constitute medial elements, and the sequences constitute medial constellations. If the performance of types of finger positioning is already sufficient for behavioral changes, then the medium is so simply structured that media elements and media constellations are coextensive. A medium whose elements at the same time also depict all the medial configurations that are possible for it has a scope of possibilities that is described completely by the list of all the performance types that are possible by applying its elements. The scope of possibilities of a medium that, for example, allows more complex medial configurations to form by sequencing finger positioning has, accordingly, a theoretical scope of possibilities whose power corresponds to the number of all the medial configurations that can be created by the sequencing types of finger positioning (a number of possibilities that of course only differs from infinite if additional restrictions on the length of the sequences come into play).
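The two cases distinguished here can be made concrete by counting. The following is a minimal sketch, assuming a hypothetical medium with three types of finger positioning (the first two are mentioned above, the third is invented for illustration) and reading the "power" of the scope of possibilities as its cardinality. With k elements and sequences of length 1 up to a maximum length L, the number of configurations is k + k² + … + k^L.

```python
from itertools import product


def count_configurations(num_elements: int, max_length: int) -> int:
    """Number of sequences of length 1..max_length over num_elements elements."""
    return sum(num_elements ** n for n in range(1, max_length + 1))


def enumerate_configurations(elements, max_length):
    """Yield every sequence of elements with length 1..max_length."""
    for n in range(1, max_length + 1):
        yield from product(elements, repeat=n)


# Hypothetical inventory of finger-positioning types.
POSITIONS = ("crossed_index_and_middle", "all_fingers_spread", "closed_fist")

# Case 1: single performances already suffice, so elements and
# constellations are coextensive and the scope is just the list of elements.
simple_scope = count_configurations(len(POSITIONS), max_length=1)

# Case 2: only sequences count as constellations; here limited to length 3.
sequenced_scope = count_configurations(len(POSITIONS), max_length=3)

print(simple_scope, sequenced_scope)
```

Without a bound on the sequence length the sum has no last term, which corresponds to the remark that the number of possibilities only differs from infinite if additional restrictions on the length of the sequences come into play.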

It is clear that our interpreter brings her entire language to bear in theoretically organizing her observations of B. But the interpreter is more than a theoretically distanced observer of B’s practices: she herself classifies the types of performances relative to her reactions, and in producing this classification (for example, of certain finger positionings), objectively measurable attributes of the act of finger positioning are not relevant; what is relevant are rather those attributes that the interpreter, on the basis of contingent mechanisms (for example, of pattern recognition), classifies as belonging together. The interpreter classifies on the basis of observer-relative attributes that the events or the states have for her, and she can assume that the reactions of B occur on the basis of the same attributes. The example of the cliffs should illustrate that the interpreter also can find support in the consequences of activities, which, in the form of relatively stable states of affairs in the world can be detached from the performances that bring them about. Admittedly, these states of affairs are only relevant as long as they are interpreted as consequences of the activity of B.

3.2.4. Interpretation Without Linguistically Apt Interpreters

In the last step of the introduction to a media theory from a broadened interpretationist perspective, I will attempt to show that, employing media-theoretic means, an assumption can be put to rest that is a premise of all the preceding reflections aimed at explaining the possibility of thoughts, the assumption, namely, that the transition from A-intentional states to thoughts is carried out with the help of external interpreters who already have thought available to them. In the following I would like to clarify, on the basis of a myth about the origins of media, that we can do without this assumption, that the interpretationist perspective can be reconciled to a genealogical perspective, and that consequently its phylogenetic problems can be overcome. Here it is important to avoid presuppositions of the established interpretationism, which, with a view to the question of how the primacy of the interpretation for the emergence of language, meaning, and mind can be reconciled with assumptions of evolutionary theory, is pacified with the answer that all speaking beings have parents who already possess an interpretation language. On the one hand, this information, however, opens a question of interest to evolutionary history—namely, whether we should assume that people who indeed must have undergone a transition from a nonspeaking species to a speaking humanity had parents, who, in opposition to interpretationist assumptions, on the basis of their genetic endowments, were interpreters of their children. On the other hand, one cannot see offhand how this transition can be construed to have been gradual, since the ability to ascribe rationality that is attributed to the interpreters has the character of a constitutive all-or-nothing criterion, which eludes a gradualist reconstruction. A myth of the origin of media must consequently be able to tell a story of the development of competencies for interpretation. 
Here it is decisive that these competencies can be understood to be gradually developed, but at the same time that the qualitative niveau characteristic for beings with thought can be designated.

Against the background of the previously developed four-phase model,¹³³ the task that we are confronted with regarding the construction of this myth can also be described as follows: we must tell the story of the social development of medial and interpretive competencies in which, at least at the beginning, all of the interpretive abilities of the adults that we can assume within the four-phase model initially have to be replaced by arrangements of dispositions in the beings that interact with one another. For unlike the interactions in the four-phase model, in the list of characters of the myth, no interpreters can be listed who already have a fully developed mind at their disposal. Of course—and the expression “myth” should emphasize this—this is not an empirically adequate development story; the goal of the myth would be reached if it were shown how such a story could be told in the interpretationist framework broadened by media theory, without the inadmissible assumptions.

At the center of the myth there is a group of primates that perhaps includes about eighteen individuals of various ages and of various physical statures. Outside observers are left with the impression that a hierarchy has been established in accord with these differences, which, for example, is seen in the fact that the weaker animals draw aside from the stronger ones if they cross paths. An idea of the myth is to assume that precisely this asymmetry in power between individuals is a basic aspect of those dispositional reconstructable social constellations that allow the individual interpretive competences of mental beings to be substituted in explanations.

3.2.4.1. A First Version

Let us now go to the edge of a clearing: one of the strong individuals (Alpha) suffers now and then from an itch on a place on his back that he cannot reach. After a failed attempt to ease the itch by scratching his own back, Alpha blurts out a noise that is obviously perceptible by the others; some of them look to the source of the sound. Alpha responds to the fact that he cannot manage to ease the itch with a nervous aggressive behavior, expressing frustration, and with threatening gestures and minor encroachments; in doing so, he agitates the weaker animals. Delta, one of the weaker animals, is obviously bothered by these bouts and devises a behavior that makes it possible for Alpha to stop the tedious interruption to the napping: he scratches Alpha. In reaction to this, Alpha calms down and the agitation of the group abates. However, it is not long before Alpha is once again afflicted, and once again a sufficiently similar noise is blurted out. Then Delta does what earlier had already restored the peace and quiet. If this interaction pattern becomes somewhat established, for Delta, Alpha’s noise acquires the status of a sign, an indication. Its meaning_i is parallel to the meaning_i of seeing stinging nettle, which the animals avoid touching after sufficiently frequent experiences that touching it leads to an unpleasant skin irritation. And Delta’s scratching intervention is parallel to the avoiding of stinging nettle.

If we look more precisely at our burdening assumption, it is clear that so far, in any case, we have not assumed that Alpha was following an intention when he produced the noise or that bothering the group was related to intentions. Alpha merely has the disposition to react to a disturbance to his well-being that he could not redress with a noise and with aggressive behavior. And Delta has the disposition to avoid experiences that he assesses as negative; so Delta reacted to the noise as if it were a sign for the occurrence of an ill feeling. As far as I can see, there is no reason not to say that Delta interprets_i Alpha’s noise by behaving in a certain way; here this interpretation is nothing other than a “perceiving as,” which leads to a form of behavior.

For the further development we must now assume that Delta’s acquired disposition to react to Alpha’s noise is not a form of behavior that admits no exceptions. We must thus assume that Delta can also hear the sound without the soothing behavior ensuing as an effect. In other words, it must be possible to understand the acquired disposition as making it probable that the corresponding behavior is performed, but not as making this inevitable. Just as animals that are fleeing accept the contact with stinging nettle at a cost, it must be possible for Delta to accept the aggression at a cost, if other dominant impulses determine Delta’s behavior. If this is possible on the basis of Delta’s biological endowment, then the following scenario is possible. If Delta does not react to the noise in the tested way, this then has consequences for the other members of the group; for Alpha’s aggression is directed at lower-ranking animals with a relative lack of specificity. If the animals of this group can observe a sufficiently stable connection between Alpha’s noise and Delta’s pacifying action—and we can assume that the animals, as primates, intensively observe one another—then it is not temerarious to assume, for example, that, for the sake of his own quiet, Gamma, an animal that occupies a similar position in the hierarchy to Delta, reacts to Alpha’s noise, should Delta at some point not intervene for some time, with a behavior that is equivalent to Delta’s, thus appeasing Alpha. What this step yields is that an interpretive correlation between the noise and the appeasing behavior becomes detached from Delta’s instantiations of this relation, and it can now be instantiated by other animals. However, even if this correlation is, as it were, integrated into the cultural stock of the group in this way, in any case, we are here dealing with interpretations via action, and indeed via action that is directly preference relevant.¹³⁴

In order to move beyond the outlined pattern, a further step is now necessary: for one, the (aggressive) behavior that functioned as a negative reinforcement for the performance of the preference-relative behavior must also be performed by other individuals than merely Alpha so that not only the interpretation of the behavior but also the reinforcement behavior are able to be disconnected from the original protagonists. For another, interactions must be established in which the interpretive behavior is not directly preference relevant, but in which it, for its part, has a demonstrative character. The first step could be achieved simply if the stronger members within the group of the lower-ranking animals, through preference-relevant behavior, urge the relatively weaker animals to react to the noise in the established way. In this way, they would ensure the extension of a reliable structure of reinforcement, which ought to lead to the establishment of a correlation between the sound and the designated form of behavior in the group. However, although this step looks easy, it appears to entail too many assumptions. For the animals that urge the weaker ones to engage in the appeasing behavior are interpreters in an overly demanding sense; we would have to explain their behavior by assuming that they understand the observable connection between the noise and the appeasing actions of other animals as a “semantic” relation, which they use to achieve their preferences. This, however, would be irreconcilable with the conception of intrinsic-intentional states, for intrinsic-intentional states systematically preclude the reference to other intentional states, because they, as a result of their individuation through conditions of satisfaction, would be identical with these states, or a duplication of mind would be implied.¹³⁵

On the other hand, however, the step is too small insofar as only a deficient interpretation could be reconstructed in which the meaning_i of an expression could not be provided by another expression, but in the form of a preference-relevant connecting behavior. However, this is irreconcilable with the interpretationist framework, because an interpretationist theory of content is based on a correlation between expressions whereby the interpreting expression is itself interpreted, thus is interpretable.

If I see things correctly, then the relations in this version of the myth can be analyzed on the basis of the ascription of intrinsic-intentional states. We at no place assume that the interacting primates ascribe intentions to one another so that a second-order mental relation would be brought in play. In order to proceed beyond this level, it would have to be possible to take recourse in a model that we were able to draw on with no problem in the four-phase reconstruction: what we would need is a model for the internalization of an interpreter or for the internalization of an interpretation. However, in solving the phylogenetic problem of the development of mind, we do not have interpreters—in any case those with the common semantic and folk-psychological competencies—at our disposal. If, in the face of this problem, the myth is to have any chance whatsoever of fulfilling its explicative function, then we must introduce interpreters whose interpretive competencies are adapted to these assumptions. These interpreters would not have to be steered by beliefs in their interpretations, but could be moved by dispositions. And it would have to be shown to be plausible that these dispositions themselves are malleable so that the interpretive_i behavior that they cause can be modified and differentiated in reactions to this behavior. Unlike instrumental communication, which I have attempted to develop in the first version of the myth, the second version might rather play out in a framework that is related to the phase of affective communication.

3.2.4.2. The Phylogenesis of the Mind, the Second Take

From the perspective of affective communication, we now observe animals that have dispositions to react to certain social events with expressive_i forms of behavior. This is not uncommon among higher primates: if playing pups surpass what is tolerated by the adults, they are responded to with threatening gestures; if they are fondled by other animals, they put on a “playful face”; if they are separated from important reference animals, they exhibit signs of sadness_i. Let us then presume that all animals in the group are disposed to express their emotions_i¹³⁶ with facial expressions, bodily posture, noises, etc. Nothing prevents us from describing these expressive acts and the attentive observation in a functionalist vocabulary that places these forms of behavior in a close relationship with the survival of the group. With a view to the construction of an interpretive relation, whose internalization would simultaneously be the birth hour of mind, expressive acts assume an interesting position insofar as we can expect, on the one hand, that all members of a species of primates have a genetic disposition to react to types of expressive acts (by performing expressive acts) so that stable reactions are ensured. On the other hand, however, we can also assume that such reactions can be learned, so that types of reactions can have a history, which is connected to traditions_i within the group and, for example, do not occur in other groups of the same species. Besides a set of inborn forms of behavior, we also find in the group forms of behavior that are passed on through associative learning.

If we next assume that animals are endowed with brains that allow them to carry out analyses of conditions in Watson’s sense¹³⁷—and this is not an assumption about the existence of intentional states—then, on the basis of the observation of interactions that are appropriate for such analyses, we can justifiably assume that the animals learn to identify types of bodily states proprioceptively that sufficiently frequently lead to certain expressive social responses, regardless of whether the relative stability of the reaction has genetic or historical causes. Let us further assume that playing_i is a widespread practice, especially of the pups in the group, and indeed a practice that—in contrast to the practices of nonhuman primates that we know about—entails varying the performance of expressive behavior. Then we can imagine that, in the framework of such playing, expressive types of behavior become established because, on the basis of the shared genetic dispositions, we can expect that some of the expressions that are developed when playing might activate the attentiveness patterns of their recipients and lead to an expressive connecting behavior, which is tied to the previous performance of certain expressions. Within the framework of such playing we thus should be able to see interactions in which individuals react to gestures with other gestures and to noises with other noises, and not with immediate preference-relevant activities. If we assume a playful practice of this sort, then, with the help of conditionality analysis and the processes of forming expectations that can be explained with the concepts of associative learning, we must be able to show the plausibility of the further steps. Let us assume that, in the framework of playing, Delta performs a gesture G, which captures Gamma’s attention in a special way, and Gamma reacts with sufficient frequency to the performance of G, for example, by imitating the gesture.
Then Delta’s conditionality analysis leads him to identify the necessary and sufficient conditions for the form of behavior that Delta must carry out in order to get Gamma to imitate the behavior. If this interrelation becomes stabilized, then Delta will come to expect that Gamma will react to the performance of G with the performance of G; one can imagine that when that reaction does not occur, Delta’s reaction will aim at generating conditions that satisfy the expectation. Here, if expressive reactions occur often enough, Delta will identify the proprioceptive perception of the state that leads to the performance of a behavior with a state that normally brings about a certain reaction.

Again, in contrast to the implementation of this construct in the four-phase model, here we cannot assume that recipients are endowed with folk-psychological competencies. Thus a theoretical framework is needed in which a complex balance is possible between a sufficient constancy in the reactions and a sufficient performative “freedom.” In the model here, the causal effect of a protomedial expression on the recipient of the expression ensures the constancy of reactions; this can be explained, for example, by species-specific attentiveness patterns, specificities of perception, and dispositions to imitative learning, as well as a positive evaluation of the imitating performances.

Once this level of interaction is reached, the behavior is interpreted_i, not by preference-relative activities, but by intrinsically empty forms of behavior that become connected with the activities by being embedded in a network of expectations. At this fundamental level, behavior that reacts to the failure of what is expected to occur thus need not, and indeed may not, be understood as a sanction. It has no normative significance. In contrast to Brandom’s view, here sanctions are thus not placed in a fundamental theoretical position, establishing correctness or rightness by positively or negatively reinforcing forms of behavior.¹³⁸

The possibility of further developing these rudimentary media is now primarily dependent on two factors: first, the possibilities of a species to differentiate the scope of possibilities of media and to use them articulatively; second, the possibilities of a species to place types of performances under conditions of appropriateness, which in the case of nonlinguistic performances are experiential contexts and in the case of linguistic performances are prominently truth conditions or conditions of satisfaction.

If I am correct, in the construction of a myth that attempts to make the social origin of mind intelligible, media-theoretic means can allow an intermediate step, which—because we expect a form of communication without reasons—permits the process of establishing communications media to be decoupled from a process of applying communicative media with reasons.¹³⁹ If one accepts the view sketched out here, then the origin of mind is not based on sanctions, but on play.

3.3. FURTHER THEORETICAL FOUNDATIONS OF MEDIA THEORY

The preceding sections of this chapter should clearly indicate how the interpretationist theory of mind, of meaning, and of action can be broadened by means of media theory so that those cases of communicative action that do not play a paradigmatic role for the established interpretationism can be adequately understood. For this, I have introduced a media concept that apprehends media as means for individuating thoughts. In the following I would like to further hone this concept, for one, by pursuing the question of what it means to say that there are media: in short, in which way do media exist?

In a second step, on the basis of the earlier developments, I will suggest an ordering framework that makes it possible to subjugate the ubiquitous talk of media to criteria. To this end, I suggest a typology for media with the help of which it should be possible to examine which of the types mentioned at the beginning of chapter 2 can rightfully be characterized as media and how these media behave toward one another.

3.3.1. Comments on the Ontology of Media

As we have seen, among the striking deficits of the system-theoretic media conceptions is the remarkable evasion of the following questions. Which epistemological access do we have to media? Which forms of existence do they have? In the preceding chapter I attempted to show that this deficit is no coincidence; for both Parsons and Habermas must assume—because of the double function that they burden the media with as intersystematic entities and as action-coordinating social entities—that media have both objective effects and intrinsic attributes accorded to them independently of us, as well as observer-dependent attributes that are dependent on us for their existence and that only have effects because actors ascribe these attributes. On account of the dependency of social systems on the emergence of improbable communications, Luhmann’s expression “There are systems!”¹⁴⁰ can be extrapolated as “There are media!” But rather than squarely addressing the question of the ontology of media, Luhmann, as a result of his trust in a naturalized epistemology, assumes that such questions are purposeless.

A media theory that attempts to cast off its provisional status should, however, be able to answer the question of whether media exist like stones or shovels, like paper or banknotes, like balls or games, like noises or languages. If, in the face of these alternatives, we attempt to specify the ontological status of media, we will initially tend to choose the second disjunct of each pair; and among these second disjuncts, we will tend to maintain that the form of existence of media is more comparable to the form of existence of banknotes, games, and languages than to the form of existence of a tool. The background for this classification is the thesis that:

(M10) Media are social entities.

An ontology of media would thus have to clarify what social entities are and what form of existence they have. In order to approach a robust answer to this question, I would like, first of all, to attempt to use the theoretical apparatus that Searle developed in his third larger work, The Construction of Social Reality, from 1995. In connection with a critical presentation of Searle’s basic reflections, I will take up those parts of his program that are independent of the specific presuppositions that indeed are characteristic for Searle’s purposes but that are incompatible with the basic assumptions of my reflections. Among these is, for example, the assumption that intentionality is completely a biological phenomenon and does not first come into play in a social practice of interpretation. Beyond that, however, all of those assumptions in which language plays a fundamental and exclusive role for the origin of social entities must lead to a modification of his analysis.

Searle’s reflections, which, in the first step, aim to examine the presuppositions of our social reality, assume first of all that these presuppositions

neither are made accessible by sense experience;

nor are accessible from an internal phenomenological perspective insofar as the objects with whose help we act socially do not reveal anything specific to the phenomenological view;

nor can be apprehended from an external behaviorist perspective, because the mere description of social behavior does not bring the structures into view that make this behavior possible;

nor become transparent in reference to the assumption of cognitive science or linguistics that the behavior of social actors can be understood to be the result of unconscious rule following (particularly if it is assumed that these rules are, in principle, not accessible to consciousness), because the rule following itself is a part of the explanandum.¹⁴¹

In order to gain a more precise view of the social reality and its presuppositions, Searle attempts, first of all, to position the forms of existence of social facts within the framework of a three-level rough ontology, which he portrays in broad strokes on the basis of the atomic structure of the material and the evolutionary theory of life:¹⁴²

(1) Materially, the world consists entirely of particles that are subject to force fields and organized into systems, whose borders are drawn by causal relationships so that the relation between entities of the first level of ontology can be described in a causal vocabulary. (2) Some of these systems are living systems; these types of systems develop by mutation and natural selection. One fraction of these systems has developed subsystems that we call nervous systems, which, for their part, are able to generate consciousness; here consciousness is understood as a biological and to that extent as a physical attribute of higher-developed nervous systems. (3) Along with consciousness, intentionality is developed, i.e., the ability of mind to represent other objects and states of affairs in the world; here intentionality is an attribute of the mental representations through which they refer to something or are oriented toward something.

It is obvious that this rough ontology is quite problematic and is incompatible with the interpretationist assumptions of my view insofar as it views consciousness and intentionality as biological phenomena. Nevertheless, the problems initially do not pose a serious obstacle to the usefulness of the Searlean apparatus insofar as they are not dependent on a certain interpretation of the rough ontology and the steps of the rough ontology can also be reconstructed to be reconcilable with the assumptions of interpretationism.¹⁴³ For Searle, the rough ontology initially only has the function of providing the background for an epistemological question, namely, the question of how we have access to social facts within this ontology. The distinction between intrinsic and observer-relative features of the world is foundational for the access to social facts.¹⁴⁴ Here, intrinsic features are those that exist independently of the observer, that is, all of those features of the entities of rough ontology that exist independently of mental states, including the existence of mental states. Searle distinguishes observer-relative features from intrinsic features, that is, from those features whose existence is dependent on the existence of observers that ascribe a feature to an object or process in reference to their perceptions, beliefs, and expectations; an example is the feature of being a tool. Unlike intrinsic features, observer-relative features are not ontologically objective but ontologically subjective. The status of ontological subjectivity, however, by no means precludes entities that possess such features from being objects of epistemically objective judgments. In contrast to aesthetic assessments, judgments about whether something is a tool or a word are not merely of an epistemically subjective nature; rather, despite the ontologically subjective status of the feature of being a tool or a word, they are epistemically objective.
Figure 3.5 provides the course, within the apparatus of the Searlean distinction, that we must follow in order to encounter the type of judgment that is specific to the features of social entities.

Against the background of this distinction, the following is decisive for every arbitrary observer-relative feature F_b: namely, logically precedent to F_b inhering in an object O is that it appears to us that F_b inheres in that object; for “seeming to be F_b is a necessary condition of being F_b.”¹⁴⁵ Searle connects his hope that he has found an adequate starting point for the investigation of social facts to this particular consequence—namely, the primacy of ascribing a nonintrinsic feature over the inherence of the feature. Besides the analysis of the ascription of nonintrinsic features, which Searle investigates in reference to the example of the assignment of a function (1 below), the concepts of collective intentionality (2 below) and the generation of institutional facts by constitutive rules (3 below) are the theoretical instruments with the help of which Searle attempts to illuminate the structure of social facts.

Figure 3.5 Searlean types of features

1. Against the background of the apparatus in figure 3.5 for drawing distinctions, functional assignments, in which an object G is ascribed a function Y, can be qualified as follows:

a. For all functions Y, it holds that they are observer-relative features.

b. From this it follows: functional assignments do not introduce any new intrinsic facts.

c. Functional assignments operate with observer-dependent values.

d. Under the presuppositions (a)–(c), the schema “G has function Y” can be analyzed as follows:

i. G and Y are parts of a system that is in part defined in reference to purposes, ends, and values.

ii. It is expected that G performs Y (even if G occasionally, or even often, fails to do so).

This specific type of functional assignment can be action related (“This is a screwdriver”), or it can play an epistemic-theoretical role, whereby the functional assignment acquires the status of a heuristic hypothesis (“The lungs provide the body with oxygen”). Among the action-related functional assignments Searle counts those that assign a representational or presentational function to an object. So, stamps on paper could take on a function analogous to tools relative to presentational intentions.

2. If we now accept that functional assignments are a necessary criterion of social facts, then the question arises regarding how functional assignments—beyond mere individual acts of assignment—can acquire a social character so that they can become an object of an epistemically objective judgment at all. Searle entrusts the solution to the problem to a theory of collective intentionality (we-intentionality). His simple claim is: collective intentionality is a primitive biological phenomenon that cannot be reduced to forms of individual intentionality.¹⁴⁶ To provide an example of a behavior that might be explained with the help of the phenomenon of collective intentionality, Searle introduces pack-hunting animals, whose behavior is supposed to prove that beings without language can pursue common, i.e., coordinated, goals, and sometimes even with a division of labor. Here, Searle places a lot of value on the view that collective intentionality, as is also shown, for example, in a soccer match (“We are playing soccer”), cannot be reconstructed from the first-person intentionality of the individual players (“I play soccer, Y plays soccer, and . . .”). In an adequate reconstruction of the relationships between we- and I-intentionality, it should rather be shown that “wanting X collectively,” in the sense that “we want X,” is the presupposition required so that individuals can pursue intentions that refer to X, under the assumption of this common desire. Here Searle emphasizes that the assumption of we-intentionality is compatible with methodological individualism because we-intentionality is not dependent on its implementation in higher-order subjects and is thus not damaging for his rough ontology. With a view to the explanation of social facts, it follows then that:

(WI) We-intentionality is a necessary condition for the emergence of social facts. Individuals who contribute to fulfilling these conditions share an intentional state with others of the form “We intend X.”

Regardless of how one judges the analysis of we-intentionality in detail, it is in any case clear that a mechanism is needed that ensures functional assignments the status of epistemological objectivity. However, that appears to me initially only to assume that, for example, all of those participating in a game can draw on criteria that can be intersubjectively examined in order to be able to move in a field of shared functional assignments.¹⁴⁷

Although the microanalysis of collective intentionality is not of enormous significance for my reference to Searle’s theory of social entities, I do not want to fail to mention that Searle’s analysis of we-intentionality is problematic in two respects.

From a quasi-epistemic perspective, it is first questionable whether the view that “We intend to do X” can be analyzed differently than “I intend to do X, and Franz intends to do X, and Paul . . . ,” because the ascription of an intention can only take recourse in individuals as objects of predication. For if we analyze the deictic expression “we,” we do not point to a higher-level entity with the expression, but to a bunch of individuals who are the bearers of intentionality and who integrate the bunch into a group; the subject pointed to, as the subject of the intention, is not a collective, but an individual. It is instructive to test this analysis on cases in which someone is deceived about what the object of the intention of the participants is. If, for example, it is found out that the opposing soccer team has been bribed (as occurred with Schalke 04 in the 1970s), and a basic condition for the soccer game—namely, the desire to win—is not fulfilled, then the proposition that is supposed to express the intention of the participants as “We intend to play soccer” is not applicable for half of the players, and it is questionable how the intention of the honest players should have been indicated under these conditions after it was found out that this was a case of deception. In this situation—if one wants to retain an intentional description—one must indicate the intention by saying, “I intended to play soccer, but the others did not.” From this, it follows that “not: we intended to play soccer.”

As a result of the reservation just expressed, it is, second, admittedly difficult to see how it can be precluded that the mere synchronized pursuit of I-intentionality is counted among the cases in which the coordination of those actions that are commensurate with the intentions is among the things intended. However, this particular distinction does not appear to me to be a problem if we shift the commonality of the intending from the collective subject to the content of the intention: “I intend, together with P_x, to do F, and the other persons P_x intend to do F with me.” (Often, however, it is not even necessary to explicitly indicate the term “together with,” because it is implicit in formulations like “I want to play tennis,” “I want to play in a quartet,” etc.)

Searle’s affinity to an analysis of shared intentions in the form of we-intentionality can surely be explained by the fact that he views intentionality as a biological phenomenon, and he wants to treat the collective hunting of animals as an example of we-intentionality. With a view to the above-introduced distinction between simple and higher-level intentionality, however, I believe I can account for the advantage of this analysis, which primarily lies in the fact that it can be favorably linked to behavioral research; indeed, I can do so while gaining greater precision. It is plausible that pack hunters or that demonstrators who are fleeing from mounted police can evaluate situations with a view to whether they correspond to the conditions of satisfaction of their A-intentional states or not, and a collective behavior occurs that can be well described under the assumption of collective ends. But it is very questionable whether the individuals that form the groups can behave toward their A-intentional states by weighing alternatives. This kind of collective behavior, which needs to be explained, can be elucidated by noting that it is precisely causal mechanisms that condition the A-intentional states and thus the coordinated behavior. To emphasize the difference from Searle, I have characterized the analysis of the phenomenon that I prefer as shared intentionality. It is:

(SI) Shared intentionality is a necessary condition for the emergence of social facts. Individuals who contribute to fulfilling these conditions share intentional states with others in the form, “I intend to do X together with others, and I believe that the others intend to do X together with me and believe that I intend to do X together with them.”¹⁴⁸

3. With the concept of constitutive rules, Searle attempts to introduce a criterion for further internal differentiation of social facts that ought to make it possible to distinguish between mere social and institutional facts. The concept of constitutive rules, which Searle had already introduced¹⁴⁹ in the 1960s in connection with a distinction from Rawls, postulates a type of rule that, in contrast to regulative rules, does not have the function of influencing behavior that exists independently of these rules; rather, such rules function first to make possible the actions that they simultaneously regulate. Unlike regulative rules, constitutive rules do not have the form of a hypothetical imperative ([If Y, then] do X!) but the form of defining sentences:

(CR) X counts as Y (in context K)!

With the help of constitutive rules, we can, for example, determine what a legal move of the pawn is (in chess) or what a banknote is. Constitutive rules establish possible actions; certain social practices are constituted by carrying out those actions. In contrast to arbitrary conventions, sets of constitutive rules produce institutional facts.

With the three elements—namely, of action-related function, we-intentionality (or better, shared intentionality), and constitutive rules—Searle now has the building blocks from which a differentiated ontological view of social reality can be constructed in which institutional facts are a subset of the social facts that are ascribed their institutional status by constitutive rules. In detail:

(SF) A fact is a social fact if and only if its existence implies collective intentionality and allows the assignment of an action-related function.¹⁵⁰ More precisely:

a. Collective intentionality exists in a minimal sense if beings behave in such a way that their behavior can best be described under the assumption of shared goals (the case of collective intentionality [as A-intentionality] among animals).

b. Collective (shared) intentionality exists in a developed sense if there are beings that share at least one goal, one belief, or one preference X, whereby for X it holds that

i. the beings believe reciprocally of each other that they find themselves in intentional state X (besides other intentional states);

ii. the beings carry out actions that are caused by having X and that are understood by these beings as the realization of the conditions of satisfaction for X.

The definition of institutional facts now fits precisely the conception of social facts:

(IF) A social fact is an institutional fact if and only if its existence implies collective (shared) intentionality in the sense that the content of collective (shared) intentionality aims at the validity of constitutive rules or it implies the validity of constitutive rules;¹⁵¹ more precisely,

a. There are beings that create social facts as understood in (SF).

b. These beings assign attributes to objects or states Y through collective intentionality, which cannot be described in a physicalist vocabulary (status functions).

c. These beings permanently accept the assigned status function.

d. The status assignment takes the form of constitutive rules (CR).

With recourse to Searle’s analysis of social reality, the question about which ontological status media have can be more precisely answered, for we can now ask whether, beyond thesis (M10), it should also be accepted that media are institutional entities. Against the background of the Searlean analysis, it is clear that media can only be considered institutional entities if their existence is fundamentally dependent on language; Searle assumes very clearly that constitutive rules are linguistic rules. This means that if we cannot make sense of a concept of nonlinguistic constitutive (somehow implicit) rules, we would have to draw the conclusion that media are social but not institutional entities. Formulated differently, where media have institutional attributes, they are dependent on language.

An interesting question is now whether the demonstrative rules¹⁵² reconstructed in the first section of this chapter might be able to take on the status-assigning function that is carried out by constitutive rules. As far as I can tell, this would mean that the content of a rule like “Do it like this . . . !” would have to entail something like “View or treat X (in context K) as Y!” Even if following a demonstrative rule could be described as if the rule follower treated something as something in context K, the ability to follow a demonstrative rule does not imply that there is knowledge of following a constitutive rule. Because Searle, however, makes the content of a constitutive rule into an object of the intentional states with which beings assign status functions, it must be assumed that they are conscious of the content of the rules. Although demonstrative rules can be understood as a form of explicit rules, it makes no sense to understand them as constitutive rules.¹⁵³ This analysis does not preclude some media from being institutional entities. In these cases, however, it is clear that such media (among them possibly money) are dependent on language insofar as the constitutive rules that apply to them must be explicitly stated in language. I thus accept the thesis:

(M11) All media are social entities, and some media are at the same time also institutional entities.

Thesis (M11), which answers the question of the ontological status of media, allows at the same time a dynamization of the schematic discussion of linguistic and nonlinguistic media; in this a development in the history of the media can be accounted for that applies linguistic precepts to media that were originally nonlinguistic in a radical sense and that subjects these media to linguistically formulated regimentations. This historical process, which can be illustrated, for example, in reference to tempered tuning or twelve-tone music, can be interpreted ontologically as a process of transforming media into institutional entities.

3.3.2. A Media Typology

At the beginning of chapter 2 I presented a heterogeneous list of objects that are said to be media. The question that I finally want to pursue in what follows (and answering this in some respects challenges my reflections) is whether, with the help of the above-developed criteria, media can be delimited, and thus whether it is possible to decide if something is a medium, and whether media that are obtained in accordance with these criteria can be sensibly classified. In the course of the attempt to develop a typology of media with the help of the criteria that have been worked out, it will be necessary to explain how the consequences of the theory that are developed here are related to common manners of speaking. Are the manners of speaking in general simply too undifferentiated and in need of reform, or can at least some of these everyday manners of speaking be reconstructed within the parameters of the terminology developed here?

However, before we take up the discussion of this question, it is necessary to call to mind that because of false substantiation, a typology of media runs the danger of taking a false course. Dewey already pointed out that it is impossible to indicate “where one [medium] begins and the other ends.”¹⁵⁴ And he clearly indicated that this is not the result of a defect in the set of criteria that is available to us, but that it is systematically based on the fact that it is completely dependent on how something has been used. With a consciousness of the primacy of the pragmatic perspective, proposing a media typology thus necessarily entails the relativizing of the systematic demands of this typology; it means that it is based in a certain, historically contingent social process in which certain practices of application are established and others are not. A media typology is thus a typology that is specified for a certain historical situation. It reflects what we do with certain types of actions, but not intrinsic attributes of things.

And a further remark about the status of the typology is necessary: If we are interested in classifying some range of phenomena, then we can in principle attach any criteria to the range of phenomena that are selective with a view to the phenomena. However, if the classification is supposed to do more than echo the criteria used, then it is sensible to work with criteria that articulate specific attributes of phenomena in some range of phenomena. A media typology could secure such criteria by drawing not only on the fundamental criteria that characterize the range of phenomena in general, but also on those that arise from the possible relations between the phenomena. To this end, one can outline prototypical operations that are possible among the medial constellations. (I will only do this here in a very provisional and schematic way.)

3.3.2.1. Elementary Media Operations

If we have two medial constellations, we then can ask what relationship they can have to one another, and we can refer to the types of activities that actualize instantiations of this relationship as medial operations. The reach of the possible relations is here stretched by two extreme cases: the case that allows the two medial constellations to be substituted for one another boundlessly and the case in which two medial constellations stand in a relationship of boundless difference to one another. Let us begin with cases in which a higher degree of reciprocal substitution between two medial constellations is possible: if two medial constellations relate to one another such that the reception of K₁ and K₂ makes possible the same medial structuring by competent recipients, then K₁ and K₂ can represent each other in the sense that the two constellations in the most demanding cases cannot be distinguished from each other by the senses. In order to generate a constellation K₂ for a given constellation K₁ that can represent K₁ in every respect, something like a perfect copy would have to be produced. So, something would have to be generated that we could, in connection with one of Danto’s thought experiments, call a materially identical double,¹⁵⁵ something that, after being produced, could only be distinguished from the original with the help of knowledge of the production history. If the two products were stored by a grumpy curator in unmarked boxes, the identity of the original would be lost with the passing away of the curator because there would be no procedure left by which they could be distinguished.

If one gives up on the condition that the difference between K₁ and K₂ cannot be determined by any means whatsoever, another form of the substitutability of K₁ and K₂ can be exposed. For example, K₂ could represent K₁ not with respect to all attributes but only with respect to relevant ones, and precisely those attributes that are fundamental for the compositional identity of K₁ might be among those. In order to guarantee this interchangeability, it is not necessary, for example, that the color of a picture K₂ has the same chemical structure as that of K₁. It is enough that the light reflects in a way that is not distinguishable to the human observer.

At the level of this limited form of substitutability, different conditions for substitution come to light, depending on how the medium in which K₁ was individuated is characterized. If we ask someone to read Davidson’s “The Conditions of Thought,” we will not be indignant if the person reads a photocopy of the article; for the photocopy, we at least expect, should in any case have all of the sensibly perceptible characteristics that are sufficient so that, when reading the copy, one reads the same text that is found in the book in which it is originally published (The Mind of Donald Davidson). Whether the text is blue or black plays no role, and even some poorly legible letters may be able to be deciphered in reference to the context. However, what we expect from such a surrogate K₂ is that a recipient of K₂ ascribes it the same compositional identity as K₁. Similarly, two performances of a musical work substitute for one another in this way if both of them enable competent listeners to identify the same score as the basis of the performances,¹⁵⁶ or, as in the case of oral stories, they enable listeners to generate the performances themselves, which will be accepted by competent listeners as articulations of the corresponding musical thought.

The relations that exist between the original and the copies or between performances can be understood as relations that can be brought about by actions that instantiate a generally operative pattern that I would like to call reproduction. The following preliminary criterion attempts to define reproductions with recourse to the concept of compositional identity and to widespread intuitions:

(R₁) Given a constellation a in a medium M_a, a reproduction of a exists if and only if a constellation a* is generated in a medium M_a* ([M_a* ≠ M_a] ∨ [M_a* = M_a]), whose compositional identity in M_a* is sensibly and functionally equivalent to that of a in M_a.

Reproductions are operations that can be distinguished at two levels. At the level of physical processes, perceptible objects must be produced that lead (under similar circumstances) to sufficiently similar perceptions. And the objects are sufficiently similar if competent recipients can identify the same medial constellation with their help. Because not all perceptible attributes of an object are necessarily among those attributes that are varied in a medial practice, the second condition provides the success criterion for reproductions. With respect to the participating media, this means that the medium in which the reproduction is carried out at least provides the differentiation possibilities that are necessary to determine the compositional identity of the constellation from which one sets out.

Setting out from this ideal type for determining reproduction, we can now set about to successively modify the substitutability relation (Vertretbarkeitsverhältnis) between medial expressions. The following, in part hopefully fictive, examples present cases in which people assume that such substitution relations are satisfied.

a. A person whom we have asked to read Davidson’s “The Conditions of Thought” read a text that was published under Davidson’s name with the title “Voraussetzungen für Gedanken.”

b. A person whom we asked to read Baudelaire’s Les Fleurs du Mal read a book with the title Flowers of Evil.

c. It turns out that the author of an essay that, according to the title, ought to contain an analysis of Kandinsky’s Bleu de Ciel (1940) composed this on the basis of a black-and-white photograph of the picture.

d. An important literary scholar publishes an article on Dante’s La Divina Commedia solely on the basis of his knowledge of Liszt’s Dante Symphonie, which was composed in 1856.

Of course, none of these cases establishes a merely arbitrary relation between the two medial constellations mentioned in each example, but these relations differ considerably with respect to whether the second-named constellation can substitute for the first. If the German translation of “The Conditions of Thought” is successful, then we do not assume that the German text resembles the English one in a way that can be experienced by the senses, but we expect that the medial constellation that constitutes the translation is true under the same conditions as the original. Or, put differently, we expect that the two texts contain the same determinations, so that the same entitlements and responsibilities accrue for someone, regardless of which of the two versions he adopts. As is well known, literary texts that do not merely make use of characteristics that are able to be articulated in terms of validity conditions, but that also make systematic use of sensible attributes pose more broad-reaching translation problems. In any case, the reader of an English translation of Les Fleurs du Mal might miss effects that the text typically has on French readers, because expressions in the context of the French-speaking community are part of a certain connotative environment that cannot be reproduced in English, or because the text possesses certain sensible aspects that are not able to be reconstructed in English (alliteration, onomatopoeia, rhythmic structures, rhyme, etc.). With a view to these characteristics, which can be of fundamental importance for the compositional identity of a work of art, it may be impossible to reproduce sensible equivalents in the medium in which the substituting medial constellation is supposed to be produced. What we expect from translations in these cases is that at least functionally equivalent forms are found.

(T₁) Given a constellation a in a medium M_a, a translation of a exists if a constellation a* is produced in a medium M_a* (M_a* ≠ M_a) whose compositional identity in M_a* is functionally equivalent with that of a in M_a.

In the case of the painting by Kandinsky, we at first glance have a similar situation, which is aggravated by the fact that the target medium is systematically poorer than the source medium insofar as black-and-white photography can reproduce brightness attributes but not color attributes. And although we can recognize some of the relevant structures of the work of art in the photograph, all of those differentiations that are connected to the color or that can only be obtained by assuming different perspectives toward the picture (the surface structure) are lost. Because the colors of the original are not able to be distinguished in black-and-white photography if they have the same brightness, the photograph cannot substitute for the original in regard to these attributes. Because these differences and the effects of the color escape the author of the essay, here we should no longer speak of a translation; the attribution of brightness values to color cannot guarantee the functional equivalence.¹⁵⁷
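The loss described here can be made concrete with a small numerical sketch. It assumes, purely for illustration, that black-and-white rendering maps each color to one brightness value using the Rec. 601 luma weights; the weights, the function names, and the two sample colors are not taken from the text. Under that assumption, two clearly different colors receive the same gray value, so the original cannot be recovered from the photograph with respect to color.

```python
# Rec. 601 luma weights: an illustrative assumption, not taken from the text.
WEIGHTS = (0.299, 0.587, 0.114)


def brightness(rgb):
    """Map an (R, G, B) triple (0-255 per channel) to one rounded gray value."""
    return round(sum(weight * channel for weight, channel in zip(WEIGHTS, rgb)))


def indistinguishable_in_gray(color_a, color_b):
    """True if two different colors collapse to the same gray value."""
    return color_a != color_b and brightness(color_a) == brightness(color_b)


# Hypothetical sample colors chosen so that their brightness values coincide.
red = (255, 0, 0)
dark_green = (0, 130, 0)

if __name__ == "__main__":
    print(brightness(red), brightness(dark_green))
    print(indistinguishable_in_gray(red, dark_green))
```

Because the mapping from colors to gray values is many-to-one, no procedure applied to the gray values alone can single out the source color, which is the sense in which the target medium is systematically poorer than the source medium.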

Finally, in writing his article, the literary scholar relies on a substitutability relation that simply does not exist. Indeed, in a certain respect Liszt’s symphonic poetry mirrors the tripartite structure of Dante’s work, and the internal structuring of the parts displays a relationship to the sections in Dante’s work insofar as the parts convey experiences with musical means that are described in The Divine Comedy. But these relationships are limited, on the one hand, by the fact that the musical structures articulate those subjective experiences that the composer had while reading the literary work; on the other hand, the musical constellations cannot reproduce those literary aspects that depend on the predicative structure of language. Even if the compositional identity of the symphony were obtained by following prescripts that relate medial attributes of the literary work to medial attributes of the musical work without ambiguity, the symphony alone could not represent the literary work. On the one hand, such prescripts could not guarantee the substitutability relation with respect to the sensible attributes; on the other hand, the substitutability relation would, for a recipient, remain dependent on the knowledge of the prescripts so that the target constellation would lose its medial independence since, under the aegis of linguistic prescripts, it would mutate into a literature that contingently uses musical medial elements.

These last two cases are ideal type instantiations of a relation that is implemented by the operation of transposition. Transpositions create medial constellations that cannot be substituted for the initial constellation; as transpositions, they are only of interest to those who know the initial constellation. If this prerequisite is fulfilled, then transpositions can provide interpretations in a broader sense, interpretations that articulate the proposals for structuring the perception of the initial constellation by means of another medium.

(TP₁) Given the constellation a in a medium M_a, a transposition exists if and only if an a* is generated in a medium M_a* (M_a* ≠ M_a) whose compositional identity exists in a conceivable relation to the compositional identity of a in M_a.

What might an operationalizable test now look like that could be used to determine which of the mentioned ideal type operations has formed the relationship between two medial constellations? One test like this could orient itself on the following question: Is it possible to find the medial constellation K₁ for a given medial constellation K₂ that forms the basis of the production of K₂ by the implementation of one of the media operations? In light of this question, it is clear that the degree to which we regard reproductions, translations, and transpositions as invertible operations differs. For while we expect of copies and translations that it is possible to identify, among all of the constellations that are possible in a medium, those medial constellations that might serve as source constellations for the operations, we do not expect this of transpositions: Let us assume the existence of a medial constellation that is supposed to be a reproduction. We must expect that a medial-competent recipient is able to identify, from all of the possible constellations in a medium individuationis, those that could have served as the model. And in the case of a translation, too, it must be possible for competent recipients to single out the original from all of the possible texts in French. In short, Flowers of Evil is then a translation of Les Fleurs du Mal if one can, with its help, identify Les Fleurs du Mal from all the French texts.

If both successful reproductions and successful translations enable competent recipients to identify the initial constellation, then, with a view to the test, what is the difference between a reproduction and a translation? Although the reproduction initially looks like the more demanding medial operation, a reproduction does not necessarily imply a reference to the compositional identity of the initial constellation; even a person who cannot write could reproduce a text by sketching a copy, and, in the end, photocopiers manage this without a pattern-recognition program. However, for translations this reference to the compositional identity of the initial constellation is necessary. For even the translation of the text “The Conditions of Thought” into codes of signals can only be carried out if the medial elements of the initial text are correctly identified.

Of course, we do not expect this from transpositions; we do not expect to be able to identify La Divina Commedia with the help of the Dante Symphony. But neither do we expect that there is no intelligible relation between the two works whatsoever. What a transposition might do can be called an interpretation in a broad sense, an interpretation that provides a specific accentuation of the compositional identity of the constellation that it refers to by means of another medium.

From the perspective of the question regarding whether and how we can identify medial constellations, the criteria (R₁), (T₁), and (TP₁) can now be reformulated as follows:

(R₂) A medial constellation b in a medium M_b is a reproduction of a medial constellation a in M_a ([M_b ≠ M_a] ∨ [M_b = M_a]) if and only if a, in the scope of possibilities of M_a, can be identified on the basis of the sensibly perceptible attributes of b.

(T₂) A medial constellation b, in a medium M_b, is a translation of a medial constellation a in M_a (M_b ≠ M_a) if and only if a, in the scope of possibilities of M_a, can be identified on the basis of the compositional identity of b in M_b.

(TP₂) A medial constellation b, in a medium M_b, is a transposition of a medial constellation a in M_a (M_b ≠ M_a) if and only if the compositional identity of a in M_a is interpreted by the compositional identity of b in M_b.
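The identification test behind these reformulated criteria can be sketched as a toy model. Everything in it is a simplifying assumption made for illustration: a medium's scope of possibilities is a finite list, sensible attributes and compositional identity are reduced to plain values, and whether one constellation interprets another is supplied from outside as a judgment rather than computed. The names `Constellation`, `identify`, and `classify` are hypothetical.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Constellation:
    medium: str
    sensible: tuple  # hypothetical stand-in for sensibly perceptible attributes
    identity: str    # hypothetical stand-in for compositional identity


def identify(scope, attribute, value):
    """Return the single constellation in scope matching value, else None."""
    matches = [c for c in scope if getattr(c, attribute) == value]
    return matches[0] if len(matches) == 1 else None


def classify(b, a, scope_of_a, interprets=False):
    """Label the operation relating b to a, following the identification test.

    interprets is an external judgment; the toy model cannot derive it.
    """
    if identify(scope_of_a, "sensible", b.sensible) == a:
        return "reproduction"
    if b.medium != a.medium and identify(scope_of_a, "identity", b.identity) == a:
        return "translation"
    if b.medium != a.medium and interprets:
        return "transposition"
    return None


# Hypothetical scope of possibilities for the source medium.
french_texts = [
    Constellation("French", ("Les", "Fleurs", "du", "Mal"), "fleurs"),
    Constellation("French", ("Un", "autre", "texte"), "autre"),
]
original = french_texts[0]
photocopy = Constellation("French", ("Les", "Fleurs", "du", "Mal"), "fleurs")
english = Constellation("English", ("Flowers", "of", "Evil"), "fleurs")
symphony = Constellation("music", ("orchestral", "score"), "symphony")
```

In this sketch `classify(photocopy, original, french_texts)` yields a reproduction, `classify(english, original, french_texts)` a translation, and `classify(symphony, original, french_texts, interprets=True)` a transposition. The model only restates the asymmetry of the test: the first two operations allow the source to be singled out, the third does not.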

One can complain that these reflections have a rather schematic character and ought to be refined in many respects,¹⁵⁸ but it is important to see that the elementary medial operations, if they are to be able to take over a function for the study of the possible relations between media, must be characterized in a way that does not merely rely on the specifics of particular media. If the characterization of the elementary medial operations is able to help provide orientation through the jungle of the media without thereby interfering with the possibility of further differentiations, then it will have fulfilled its objective here.

Let us thus return to the problem set out at the beginning. In a first run-through, on the basis of what has been worked out in this chapter, I will attempt to show the plausibility of a media typology. Here, I will, at the same time, investigate how the everyday ways of speaking about the term medium and its theoretical ideas are related to the distinctions under discussion.

As we set out to offer a typology, the first selective criterion that is available to us for this classification work consists in examining whether something is used to individuate thoughts (i.e., A- or B-intentional states). As I have shown, this question cannot be answered independently of the question of whether something is used in a social interpretation practice as a means for individuating thoughts. As a result of the interpretationist perspective of the theory of higher-level intentional states, fulfilling this individuation criterion entails a step toward desubjectivization. If we want to answer the question whether something is a medium, then we must ask the following question: besides a being that avails itself of states of affairs of the world in a systematic way for expressive purposes, is there at least one further being that ascribes thoughts to the first one as a result of her interpretation of the expressive behavior? We do indeed expect that producing and interpreting beings are identical; however, both from a genetic and a systematic perspective, self-interpretation is derivative from the interpretation of others, and from a criteriological perspective, we must demand ascription conditions that can be observed from a third-person perspective. An answer to this question must clearly be distinguished from an answer to the question of whether something could be a medium.

I would like to characterize the following means of communication as first-order media: these are means of a social interpretation practice that help make possible articulation acts that have a compositional identity within the scope of possibilities of a medium and that are provided with content due to interpretive connecting behavior. Means of communication that play the role of first-order media in the social community are the fundamental basis for individuating and ascribing higher-level intentional states.

If one accepts the reflections with the help of which I have attempted to show that both linguistic and nonlinguistic intentional states have a right to be classified as thoughts, then a quite heterogeneous set of different media meets this first criterion, which we should be able to further sort out in a second step; for the means for individuating thoughts that have now been characterized can be further differentiated, and indeed relative to the kind of thought that can be individuated with their help. The distinction between truth-apt and non-truth-apt thoughts provides one fundamental possible way to distinguish thoughts, and it is clear that media that individuate truth-apt thoughts, that is, propositional attitudes, are languages. Hereby the set of those languages is characterized as a proper subset of first-order media that encompasses particularly those media with the help of which truth-apt thoughts can be individuated and ascribed. Within the set of languages, for example, vocal and sign languages can be distinguished, depending on which physical parameters are varied in the expression behavior. Here, however, to be granted the status of a first-order medium, it is important that, for example, a sign language does not, for its part, fall back on routines for individuating thoughts employed by a vocal language.¹⁵⁹ If we take into purview the possible relations between linguistic articulations, then the set of (natural) languages can also be characterized by the fact that it is possible to translate among all languages. Tokens of linguistic expressions can not only be reproduced but also translated insofar as it is possible that a linguistic utterance in context C can be represented (or substituted) by another linguistic utterance that has the same truth conditions in context C.¹⁶⁰

In clear contrast to languages, as one class of first-order media, is a class of media that—like languages—can assist in individuating and ascribing thoughts; however, the difference here is that these thoughts are not truth-apt. Besides languages, with which C-intentional states can be individuated, nonlinguistic media, with which it is possible to individuate B-intentional states, comprise a further subset of first-order media; for by virtue of the criterion of compositional identity, B-intentional states possess a media-dependent identification criterion on their own. Thus, these media are potentially independent of language; their potential for individuating thoughts assumes neither linguistic agreement (for example, in the form of explicit rules) nor a necessary linguistic transfer of medial competencies.

If the full theoretical potential of elementary medial operations is exploited for determining the relationship between first-order media, then it is possible to maintain the following: in contrast to the case of linguistic utterances in different languages, it is not possible to translate between two medial constellations that are generated using nonlinguistic (artistic) media. In contrast to linguistic utterances, for nonlinguistic medial constellations the transposition is the only intermedial operation that is available. Further, however, also with a view to the relation between linguistic and nonlinguistic expressions, because of the lack of truth aptness of nonpropositional thoughts, neither can nonlinguistic expressions be translated into linguistic expressions, nor, conversely—in accord with the symmetry of the translation relation—can linguistic expression behavior be translated into linguistically independent first-order media. If we accept the idea that the characteristic of first-order media is that, with their help, it is possible to think thoughts, then the distinction between natural languages and nonlinguistic media can be explained in terms of media theory by the fact that nonlinguistic media can only be involved in intermedial relations as instantiations of transposition. In contrast, it is possible to translate between different linguistic utterances. Examples of these nonlinguistic media are, of course, artistic media such as music, dance, painting, graphics, sculpture, etc.¹⁶¹

Two clearly distinguishable types of first-order media are thus languages, on the one hand, and nonlinguistic (artistic) media, on the other. However, in the face of the differentiation we have made, greater problems are posed by the question of whether this provides an exhaustive catalogue of the class of first-order media or whether candidates like those brought into play by sociological media theory (for example, money, power, or law) are further first-order media. In order to have a right to the status of first-order media, it must be possible to show that thoughts whose identity is linked to the position in the scope of possibilities of a medium can be individuated with their help.

It appears to me to be clear that a person in a negotiating situation who, for example, lays $954 on the table expresses in this act that she is ready to pay exactly this sum for the goods offered by the seller. Yet it is questionable whether we can ascribe to this person the intentional state that is expressed in this offer without assuming that this person has individuated this state with language. One could now claim that, independently of the established “medium” of money, this person could not find herself in the intentional state that is characterized by holding the goods to be properly valued at $954. More specifically, however, we are faced with two questions: on the one hand, it is an open question whether an intentional state is conceivable that is not able to be sufficiently determined independently of the individuation possibilities of the potential medium of money; on the other hand, it must be explained whether there could be a medium of money without the medium of language. As far as I can see, however, both questions must be clearly answered with “no.” For the intentional state of “being ready to spend $954 for W” can only be individuated by a person who has this state in the context of common beliefs, and the fact that the formulation that provides the content of the intention names a monetary amount is only understandable to the person herself if she has background beliefs of the following sort: “For $954 I have to work forty hours. In this time, I cannot produce W myself.” But also “Dollars are a lawful currency that can be exchanged for goods.” And “I would also be prepared to exchange W for 140 pounds of coffee.” All of the attributes of money relevant for the person can be linguistically expressed. 
Beyond this, it is also clear that money is not a medium that exists independently of language; for money (all the more so intrinsically worthless money) must be brought into social existence in an act of explicit institutional agreement with the help of constitutive rules. Even if we need not assume that all users of money are conscious of the principal fact that it has an institutional character, well-rehearsed routines of monetary exchange can only be explained if we assume that the users, for example, know that it is against the law to produce money oneself, that an old used bill is worth as much as a new one, that a nickel has the same value as five pennies, etc. However, even the thoughts that a money user has in a specific situation in which she uses it are largely of a propositional nature; for the specific ameliorations that the use of money provides for the formation of beliefs and intentions, as well as for actions, are part of a network of linguistically individuated intentional states and can be understood as a refinement of intentional states that exist independently of money. If we assume that a natural language contains numerals, that the person in question can count, and that the person knows what an exchange is, then the thoughts of the person that are related to money can be described as the mere refinement of the thoughts that the person could also think independently of the money-specific competencies. All differentiations are the result of the fact that the person expands the evaluating vocabulary by expressions that use money as the standard of evaluation, an evaluation that is also possible independently of this standard, even if it can be articulated especially efficiently with the help of this standard.
In short, intentional states that can be articulated with the help of money are not independent of language; in the best-case scenario, the domain of the intentional states that can be individuated with its help provides for an amelioration of the intentional states that need to be individuated by language. This classification of money in the context of language is also confirmed by the fact that sentences whose relational identity is partially dependent on the fact that they contain expressions for monetary amounts can be translated both into other languages and into sentences that contain no such expressions but that have the same truth conditions; for example, the sentence “P is prepared to pay $954 for this book” can be translated into the sentence “P is prepared to exchange 140 pounds of coffee for this book.”¹⁶² Because all of the other phenomena that, in the context of sociological theory formation, are to be understood on the model of money are integrated into a holistic context of explicit institutional agreements to the same degree as money, a detailed discussion of phenomena such as power, influence, or law is unnecessary here, especially since the generalization procedure to which they owe their classification as media is extremely dubious.¹⁶³

Higher-order media can be differentiated from first-order media, which are the irreducible means for individuating thoughts. Particular to such higher-order media is that their medial elements (or a set of medial elementary constellations) exist in a sufficiently unambiguous classificatory relation to the medial elements of a medium M₁, which is already used in an interpretive communicative practice to individuate thoughts. Let us assume a community in which a vocal language is spoken that consists of twenty-six types of phonemes; then a higher-order medium (M₂) could classify every type of phoneme into another event type or state type, for example, per character or per elementary constellation of characters. Here the event types or state types that form the medial elements of M₂ have different physical attributes than the medial elements of M₁.

Higher-order media, like phonetic transcripts or notation and medial codes (codes of signals, ASCII), assume that there are already medial constellations that have been individuated in first-order media; with a view to the compositional identity of these medial constellations, they can prove their value as target media for translation operations. Insofar as the successful use of higher-order media is dependent on equivalence relations or reproduction relations between sets of medial elements (or elementary constellations), and thus on compliance with explicit prescripts, I view higher-order media as linguistically dependent media that can only arise historically if there is a medium in which the reference to the first-order elements of the medium is possible as well as the individuation of explicit prescripts.

With higher-order media, the possibilities for communication that are available in a social community can be considerably broadened; in particular, the physical attributes of elements of second-order media can allow medial constellations whose performance within the framework of first-order media is elusive to be translated into forms that endure longer than the physical implementations in a first-order medium. But even in writing, where this is managed, and where articulations of greater dimension and greater complexity are made possible, articulations in second-order media remain connected to the potential to individuate thoughts in first-order media. In other words, it is not possible to write something that one could not (internally) say or sing.

If, against the background of the previous reflections, we ask how that which is called media in the McLuhan tradition is related to the differentiations made here, then, in the face of the basic classification criterion—which draws its classifying power from its connection with the conditions for individuating thoughts—two answers must be provided. Some of what the McLuhan tradition refers to as media belongs to higher-order media; other things are not media at all, because they do not have anything to do with individuating thoughts—they are things that we simply use as tools, which are based on the implementation of physical and not medial relations. However, it is worthwhile to differentiate the class of tools so that it is at least understandable how McLuhan could come up with the idea of presenting such heterogeneous phenomena as language, the alphabet, print, and radio, but also electric light, electric energy, paper, and the wheel as examples of media. While the alphabet and script can be classified as second-order media, we must view the wheel and electric light simply as tools that exist without a direct, significant connection to media. For radio, television, paper, and print, things may look different. In these cases we are dealing with tools or technologies that, at least, play a role in medial practices.

On the one hand, we can observe that tools and technologies for generating physical realizations of medial constellations can play an important role. If, for example, we take second-order medial constellations such as written texts into purview, a surface is needed on which a text can be written or printed. Here writing instruments and writing surfaces play the role of medial tools; diverse physical objects and materials can fulfill this role.¹⁶⁴ Even if different tools for implementing medial constellations might have different advantages and disadvantages with a view to the durability, the power expended, the readability, etc., these differences are hardly of conceptual significance, even if innovations in these areas can have broad-reaching social consequences and can be directly relevant for aesthetic reception.

On the other hand, however, this assessment raises the question of whether we should analyze phenomena like photography, film, radio, telephone, television, video, etc., in the same way or whether qualifying them as sets of medial tools is to do too little. Because an answer to this question raises numerous specific problems, I will here only suggest a criterion with the help of which the question can be addressed in principle.

First, with a view to the phenomena mentioned, we should clarify what we are talking about and avoid the idiosyncratic ambivalence that is implicit in the expressions mentioned. To do this, we should distinguish between the medial tools that, in the most demanding case, are necessary means for physically implementing medial constellations and the medial constellations themselves, which we, for example, call radio dramas, photographs, films, or videos. Medial tools in this sense can be microphones, cameras, developers, cathode ray tubes, speakers, that is, any of those technical means that lead to the implementation of a perceptible product and in this respect stand in the tradition of paintbrushes and violins. The use of technical instruments that serve the reproduction and distribution of physical implementations of medial constellations should be distinguished from the necessary role that technical instruments can play as medial tools. These tools are the arsenal of technical devices that motivate one of the popular everyday ways of applying the media concept and that in common use do not serve the initial production of a medial constellation, but, like photocopiers, serve reproduction, or, like radio systems, serve the distribution of physically implemented medial constellations. In the framework of the perspective suggested here, we can call these tools intermedial tools.

Figure 3.6 A typology of media

The suggestion here is not characterized by a particular sensitivity to the details that play a role in the context of relations between medial constellations and the processes of physically implementing them; nevertheless, with its help, a perspective can be distinguished in which stable distinctions can be established between media and medial constellations, on the one hand, and medial tools, on the other hand. For in contrast both to the medial tools and to the intermedial tools, only perceptible products that are able to be experienced are medial constellations, which have identities within the scope of possibilities of their medium. Whether the concern is with medial constellations in nonlinguistic or linguistic media depends on whether such products are caused by B- or C-intentional states. Because media are understood as limited sets of learnable alternatives for action that bring about intersubjectively perceptible states of affairs in the world or events that implement medial attributes, media-specific action alternatives can include the use of tools or instruments. But the tools are not media.

If I am correct, on the basis of the fundamental description of media as a means for individuating thoughts, we obtain a perspective that allows us a reasonably organic classification of media and tools. For an overview, see figure 3.6.