Transmission and the Individual

Cultures are built to contain the mutually agreed-upon yet evolving impressions of conversations, not monologics of any kind—neither tyrannical nor swarm.


I think it was a 'blue collar' comedian telling a 'driving directions' joke that I am remembering when thinking about individuality. The comedian was making fun of his passenger's—his wife's or girlfriend's actually—apparent inability to even comprehend the concept of providing directions: "take the blue road up." This was the punchline anyway—uttered by the incompetent passenger to the driver, while staring at a map.

Both its noticeable callback to an old misogynistic trope and the way the joke 'worked' in general demonstrate the same idea: it is difficult, if not impossible, for individuals to communicate their individual ideas (and for anyone to even understand individual contributions or individuality in general) without joint attention and joint representation.

In their book What Makes Us Social? Chris and Uta Frith explain and flesh out both of these terms:

We-mode [joint attention] is more than the phenomenal experience of each of the individuals concerned. We use the term to indicate that it allows a type of joint representation, a We-representation, that is well below phenomenal awareness. . . .

Imagine yourself sitting at a table with a glass and a jug of water quite far away (or whatever the image shows). It is too far away for you to reach . . . Now imagine another person joining you. If they can reach the jug, then they could pass it to you. They put you in a position to get things . . . When you are alone, only objects within your reach elicit increased excitability in your motor cortex. But when another person is present, objects within his or her reach also elicit increased excitability in your motor cortex.

Your action priority map [a mental map, "indicating the location of all the graspable objects, a representation of the immediate environment in a form relevant for action (Klink, Jentgens, and Lorteije, 2014)"] is now a We-representation [joint-representation], including you and the companion as one, combining self-related and other-related action maps. In this sense, each person takes on the role of the monkey's rake for the other. . . .

['Monkey's rake' refers to] an inspired experiment conducted by Atsushi Iriki at the world-famous Riken Centre for Brain Science in Tokyo. He taught monkeys to use a rake to reach objects that were too far away to grasp (Iriki, Tanaka, and Iwamura, 1996) . . .

There are neurons in the parietal cortex of the monkey brain, which provide maps [action maps] of objects near the hand. If an object appears near the monkey's hand, then these cells start to fire. This firing activity relates to objects that can be easily reached . . . As soon as the monkey learned to use a rake, the receptive field of the neurons expanded enormously, but only when the monkey was holding the rake. The action map now includes objects that are way beyond the hand, but near the top of the rake. When holding a rake, more objects become within reach, and this is reflected in the activity of the neurons. . . .

Just as in the example with the monkey's rake, the action space [around you and your companion at the table] has proved to be malleable. It can be represented either for an individual or a group. The map (or rather, its representation) highlights all the objects within our reach. So long [as] we are both in the We-mode, our action maps will be aligned. Each of our brains is using a representation that includes both our reaching possibilities. . . . Collective representations of our environment are created during many situations.

It will pay enormous dividends for us to be in awe of this capacity for joint-representation, so I will linger on it, in awe, for a moment. Suppose there are multiple people around the table with the glass and jug of water. If each person were at the table in "I-mode" (a 'private' perspective that generates a private representation of the situation), they would each 'see' the glass and jug differently: 'in front,' 'off to the side,' 'in reach,' 'out of reach,' and so on. Yet, when they switch to joint attention, the perspective completely changes (not simply as a sum of the different viewpoints), and this forces the representations to change as well. Now, when in We-mode, or first-person-plural perspective, we 'see' the glass and jug differently, 'objectively,' in a way that we must find words and phrases for. A grunt of 'in front' could possibly represent your intention in I-mode, but in We-mode, you instinctively know that won't work (everyone has different fronts), and in any case, you don't 'see' the objects as simply 'in front' when you are in We-mode. You can say 'in front of me,' but you have left I-mode to do it. The phrase 'in front of me' reflects the first-person-plural view of We-mode, not the private perspective of I-mode.* You can also say 'near Alice,' 'at the west end of the table,' 'over there.' All of these words and phrases and concepts for locating things (to variable levels of precision), these joint-representational concepts, are required by joint attention. No one around the table sees the glass and jug from their own private perspective (when in We-mode), so they need joint representations to represent what they are seeing as a group. And with joint representations will come norms.

What happens when we add memory and time to this situation? Now the people around the table can remember the joint representation and talk about it and parts of it, even though they are no longer at the scene—and maybe not in the same location in spacetime at all. Others can join in, learn the joint representation, connect and contribute to it ("Never place water near the edge of the table"), attack it ("everyone thinks Bill was there, but he's on vacation"), amplify it by repeating it, and so on. We have created a joint-representational plane, which we will proceed to fill with all kinds of remembered joint-representations, all the 'objective' knowledge of humankind, through conversations. All of that because of the magic of joint attention:

There is an interesting family resemblance between, on the one hand, individual representations and We-representations, and, on the other hand, egocentric and allocentric spatial frames of reference. The individual representation is equivalent to an egocentric frame of reference in the way that it links the individual to objects in his or her environment: "This is where the object is in relation to me." . . .

In [the allocentric] frame of reference, objects are related to one another rather than to the self. For example, I might represent the mug as being near the edge of the table rather than near my left hand. I might also represent the object as being to the north rather than to the left of me. At first sight, there might seem little similarity to a We-representation. There is nothing specifically social about such a representation, but the key feature of the allocentric frame of reference is that it is independent of any particular viewpoint. This means that the representation is the same for all the people present. It is effectively a We-representation, ideally suited for group endeavors.

We understand all of this about the workings of the joint-representational plane implicitly and automatically because it's precisely what makes the driving joke work (or at least function), and not work. The passenger hilariously uses a private, egocentric perspective and language ('blue road,' 'up') instead of the expected joint-representational language and allocentric frame of reference ('Interstate 35,' 'north') when providing someone with navigational assistance. Less hilariously, the joke itself (winking at a 'genre' with a history) likely leans on the lady-driving-bad trope, which now strikes the audience as 'private, egocentric' talk of its own. It is jarring to some audiences for reasons ranging from broad changes in cultural sensitivity to boredom with cliché.

Of course, the 'private' intentions of the passenger inside the joke and the comedian telling it (or the joke itself) are not total gibberish. The driver, depending on the situation, could have figured out "next major highway, north" from the passenger's message. And maybe the audience could have (or should have) figured out that the comedian is promoting misogyny, if only they noticed the dog whistles.

How do we typically, naturally coordinate around these issues of speaking and listening (i.e., teaching and learning)? Do we typically allow people to speak or honk in whatever private way they wish, using whatever symbolic code they want, with as much or as little content as they'd like, leaving the task of interpretation fully to the listener? Or do we expect speakers to adhere strictly to joint-representational grammars and languages and provide a complete, explicit picture to the listener? The answer, as we all know, is that we typically expect strict adherence and mostly complete pictures from speakers, with some variable forgiveness in the joints for 'degenerate' expressions and ideas. This is evidenced by the fact that we struggle to comprehend texts when just 2–5% of the vocabulary is confusing to us:

Perhaps the most compelling evidence for needing [around] 95–98% comprehensible input comes from vocabulary studies . . . Batia Laufer (1989) found that learners generally need to understand at least 95% of the words in a text to adequately grasp its meaning. At about 95% lexical coverage (i.e. only 1 unknown word in 20), readers could get "adequate" comprehension, whereas below that threshold comprehension dropped dramatically. More recent research has pushed the target higher: Hu and Nation (2000) concluded that around 98% vocabulary coverage may be necessary for full, unassisted comprehension. In a controlled study, Hu & Nation presented learners texts with varying percentages of known words; the learners generally needed to know 98–99% of the words to answer comprehension questions satisfactorily, whereas at 95% many struggled. Norbert Schmitt et al. (2011) reinforced these findings in a large-scale experiment with 661 learners, noting a nearly linear relationship between vocabulary coverage and reading comprehension: as the percentage of known words rose, comprehension scores rose in tandem. They found no sudden "cliff" but did argue that 98% coverage is a more reasonable target for comfortable reading of academic texts.

It is also evidenced by the fact that speakers in a conversation, rather than listeners, are held accountable by the group for making 'repairs' to their contributions:

Although there is a distinction between self-correction and other-correction, self-correction and other-correction are not alternatives. Rather, the organization of repair in conversation provides centrally for self-correction, which can be arrived at by the alternative routes of self-initiation and other-initiation—routes which are themselves so organized as to favor self-initiated self-repair.

There is also the 'classic' style of prose, grounded in human biological and social patterns, that describes the inherent role of the speaker-teller as the community has formed it, reminding us that windbaggery is not something we have to accept as a positive characteristic of speakers:

The guiding metaphor of classic style is seeing the world. The writer [speaker] can see something that the reader [listener] has not yet noticed, and he orients the reader's gaze so that she can see it for herself. The purpose of writing is presentation, and its motive is disinterested truth. It succeeds when it aligns language with the truth, the proof of success being clarity and simplicity. The truth can be known, and is not the same as the language that reveals it; prose is a window onto the world.

H.P. Grice, in the 1970s, uncovered a set of conversational maxims, or norms, which we naturally follow, with these expectations for the speaker-teacher:

(1) Make your contribution as informative as is required (for the current purposes of the exchange), (2) Do not make your contribution more informative than is required . . . Try to make your contribution one that is true . . . (3) Do not say what you believe to be false, (4) Do not say that for which you lack adequate evidence . . . Be relevant . . . Be perspicuous . . . (5) Avoid obscurity of expression, (6) Avoid ambiguity, (7) Be brief (avoid unnecessary prolixity), (8) Be orderly. And one might need others.

And linguist Daniel Dor describes natural speaker duties, limitations, and the end goal of speaking in his theory on the function of language as a tool for "instructing the imagination":

Speech acts do have specific formal properties, but they have them not because they reflect an inherent logic of communication, but because they are mutually identified norms for communication—conventionalized norms that are necessary because (not in spite) of the fact that the experiential world within which linguistic communication has to take place looks just like Derrida describes it ["opaque, indeterminate"]. . . .

[These norms for communication] are statements of what speakers should do in their linguistic behavior, and they behave exactly like norms in other social domains (Bicchieri 2006, Gilbert 1989). They impose collective demands on individual speakers to behave in ways that very often contrast with their own communicative inclinations. . . .

An instance of imagination-instruction [teaching, speaking] is successful to the extent that the listener constructs an imagined experience that is similar to the speaker's intent.

The natural role of the human speaker, it seems then, which we all occupy dynamically in speaking and listening, teaching and learning, is one of near servitude to the listener-student, but it's also one that can be aptly described as the "master of the offer" in contract-law terminology. The speaker is responsible for laying out nearly all important details (terms) of the entire experience he intends to transmit, and it is this transmission, however temporary or weak—not the receiver's thoughts or inferences—which is to be the complete fodder for the listener's reconstruction of the experience.

If the speaker intends 'funny and inoffensive way to talk about being spatially challenged,' and the listener hears 'offensive way to talk about being spatially challenged,' the listener is free to ignore the speaker's contributions to the joint-representational plane, reject them, or accept them (understand them), but the listener is typically not free, without sanction, to materially modify, adumbrate, substitute, add to, or remove meanings from the speaker's message ('read in' things that aren't there or ignore important things that are). The ultimate goal of the exchange is for the listener (audience, driver) to process and then modify, reject, or accept the speaker's (comedian's, passenger's) expressed intent: to get the intended joke or directions, and laugh, or not. It is not for the listener to process and then modify, reject, or accept his own unexpressed inferential reworking of the speaker's message. That would allow listeners to act as both co-counsel (sneaky speaker) and judge (listener), rather than, appropriately, just as judge of the meaning of the speaker's intent. No matter how 'fair' they are, we simply don't condone judges trying to make the law (or listeners trying to materially change the message) by surreptitiously sneaking in their own inferences about it and then evaluating it; they are expected only to apply the law, to interpret the message. Judges are stuck with this listener role, but everyone else can, ceteris paribus, etiamsi non sint, have their turns as speakers as well. It is in that role that they can do their work, with their listeners, on the joint-representational plane, modifying, adding, removing. This may instigate further rounds of modification, addition, and subtraction. It's called civil, mature, turn-taking conversation, colloquy.

The same person can switch from listener to speaker at nearly the same moment, but those very different roles do very different things (and thus people do very different things inside them), carry different burdens, and, ultimately, serve different but necessary functions in society.

The law is a useful analogy, indeed, since it reminds us that we live under an adversarial system. The products of those adversarial encounters, in law, are legal meanings. In life, the products of conversation are, inevitably, as far as we can ever see, cultural meanings. When we "ghost" the speaker role, as we do when we allow the listener to play both speaker and listener roles, we remove necessary confrontation ("together forehead") and diversity from the system, leaving culture poorer for it. The joint-representational plane is built to contain the mutually agreed-upon yet evolving impressions of conversations, not monologics of any kind—neither tyrannical nor swarm.

Just as a driver cannot navigate by the private whim of "the blue road up," a culture cannot navigate the complexities of shared existence by abandoning its common maps. The duty of all speakers, when they are speakers, is to draw those maps with clarity and care, to translate their private views into public ways. When we absolve speakers of this duty, or empower the listener to ignore parts of the map in favor of their own imagined landscapes, we are all set adrift. The result is not greater individual freedom, but a collective state of being profoundly and irrevocably lost.
