“Um”: Language as Behavioural Heuristics

 “Language is a tool that has been worn into shape by continual use.” (Guy Deutscher)

I was struck by many of the insights Guy Deutscher shares in The Unfolding of Language, and especially by two findings about how languages develop over time. Firstly, the languages with the most complex word-structures tend to be the ‘exotic’ tongues of simple tribal societies, typically spoken by only a few hundred people. Secondly, early stages of a language are usually much better ‘behaved’ than later stages, with fewer irregularities and more complex structures. Put another way, why do linguistic structures seem to disintegrate over time?

On the first of these, he hypothesizes that smaller societies are under less pressure to simplify because they have less contact with ‘strangers’ who speak different dialects or even different languages. He also suggests that ‘literacy’ (i.e., use of the written word) may hinder the fusion of words and the development of more complex word-structures that happen when languages are only spoken.

On the second, he argues that languages are always under pressure to save effort and make themselves more ‘economical’, especially for the most frequently used words. He also writes that there is pressure to heighten the effect of an utterance and make it more expressive. This leads to more compact and efficient ways of expressing the things that are said most often, and to the drafting of nouns and verbs into use as metaphors for abstract concepts. This implies that language becomes more abstract over time while meanings and sounds are eroded.

In How We Talk, N.J. Enfield shows that much of everyday conversation is about signalling. Language lives and breathes in conversation, and that is where we all learn language, much more than in grammar classes.

He has used a stopwatch and other techniques to record and analyze how conversations work. For example, the average time to respond to a question in conversation is 200 ms (about the time it takes to blink an eye). “No” answers are slower than “Yes” answers in all languages. And there is a one-second ‘window’ to respond in a conversation, which tells the speakers whether a response is fast, on time, late or unlikely to happen at all.

The title of this article references that, on average, someone will say “Huh?”, “Who?” or another similar expression every 84 seconds in conversation to check on what is being said (regardless of where you are in the world).

One in every 60 words is “Um” or “Uh” (or something which looks and sounds remarkably similar whatever language you speak). While some might argue that this is not really language and language teachers may prefer “Pardon?” or “Excuse me?”, most people don’t speak that way.

What all this shows is that conversation follows its own ‘rules’. These rules are just as important as the rules of grammar and fairly consistent across different languages and cultures.

Enfield points out that in conversation, the flow and change of speakers seems to be regulated by behavioural cues:

  • Mostly, one person speaks at a time
  • Sometimes people speak in overlap, but never for long
  • Often the transition is tidy with no audible gap or overlap
  • The order in which people speak is not predetermined and varies
  • How long each person speaks is not predetermined and varies
  • The length of conversation is not specified in advance
  • What people say is not specified in advance
  • Sometimes it is clear who should speak next (e.g., when a direct question is asked) and sometimes it is not

As mentioned, transitions from one speaker to another typically take 200 ms, and most occur within plus or minus a quarter of a second of that (i.e., with a very small overlap or at most a half-second gap). Speakers signal non-verbally, often through a lower pitch or loudness, when they’re about to finish, so listeners are aware of when this will happen.

One second is a “standard maximum silence”, beyond which things become awkward. While a response of “Yes” takes 35 ms on average, a response of “No” takes a much longer 600 ms. There are various (and maybe multiple) explanations for this difference, which may reflect mental processing time, social politeness or the signalling of a ‘qualification period’.
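For readers who like to see the arithmetic, here is a rough sketch in Python of how the timings quoted above fit together. The numbers are the ones given in this article; the function names and the four labels are my own shorthand, not Enfield’s terminology, and the cut-offs between labels are an assumption for illustration only.

```python
# Figures quoted in the article (all in milliseconds).
TYPICAL_GAP_MS = 200     # average transition time between speakers
TOLERANCE_MS = 250       # "plus or minus a quarter of a second"
MAX_SILENCE_MS = 1000    # the "standard maximum silence"


def typical_window():
    """Return the (lowest, highest) gap that counts as a typical transition.

    A negative value means the next speaker starts before the
    previous one has finished, i.e. a short overlap.
    """
    return (TYPICAL_GAP_MS - TOLERANCE_MS, TYPICAL_GAP_MS + TOLERANCE_MS)


def describe_gap(gap_ms):
    """Label a gap between speakers. The labels are illustrative assumptions."""
    low, high = typical_window()
    if gap_ms < low:
        return "long overlap"
    if gap_ms <= high:
        return "typical transition"
    if gap_ms <= MAX_SILENCE_MS:
        return "slow but within one second"
    return "beyond the standard maximum silence"


if __name__ == "__main__":
    print(typical_window())  # (-50, 450)
    for answer, gap in [("Yes", 35), ("No", 600)]:
        print(answer, gap, describe_gap(gap))
```

On these assumptions the typical window runs from a 50 ms overlap to a 450 ms gap, so an average “Yes” falls comfortably inside it, while an average “No” falls outside it but still within the one-second limit.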

This is where “Um” comes in. “Um” is typically associated with a delay of about 670 ms, while “Uh” is typically associated with a delay of about 250 ms, and Enfield argues that these are different signals for ‘major’ and ‘minor’ pauses in the conversation. “Huh?” is another word in this family of conversation signals, which is pretty universal and can be uttered with minimum delay and effort (which is why it is used).

These findings add to my conviction that language is just another behaviour, and that much of conversation is about managing social interactions. Thus, conversation is as much about behavioural heuristics and social signalling as it is about sharing information. As with all other cultural tools, language changes over time, and much of the change is about making it simpler and quicker to interact, as we also see with changes in the language of messaging and use of emojis. 

To paraphrase Guy Deutscher, all languages change over time unless they’re dead.


The Unfolding of Language: The evolution of mankind’s greatest invention by Guy Deutscher

How We Talk: The inner workings of conversation by N.J. Enfield
