The Sound of Meaning (Part II)
In my last post, I talked about how the human mind can infer meaning from sound alone. As far as I know, this is a dimension of intelligence that has yet to be explored by the AI community. It is, however, an ability that researchers will need to mimic if they want to achieve a true form of artificial general intelligence (AGI).
In his book The Poem: Lyric, Sign, Metre, Don Paterson offers another interesting example of how humans infer meaning based solely on the sound of a word. Paterson asks the reader to consider this situation:
You are in a spaceship approaching a planet. You've been told that there are two races on it, one beautiful and friendly to humans, the other unfriendly, ugly, and mean-spirited. You also know that one of these groups is called the Lamonians; the other is called the Grataks. Which race is beautiful and friendly? And which race is unfriendly? (p. 32)
So which one did you choose as the beautiful, friendly race? Was it the Lamonians or the Grataks? Most English speakers, of course, choose the Lamonians, given that the word has a softer, more fluid sound. The harsh, guttural sound of Grataks fits better in the ugly/mean category. And how about those Klingons from Star Trek? The minute you heard the name, you knew trouble was brewing.
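For intuition, the soft-versus-harsh contrast these names trade on can be caricatured in a few lines of code. This is a toy heuristic of my own devising, not a real phonological model: the letter classes below are rough assumptions standing in for sonorant versus guttural sounds.

```python
# Toy "softness" scorer -- a caricature of sound symbolism, not real phonology.
# The letter classes are assumptions: vowels and sonorants count as "soft",
# voiceless stops and gutturals count as "harsh".
SOFT = set("aeiouwylmnr")
HARSH = set("kgtqxjzbdp")

def softness(word: str) -> float:
    """Return a score in [-1, 1]: positive reads softer, negative harsher."""
    letters = [c for c in word.lower() if c.isalpha()]
    soft = sum(c in SOFT for c in letters)
    harsh = sum(c in HARSH for c in letters)
    return (soft - harsh) / len(letters)

for name in ("Lamonians", "Grataks", "Klingons"):
    print(name, round(softness(name), 2))
```

On this crude scale, Lamonians scores near the soft end, Grataks sits at the harsh end, and Klingons lands in between, roughly matching the intuitions above.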
This highlights an interesting phenomenon. In poetry, the sound of a word itself can convey meaning, and human intelligence has no problem arriving at a conclusion from a single word with a unique sound. But can an LLM like ChatGPT do the same thing? I decided to ask ChatGPT the same question to see if its inferential skills matched those of a human. With nothing more than the question as written above, ChatGPT responded, “I’m sorry, but without more specific information, I can’t determine which race corresponds to which description.”
Clearly, the model needed an inferential assist. So I provided one, noting that the sound of the two words was important. This proved decisive. ChatGPT responded that ‘Lamonians’ sounds like it could be associated with ‘luminous’ or ‘luminosity,’ while ‘Grataks’ has a harsher, less melodious sound. On that basis, it made an inference roughly analogous to that of a native English speaker: the Lamonians are to be trusted, the Grataks not so much. Here’s the dialogue:
Inference Assist
For a human, this question is easy without further information. It has to do with the way the two names are pronounced. Does that help you?
ChatGPT
Ah, I understand now. Based on the phonetics of the names, it's a linguistic clue rather than specific cultural or contextual information. "Lamonians" sounds like it could be associated with "luminous" or "luminosity," implying beauty and brightness. Conversely, "Grataks" has a harsher, less melodious sound, which might suggest something less pleasant or friendly. So, based solely on the phonetics, one might infer that the "Lamonians" are the beautiful and friendly race, while the "Grataks" are the unfriendly, ugly, and mean-spirited race.
This experiment underscores a key difference between human intelligence and its machine imitator. Most English-speaking humans arrive at the conclusion in one fell swoop. The LLM, on the other hand, needed extra information because it did not know what to focus on. This makes sense, given that the model lacks an embodied ability to hear words as a human does. Even so, one might imagine a day when an AGI system could convert words into spectrograms and then use those visual images to infer the meaning of the sound.
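The first step of that speculative pipeline can at least be sketched today. The snippet below is a minimal illustration using SciPy, with a synthetic 440 Hz tone standing in for a recorded word (a real system would start from speech audio instead):

```python
import numpy as np
from scipy import signal

# Stand-in for a spoken word: a one-second 440 Hz tone sampled at 8 kHz.
# A real pipeline would load recorded speech here.
fs = 8000
t = np.arange(fs) / fs
audio = np.sin(2 * np.pi * 440 * t)

# Convert the waveform into a spectrogram: frequency bins x time frames.
freqs, times, spec = signal.spectrogram(audio, fs)

# The resulting image is what a future system might "look at"; here we
# simply recover the dominant frequency from it.
dominant = freqs[np.argmax(spec.mean(axis=1))]
print(f"spectrogram shape: {spec.shape}, dominant frequency: {dominant:.1f} Hz")
```

The hard, unsolved part, of course, is not producing the spectrogram but teaching a model to read affective meaning off such an image.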
You may have noticed that I limited my generalizations to English speakers in the previous paragraphs. That was intentional. A faculty colleague here at the University of Florida (Gabriele Belletti) – a native Italian speaker – mentioned that the norms governing the meaning of sound could differ across languages. To speakers of another language, the word ‘Grataks’ might sound more melodious, leading to the opposite judgment from the one observed for English speakers.
Our research team has yet to translate this example into Italian and test it on ChatGPT, but that is certainly on our to-do list. In theory, a truly multilingual form of AGI would take the relevant cultural norms into account when inferring the meaning of a word's sound. Who would have imagined that the “intelligence” of sound could be so complex?