An AI Language Learning Vision
I recently purchased two language textbooks in the Nuovissimo Progetto Italiano series. The series has five books, each aligned to a level of the Common European Framework of Reference for Languages (CEFR). In this case, that’s A1 (Beginner), A2 (Elementary), B1 (Intermediate), B2 (Upper Intermediate), and C1 (Advanced).
There’s a lot to like about Nuovissimo. For starters, everything in these textbooks is written in the target language, even at the beginner level. I tend to favor total immersion language learning experiences because they work. Middlebury College pioneered this approach to second language acquisition. When students enter their summer program, they take the Language Pledge, promising to speak only in the language they are learning. Middlebury is serious about this. Those who break the pledge are quickly asked to leave. Although immersion is my personal preference, I’m not dogmatic about it. There are excellent textbooks that explain concepts, usually grammar, in English. PassaParola from Lingro Learning and Immagina from Vista Higher Education are highly recommended.
In my July 24, 2024 post, I introduced the idea of comprehensible input, a specific kind of total immersion. Stephen Krashen is a prominent advocate of this approach. Comprehensible input is based on what Krashen calls his “input hypothesis.” He writes, “The input hypothesis makes the following claim: a necessary condition to move from stage i to stage i + 1 is that the acquirer understands input that contains i + 1, where ‘understand’ means that the acquirer is focused on the meaning and not the form of the message.” He then states the idea more simply. We grow as learners “only when we understand language that contains structure that is a ‘little beyond’ where we are now” (p. 21).
The idea of comprehensible input highlights the critical importance of meaning-making in the learning process. Experts largely agree that one needs to understand about 85 to 90% of what is being said or read to progress in their language learning journey. Context is king as it gives the learner reference points – foot and hand placements embedded on the climbing wall of language acquisition. Too few placements and upward movement cannot happen. When one understands too little, 20% or less of the incoming input, the flow of information becomes nothing more than undifferentiated noise and growth stalls.
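To make the coverage idea concrete, here is a minimal Python sketch of how one might estimate what share of a passage a learner already understands. It is only an illustration: the function names, the naive word-splitting, and the use of 85% as a cutoff (taken from the range mentioned above) are my assumptions, not part of Krashen's work or of any particular product.

```python
import re


def coverage(text: str, known_words: set[str]) -> float:
    """Return the fraction of word tokens in `text` found in `known_words`.

    Tokenization is deliberately naive: it lowercases the text and keeps
    runs of letters, so "l'acqua" is counted as "l" and "acqua".
    """
    tokens = re.findall(r"[^\W\d_]+", text.lower())
    if not tokens:
        return 0.0
    known = sum(1 for token in tokens if token in known_words)
    return known / len(tokens)


def is_comprehensible(text: str, known_words: set[str], threshold: float = 0.85) -> bool:
    """Assumed rule of thumb: input is usable when coverage meets the threshold."""
    return coverage(text, known_words) >= threshold


# Hypothetical learner vocabulary and sentence.
known = {"il", "gatto", "dorme", "sul"}
sentence = "Il gatto dorme sul divano"
print(coverage(sentence, known))           # four of the five words are known
print(is_comprehensible(sentence, known))  # falls short of the 85% cutoff
```

A real system would need proper lemmatization (so that "dorme" and "dormire" count as the same word) and a better model of what a learner actually knows, but the basic arithmetic is the same.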
Since Krashen proposed his theory of second language acquisition some forty years ago, numerous studies have been conducted to validate it. The results are in, and the evidence suggests that Krashen was largely correct. So, how might an AI-enabled language learning system deliver a comprehensible input learning experience?
To answer that question, I must talk about my Italian language learning journey and what I’ve learned so far as a student. I started studying Italian in early 2024, about a year and a half ago. I’m a diligent student and typically study at least one hour daily. I completed the Pimsleur Italian program a month ago and regularly meet with Elena Ribaudo, my incredible iTalki instructor. My best guess is that I’m at the B1 / B2 level.
Starting out, what I needed were structured learning experiences that presented about 15 to 20 new vocabulary words in a ~20-minute time span. Pimsleur units last about 30 minutes, but their program is completely aural. Thus, I added a textbook to the mix. The typical foreign language text features vocabulary lists at the beginning of each unit, and I found those helpful. What I wanted, though, was an experience that combined the two, the aural + the visual. Of course, an in-person lesson with a native speaker is the gold standard, though that can get costly.
This is where AI might be able to help. Let’s reconsider my situation. My AI research team recently created a prototype of a digital world populated with multilingual digital humans. In this digital world, we added the two digital humans shown here.
Thanks to the ConvAI technology stack, both Adelena and Daniele are fluent in English, French, German, and Italian. ConvAI supports many other languages, but the limit is four in any given environment. As in a game, students can either speak to the digital humans or text them. Adelena and Daniele will respond verbally, and the system provides a transcript of what they said. As a learner, I find this feature especially helpful. The ConvAI-generated accents are convincing, too. One of our language professors, fluent in French and Italian, was pleased with what he heard. As pictured here, both humans are dressed in Renaissance attire because they “live” at Forni Cerato, the 16th-century Palladian villa we’re recreating.
This version of the learning experience is unstructured. A student can visit any part of the Forni Cerato estate and ask Adelena and Daniele anything. It’s like being in Italy and attempting a conversation with a native speaker. These kinds of experiences are best suited for advanced learners, students at the CEFR C1 level or higher. But what about beginner and intermediate students? They need a bit more structure. To provide it, we’re developing a second prototype in which the digital human takes the student on a short tour of a park, an apartment, a store, or another culturally significant location. During that tour, the digital human introduces the student to about 20 new words. We envision discrete places in the story where a student can ask questions of their guide, though these will be limited and suited to the student’s level of understanding. We’ve already discovered that our development environment supports context-specific labels. As one approaches an object, for example, its name pops up. Everything, including the tour script used by our digital guides, is in the target language, though Adelena and Daniele can also respond to questions posed in English. And finally, a gaming format allows us to embed interactive exercises and quizzes.
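For readers who think in code, the structured tour could be sketched roughly as follows. This is not how our prototype is built; the `TourStop` class, the `plan_tour` function, and the sample vocabulary are all hypothetical, meant only to show how a tour might be capped at about 20 new words.

```python
from __future__ import annotations

from dataclasses import dataclass


@dataclass
class TourStop:
    """A hypothetical stop on the guided tour."""
    place: str
    labels: list[str]  # target-language names that pop up near objects


def plan_tour(
    stops: list[TourStop],
    known_words: set[str],
    new_word_budget: int = 20,
) -> list[tuple[str, list[str]]]:
    """Walk the stops in order, stopping before the budget would be exceeded.

    Returns (place, new words introduced there) pairs.
    """
    planned: list[tuple[str, list[str]]] = []
    introduced: set[str] = set()
    for stop in stops:
        # Keep label order, drop repeats, and skip words the student already knows.
        new_here = [
            word
            for word in dict.fromkeys(stop.labels)
            if word not in known_words and word not in introduced
        ]
        if len(introduced) + len(new_here) > new_word_budget:
            break
        planned.append((stop.place, new_here))
        introduced.update(new_here)
    return planned


# Hypothetical tour data.
stops = [
    TourStop("il parco", ["albero", "panchina", "fontana"]),
    TourStop("il negozio", ["cassa", "albero", "prezzo"]),
]
for place, words in plan_tour(stops, known_words={"albero"}):
    print(place, words)
```

The point of the budget is the same as the 15 to 20 words per session I wanted as a beginner: enough new material to stretch the student, not so much that the tour turns into noise.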
All this is made possible by AI technologies that power speech-to-text and text-to-speech functions, with large language models (LLMs) serving up responses from behind the curtain. In this case, the application of advanced AI technology looks promising, at least in the foreign language classroom.