Adapting spaced repetition to interruptions in usage: Even without parsing the user’s responses (which would keep this robust to difficult audio conditions), if the user rewinds or pauses on some answers, the app should be able to infer that they are having difficulty with the relevant material, and dynamically generate new content that repeats those words or grammatical forms sooner than the default.
Likewise, if the user takes a break for a few days, weeks, or months, the ratio of old to new material should automatically adjust accordingly, since forgetting is more likely, especially of relatively new material. (And of course, with text to speech, an interactive app that interpreted the user’s responses could and should replicate LanguageZen’s ability to specifically identify (and explain) which part of a response was incorrect, and why, and use that information to adjust the schedule on which material is reviewed or introduced.)
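A minimal sketch of how both signals above might feed into the review schedule. The decay factor per rewind, the seven-day break threshold, and the one-day floor are all invented for illustration, not tuned values:

```python
def adjust_interval(base_interval_days: float,
                    days_since_last_session: float,
                    rewind_count: int) -> float:
    """Shrink the review interval for an item when the user shows
    signs of difficulty (rewinds/pauses) or returns after a break."""
    interval = base_interval_days
    # Each rewind or pause on this item suggests difficulty:
    # bring the item back sooner.
    interval *= 0.5 ** rewind_count
    # After a long break, forgetting is more likely, so compress
    # the interval in proportion to the length of the gap.
    if days_since_last_session > 7:
        interval *= 7 / days_since_last_session
    # Never schedule reviews more often than daily.
    return max(interval, 1.0)
```

In a real system these adjustments would presumably be folded into whatever spaced repetition algorithm the app already uses, rather than applied as standalone multipliers.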
Seems like this one is mostly a matter of schlep rather than capability. The abilities you would need to make this happen are:

1. Have a highly granular curriculum for what vocabulary and what skills are required to learn the language, along with a plan for what order to teach them in and what spaced repetition schedule to aim for.
2. Have a granular and continuously updated model of the user’s current knowledge: vocabulary, rules of grammar and acceptability, idioms, and any phonemes or phoneme sequences they have trouble with.
3. Given specific, highly granular learning goals within the curriculum (e.g. “understanding when to use preterite vs imperfect when conjugating saber” in Spanish) and the model of the user’s knowledge and abilities, produce exercises which teach and evaluate those specific skills.
4. Determine whether the user had trouble with the exercise, and if so, what the trouble was.
5. Based on the type of trouble the user had, describe what updates should be made to the model of the user’s knowledge and vocabulary.
6. Correctly apply the updates from (5).
7. Adapt to deviations from the spaced repetition plan (tbh this seems like the sort of thing you would want to do with normal code).

I expect that the hardest things here will be 1, 2, and 6, and I expect them to be hard because of the volume of required work rather than the technical difficulty. But I also expect the LanguageZen folks have already tried this and could give you a more detailed view about what the hard bits are here.
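Numbering the abilities above 1-7 in list order, the feedback loop connecting the user model to exercise outcomes (abilities 4-6) could be sketched as follows. Everything here is invented for illustration: the item keys, the strength scale, and the fixed step sizes are placeholders, not a proposed design:

```python
from dataclasses import dataclass, field

@dataclass
class UserModel:
    # Ability 2: a granular, continuously updated model of what the
    # user knows. Maps an item (word, grammar rule, phoneme cluster)
    # to an estimated strength in [0, 1].
    strength: dict = field(default_factory=dict)

    def apply_update(self, item: str, delta: float) -> None:
        # Ability 6: correctly apply a described update, clamped
        # to the valid range.
        current = self.strength.get(item, 0.0)
        self.strength[item] = min(1.0, max(0.0, current + delta))

def review_step(model: UserModel, item: str, had_trouble: bool) -> None:
    """One pass of abilities 4-6: judge the exercise outcome and fold
    it back into the user model. Step sizes are illustrative only."""
    model.apply_update(item, -0.3 if had_trouble else 0.1)

model = UserModel()
review_step(model, "saber:preterite-vs-imperfect", had_trouble=True)
review_step(model, "saber:preterite-vs-imperfect", had_trouble=False)
```

The point of writing it this way is that abilities 4 and 5 (diagnosing trouble and describing updates) are where a language model would sit, while 6 and 7 are plausibly ordinary code operating on a plain data structure like the one above.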
Automatic customization of content through passive listening
This sounds like either a privacy nightmare or a massive battery drain. The good language models are quite compute-intensive, so running them on a battery-powered phone would drain the battery very fast, especially since this would need to hook into the “granular model of what the user knows” piece.