or how many steps are left till we have an education platform for everyone
I’m trying to figure out how to build a universal education platform.
I don’t know how to do it.
By a ‘universal education platform’ I mean a system that allows anyone to learn anything and everything.
That’s a little ambitious.
So for argument’s sake, let’s drop some of the most obvious constraints and imagine our target student is healthy, literate, and can sit behind a computer for at least an hour a day. Let’s also say the system can teach 80% of people 95% of what they would be able to learn given a top-notch personal tutor.
What we have then is a Learn Everything System (LES)[1].
How would it work and why don’t we have it?
My guess is that LES is an AI tutor controlling a rich digital simulation. By that I mean, it’s a game-based[2] learning experience orchestrated by your favorite teacher feeding you all of human knowledge.
It doesn’t exist cause neither the AI tutors nor the digital simulations are strong enough.
Yet.
So let’s build LES, though I’m not sure yet how.
That said, I think it’s worth looking at what it would take and what the steps in between would be. I suspect the crux is how to create that ideal AI tutor, cause the simulation part will likely solve itself along the way (we already have generative AI that looks like it’s playing Doom). And to that end, we need to understand a little bit more about how learning works.
A LES Model of Human Learning
Like any self-respecting researcher, I started my exploration of education with a deep dive into the literature.
Then I ran away screaming.
The field is so sprawling that I’m not sure a 4-year PhD would actually get me the insights I was hoping for. And that’s skipping the mortifying realization of how hard the field has been hit by the replication crisis[3]. So instead I built my own model of learning and asked researchers and entrepreneurs in the field if it made sense to them. Twelve conversations later, and this is where I ended up:
You can model learning as consisting of 6 factors—Content, Knowledge Representation, Navigation, Debugging, Emotional Regulation, and Consolidation.
Content is what you learn.
Knowledge Representation is how the content is encoded.
Navigation is how you find and traverse the content.
Debugging is how you remove the errors in how you process the content.
Emotional Regulation is how you keep bringing your attention back to the content.
Consolidation is the process of making the content stay available in your memory.
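For concreteness, here is one way the six factors could be written down as a tiny data structure. This is a hypothetical sketch; the class and field names are mine, not part of any existing system:

```python
from dataclasses import dataclass

# Hypothetical encoding of the six-factor model above; names are illustrative only.
@dataclass(frozen=True)
class Factor:
    name: str
    description: str

SIX_FACTORS = (
    Factor("Content", "what you learn"),
    Factor("Knowledge Representation", "how the content is encoded"),
    Factor("Navigation", "how you find and traverse the content"),
    Factor("Debugging", "how you remove the errors in how you process the content"),
    Factor("Emotional Regulation", "how you keep bringing your attention back to the content"),
    Factor("Consolidation", "how the content stays available in your memory"),
)
```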
So what are we missing if we want to create the LES AI tutor?
LLMs tick most of the boxes: They are trained on much of the internet (Content), can paraphrase the material till you can follow along (Knowledge Representation), are able to suggest an entry point on nearly any study topic and hold your hand throughout (Navigation), and will explain any problem you are stuck on (Debugging).
But.
They won’t help you keep your attention on task or stay motivated to learn (Emotional Regulation)[4].
Of course, you can ask them to do that. But for most people, by the time they notice their attention has drifted, it’s too late. And noticing is a hard skill in itself.
In contrast, imagine the best teacher you ever had as your personal tutor. They’ll subconsciously track your eye gaze and facial expressions, adjusting their words to your engagement. They’ll drum up examples that connect to your experience. They’ll even proactively offer just the right type of task to get you processing the content more deeply—an exercise, a break, or maybe even some extra reading.
You might wonder if teachers actually think they do this. I’ve asked, and the answer is mostly “no”. When I then probed what they thought made them a good teacher, the majority said “experience”. As far as I can tell “experience” is the stand-in term for “volume of training data for my subconscious processes in which I experiment with various approaches till I’ve hill-climbed to my local optimum in teaching performance”. Most can’t say what they do, why they do it, or pass on the process (obvious caveat: Some amazing teachers will of course be amazing teachers of teaching and teach all the teachers how to teach better. I suggest the backup ideal education system is to clone these particular teachers.)
Suffice it to say, introspection is hard. And devising scientific experiments that control for the myriad human factors that go into teaching effectively is possibly even harder. So how about we skip all that and instead test our hypothesis by seeing if AI tutors get better based on which bits of teacher interactions they mimic. Thus I propose that the missing piece for the LES AI tutor is to train an AI model on video and audio material of world-class tutors mentoring different types of students on various topics. The AI model will then learn how facial expressions, body language, and non-linguistic speech markers relate to optimal prompts and interventions to keep the student focused and energized to learn.
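To make that slightly more concrete, here is a rough sketch of what one training example might look like if the problem is framed as supervised prediction from multimodal signals to the tutor’s chosen intervention. Everything here (field names, feature choices, intervention labels) is my own illustrative assumption, not a description of an existing dataset or model:

```python
from dataclasses import dataclass
from typing import Sequence, Tuple, Dict

# Hypothetical schema for one annotated moment in a recorded tutoring session.
# All names and label choices are illustrative assumptions.
@dataclass
class TutoringMoment:
    student_gaze: Sequence[float]    # eye-gaze features for this time window
    student_face: Sequence[float]    # facial-expression embedding
    student_speech: Sequence[float]  # prosody / non-linguistic speech markers
    transcript_so_far: str           # the dialogue up to this point
    tutor_intervention: str          # e.g. "offer_example", "suggest_break", "give_exercise"

def to_training_pair(moment: TutoringMoment) -> Tuple[Dict[str, object], str]:
    """Frame the learning problem as: multimodal student state -> chosen intervention."""
    features = {
        "gaze": list(moment.student_gaze),
        "face": list(moment.student_face),
        "speech": list(moment.student_speech),
        "text": moment.transcript_so_far,
    }
    return features, moment.tutor_intervention
```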
So where are all the AI tutors now?
Well, Khan Academy has Khanmigo, Duolingo has Lily, and Brainly tries to straight up be the homework tutor of your dreams.
Except, when you try them out, the question quickly presents itself: Why talk to them instead of Claude or ChatGPT?
One answer is that integration with the learning material is helpful—Khanmigo, Lily, and Brainly can directly access the material you are studying. That’s great for this transitional phase, but in two or three years, you might just get Claude or ChatGPT integrated in a Google Lens app, reading your screen, or watching your face through your camera.
Conclusion & Confusion
So what do we do now?
Well, a Learn Everything System (LES) run by an AI tutor that adapts the content to fully engage you in a game-based educational experience, delivered as a rich simulation of all human knowledge, seems to me to be the ideal form of learning—for probably almost everyone. But we are still missing some pieces, and the biggest of those is that an LLM would need to access the non-verbal component of human interaction so it can proactively keep a student engaged with the material.
On the other hand, we live in strange times and I’m not sure LES is possible before we develop AGI. Maybe we can create a subset of LES that is achievable today, without further progress in AI. Maybe the next right question to ask is what a lesser LES would look like. And maybe once we know that, we could—shall we say—turn that Less-on[5].
[1] “les” means “lesson” in Dutch, my native language. It means “the” (plural) in French, which has a great all-the-things vibe. It means “them” in Spanish, which has a great all-the-people vibe.
[2] “game-based” is distinctly different from “gamified”! This deserves an essay in itself. But essentially, game-based learning is when you are playing for fun and you accidentally learn things without noticing. This happens to just about everyone who plays games, except most of it isn’t useful to them (“human transfer learning” is another essay I should write). In contrast, gamification is system designers reaching straight into your skull to pour dopamine into your exposed neural clefts.
[3] For instance, last spring I went to the foremost education science conference in the Netherlands, ResearchED. They bring together researchers and educators to support and learn from each other. There I discovered two things:
Education is a massive coordination problem.
Good teachers know what works. Researchers don’t know why.
Case in point: there was a talk on “instructional scaffolding”, one of the seminal concepts in the field, by researchers from Utrecht University. Instructional scaffolding refers to adaptively adding and removing instructional support based on how quickly the student is progressing through the material. It was originally proposed by Wood, Bruner, & Ross in 1976. Google Scholar shows over 18 thousand citations. Every pedagogical course under the sun recommends the practice. The original study had 32 participants and 1 instructor for all 4 conditions (different levels of scaffolding). The replication study had 285 participants, 8 instructors, and 4 conditions.
Much to the surprise of every teacher in the room, no effect was found in the replication study. The paper isn’t published yet, but during the presentation the researchers shared their methods: they had controlled for the exact level of scaffolding and wording, while filming every interaction so panel members could independently judge the quality of adherence to the research protocol.
They were as surprised as anyone that instructional scaffolding had no effect on student performance. Well, maybe not exactly as surprised as the teachers. The teachers were utterly baffled. Many spoke up to say that scaffolding worked amazingly well in their classes. How could this be? The researchers had no idea.
[4] Technically, LLMs currently also lack any way to offer you spaced repetition (Consolidation). However, this seems so trivially solvable that I’ve smoothly elided that part of the reasoning, but somehow you are reading this footnote about it anyway.
[5] Some say this entire essay was written as a lead-up to this joke.
If you want to use an LLM as a tutor, I think that is doable in theory, but you can’t just talk to ChatGPT and expect effective tutoring to happen. The problem is that an LLM can be anything, simulate any kind of human, but you want it to simulate one very specific kind of human—a good tutor. So at the very least, you need to provide a prompt that will turn the LLM into that specific kind of intelligence, as opposed to the alternatives.
Content—the same objection: the LLM knows everything, but it also knows all the misconceptions, crackpot ideas, conspiracy theories, etc. So in each lesson we should nudge it in the right direction: provide a list of facts, and a prompt that says to follow the list.
Navigation—provide a recommended outline. Unless the student wants to focus on something else, the LLM should follow a predetermined path.
Debugging—the LLM should test the student’s understanding very often. We could provide a list of common mistakes to watch out for. Also, we could provide specific questions that the student has to answer correctly, and tell the LLM to ask them at a convenient moment.
Consolidation—the LLM should be connected to some kind of spaced repetition system. Maybe the spaced repetition system would provide the list of things that the student should review today, and the LLM could choose the right way to ask about them, and provide feedback to the spaced repetition system.
tl;dr—the LLM should follow a human-made (or at least human-approved) curriculum, and cooperate with a spaced repetition system.
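A minimal sketch of how those pieces might be stitched together into a single system prompt for a general-purpose LLM. All function names, data shapes, and wording below are illustrative assumptions, not an existing API:

```python
def build_tutor_prompt(facts, outline, common_mistakes, checkpoint_questions, due_reviews):
    """Assemble a tutoring system prompt from human-made curriculum pieces (hypothetical)."""
    return "\n\n".join([
        "You are a patient one-on-one tutor. Stay in that role.",
        "Teach only from these vetted facts:\n- " + "\n- ".join(facts),
        "Follow this lesson outline unless the student explicitly asks to diverge:\n"
        + "\n".join(f"{i + 1}. {step}" for i, step in enumerate(outline)),
        "Watch for these common mistakes and correct them gently:\n- "
        + "\n- ".join(common_mistakes),
        "Ask these checkpoint questions at convenient moments and require correct answers:\n- "
        + "\n- ".join(checkpoint_questions),
        "Items due for spaced-repetition review today (weave them in, then report pass/fail "
        "for each back to the scheduler):\n- " + "\n- ".join(due_reviews),
    ])

# Example usage with toy curriculum data:
prompt = build_tutor_prompt(
    facts=["The derivative of x^2 is 2x."],
    outline=["Recap limits", "Define the derivative", "Work through examples"],
    common_mistakes=["Forgetting to lower the exponent after differentiating"],
    checkpoint_questions=["What is the derivative of 3x^2?"],
    due_reviews=["Definition of a limit"],
)
print(prompt)
```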
Hi Shoshannah, thanks for the thoughtful article and thanks for the kind words about Brainly. We have a big vision and we’re working hard towards it.
As you contemplate this space, I suggest reframing your problem/solution as a “teach everything system”. The acronym doesn’t have the same coincidental benefits as you noted for LES, but I think it may more accurately describe your goals and ideas.
I say this because learning and teaching are 2 separate activities and you’ll benefit from separating these concepts. In the optimum scenario, teaching produces learning as an outcome, but learning can also be self-directed, non-directional, chaotic, spontaneous, and have many other qualities that are not necessarily something we would recognize as a good tutoring/teaching experience. It seems to me that the question you’re trying to answer is, “what is the best teaching experience for a learner”.
This is a complex and nuanced space, as you’ve noted in your article, so (for the sake of brevity) I’ll leave my response at that and add—kudos for diving in and sharing your thoughts on this topic.
Thanks, Bill! I appreciate the reframe. I agree teaching and learning are two different activities. However, I think the end goal is that the user can learn whatever they need to learn, in whatever way they can learn it. As such, the learner activity is more central than the teaching activity—having an ideal learning activity will result in the thing we care about (-> learning). Having the ideal teaching experience may still fall flat if the connection with the learner is somehow not made.
I’m curious what benefits you notice from applying the reframe to focusing on the teaching activity first. Possibly more levers to pull on as it’s the only side of the equation we can offer someone from the outside?
The reframe is meant to fit the solution you’ve described and your supporting arguments so that there is clarity on what you’re trying to accomplish and subsequent discussion and iteration can be understood in that reframed context.
I say this because I believe that the definition of learning is much simpler yet much broader than what you’ve described here. For example,
Does not hold true if you were to hold it up to the representation of learning that we base much of our work off of at Brainly. Our definition is very simple—learning is the process of connecting information not known, to information that is known. We can present the same information to many different individuals and get many different “things” learned based on what they already know.
However, it does hold true when we think about applying a delivery of information for the learner with a specific goal in mind for what that learner should learn. We call that teaching and it requires having clarity on outcomes so they can be assessed and gaps in the learner’s knowledge filled in to ensure the goal is met.
At the end of the day, I am probably being a bit too philosophical about this for a comments section but I hope this perspective is helpful in some way in shaping your own views about the topic.
Thank you for the clarification!
I think I agree this might be more a matter of semantics than underlying world model. Specifically:
Bill.learning = “process of connecting information not known, to information that is known”
Shoshannah.learning = “model [...] consisting of 6 factors—Content, Knowledge Representation, Navigation, Debugging, Emotional Regulation, and Consolidation.” (note, I’m considering a 7th factor at the moment: transfer learning. This factor may actually bridge our two models.)
Bill.teaching = “applying a delivery of information for the learner with a specific goal in mind for what that learner should learn”
Shoshannah.teaching = [undefined so far], but actually “Another human facilitating steps in the learning process of a given human”
---
With those as our word-concept mappings, I’m mostly wondering what “learning” bottoms out to in your model? Like, how does one learn?
One way to conceptualize my model is as:
Data → encoding → mapping → solution search → attention regulation → training runs
And the additional factor would be “transfer learning” or I guess fine-tuning (yourself) by noticing how what you learn applies to other areas as well.
And a teacher would facilitate this process by stepping in and providing content/support/debugging for each step that needs it.
I’m not sure why you are conceptualizing the learning goal as being part of the teacher and not the learner? I think they both hold goals, and I think learning can happen goal-driven or ‘free’, which I think is analogous to the “play” versus “game” distinction in ludology—and slightly less tightly analogous to exploration versus exploitation behavior.
I’m curious if you agree with the above.
Seems to me that a good system to teach everything needs to have three main functions. The existing solutions I know about only have one or two.
First, a software platform that allows you to do all the things you might want to do: show text, show pictures, show videos, download files, interactive visualizations, tests (of various kinds: multiple choice, enter a number, arrange things into pairs or groups...).
Here, the design problem is that the more universal the platform is, the more complicated it is to let a non-tech user use its capabilities. For an experienced programmer, you just need to say “upload the HTML code and other related files here”, and the programmer will then be able to write text, show pictures, show videos, and include some JavaScript code for animation and testing. (Basically SCORM, best known from platforms like Moodle.)
The obvious problem is that most teachers are not coders. So they would benefit from having some wizard that allows them to choose from some predefined templates, and then e.g. if they choose a template “read some text”, they would be given an option to write the text directly in a web editor, or upload an existing Word document. But ideally, you would need to provide some real-world support (which is expensive and does not scale), for example I imagine that many good teachers would have a problem with recording a video, editing it, and uploading the file.
Second, it is not enough to create a platform, because then you have a chicken-and-egg problem: the students won’t come because there is nothing to learn, and the teachers won’t come because there are no students. So in addition to building the platform, you would also need to provide a nontrivial amount of some initial content. There is a risk that if the initial content sucks, people will conclude that your platform sucks. On the other hand, if your initial content is good, people will first come to learn, then some teachers will recommend the content to their students, and only then some teachers will be like “oh, I can also make my own tests? and my own lessons?”
Third, when people start creating things, on one hand this is what you want, on the other hand, most people are stupid and they produce shit. So the average quality will dramatically drop. But if you set some minimum quality threshold, it may discourage users. Some people produce shit first, and gradually they get better. So what you need instead is some recommendation system, where the platform can handle a lot of shit without that shit being visible and making the average experience worse.
For example, anyone can create their own lesson, but by default the only way to access the lesson is via its URL. So the authors can send links to their lessons by e-mail or on social networks. At some moment, the lessons may get verified, which means that someone independent will confirm that the lesson is more good than bad. (It does not violate the terms of service, and it says true things.) Verified lessons could then be found by entering keywords on the platform’s main page. Also, users could create their own lists of lessons (their own, or other people’s lessons) and share those lists via their URL. For example, a math teacher would not need to create their own lessons for everything, but could instead look at the existing materials, choose the best ones, and send a list of those to their students. Finally, the best lessons would be picked by staff and recommended as the platform’s official curriculum—that is what everyone would see by default on the main page.
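One possible shape for the visibility scheme described above, as a toy data model. All names here are my own assumptions, not any existing platform’s schema:

```python
from dataclasses import dataclass, field
from enum import Enum, auto
from typing import List
import uuid

# Toy data model for the lesson-visibility scheme sketched above; illustrative only.
class Visibility(Enum):
    LINK_ONLY = auto()  # reachable only via its URL, shared by the author
    VERIFIED = auto()   # independently checked, findable via keyword search
    OFFICIAL = auto()   # picked by staff for the default curriculum on the main page

@dataclass
class Lesson:
    title: str
    author: str
    visibility: Visibility = Visibility.LINK_ONLY
    url: str = field(default_factory=lambda: f"/lessons/{uuid.uuid4().hex}")

@dataclass
class LessonList:
    owner: str                                   # e.g. a teacher curating materials for a class
    lessons: List[Lesson] = field(default_factory=list)

def searchable(lessons: List[Lesson]) -> List[Lesson]:
    """Only verified or official lessons show up in keyword search."""
    return [lesson for lesson in lessons if lesson.visibility is not Visibility.LINK_ONLY]
```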
Relevant: https://andymatuschak.org/hmwl/
I have another architecture in mind, actually (but I think the specific choice is only a means to an end).
The tutor must be able to infer how the student arrives at their conclusions for a few demo problems, that being the student’s “problem-solving structure”[1]. The structure is then corrected toward one that is valid for the problem (it might be helpful to demonstrate examples where structures yield different answers or take different amounts of time to complete). Though, that may run into an issue, making people less skeptical:
[1] I say “structure” instead of “algorithm” because it often fails to solve the given problem, is very much non-deterministic, and also its intermediate nodes (lemmas, conjectures, estimations of which path is closer to a clean-looking solution) are useful.