They should not have been counting pull notifications, as they were instructed not to engage with their phones during the experiment except to maybe see what caused a vibration or ding. I don’t think students think of pull notifications as real notifications the way we were using the word. They were logging the notifications they could notice while their phone was flat on their desk, not being touched.
tanagrabeast
No. Everyone seemed to know what they were, because they all claimed to know someone who uses them. But I don’t recall anyone ever admitting to being such a someone. I sense there’s a bit of a stigma around them.
It is credible that eliminating all preventable distractions (phones, earbuds, etc.) wouldn’t improve learning much. As a teen, I bet you were distracted during class by all sorts of things contained entirely within your head. I know I was!
There’s a somewhat stronger case that video games and social media have given students more things to be preoccupied with, even if you make these things inaccessible during class. But I also think that just being a hormonal teen is often distracting enough to fill in any attention vacancies faster than the median lesson can.
This is important work.
One suggested tweak: I notice this document starts leaning on the term “loss” in section 4.2 but doesn’t tell the reader what that means in this context until 4.3.
Something similar happens with the concept of “weights”, first used in section 1.3, but only sort-of-explained later, in 4.2.
Speaking of weights, I notice myself genuinely confused in section 5.2, and I’m not sure if it’s a problem with the wording or with my current mental model (which is only semi-technical). A quoted forecast reads:
“GPT-2030’s copies can share knowledge due to having identical model weights, allowing for rapid parallel learning: I estimate 2,500 human-equivalent years of learning in 1 day.”
Wouldn’t the model doing the sharing have, by definition, different weights than the recipient? (How is a model’s “knowledge” stored if not in the weights?) My best guess: shareable “knowledge” would take the form of vectors over the models’ common foundational base weights, which should work as long as there hasn’t been too much other divergence since the fork. Is that right? And if so, is there some reason this is a forecast capability and not a current one?
My apologies for challenging the premise, but I don’t understand how anyone could hope to be “convinced” that humanity isn’t doomed by AGI unless they’re in possession of a provably safe design that they have high confidence of being able to implement ahead of any rivals.
Put aside all of the assumptions you think the pessimists are making and simply ask whether humanity knows how to make a mind that will share our values. If it does, please tell us how. If it doesn’t, then accept that any AGI we make is, by default, alien, and building an AGI is like opening a random portal to invite an alien mind to come play with us.
What is your prior for alien intelligence playing nice with humanity—or for humanity being able to defeat it? I don’t think it’s wrong to say we’re not automatically doomed. But let’s suppose we open a portal and it turns out ok: We share tea and cookies with the alien, or we blow its brains out. Whatever. What’s to stop humanity from rolling the dice on another random portal? And another? Unless we just happen to stumble on a friendly alien that will also prevent all new portals, we should expect to eventually summon something we can’t handle.
Feel free to place wagers on whether humanity can figure out alignment before getting a bad roll. You might decide you like your odds! But don’t confuse a wager with a solution.
This is about where I’m at, as well. I’ve been wrestling with the idea of starting a run myself, but one of my qualifying traits (I teach creative writing) also means I work full time and have little hope of beating out ten people who don’t. So much the better, I say, so long as the work gets done well and gets done soon...
...but if, eight months from now, much of the budget is still on the table because of quality issues, it may be because people like me sat on our hands.
Hopefully, someone will emerge early to work around this issue, if it turns out to be one. I, for one, would love to be able to turn in a sample and then be offered a credible good-faith assurance that if my run is completed at the same quality by such and such date, a payment of x will be earned. But as it stands, the deadline is “whenever the fastest mover(s) get there”. Who knows when that will be? Any emergent executive candidate making me a deal might be made a liar by a rival who beats them to the jackpot.
My questions are mostly about the player side, and about how deeply the DM should model the player:
Should the player be assumed to be implicitly collaborating towards a coherent, meaningful narrative, as is necessary for a long-lived TTRPG? Or might they be the type of player you often find in AI Dungeon who tries to murder and/or have sex with everything in sight?
Should players ever try to steer the story in a genre-breaking direction, like erotica or murder-hobo power fantasy? Should DMs resist these efforts or play along? If the latter, should the DM go a step further to actively intuit what this particular player would like to see happen?
Should players provide input that might be more sweeping than usable in narrative? (e.g. Take over the world!) If so, on what sort of level should the DM respond to these?
Should players be assumed to be ready to end the narrative at the ~1,000-step point?
I don’t see as much disagreement between us as you might be thinking. Precisely because I agree with your numbered points 1 and 2, I suggested it could be beneficial to compress most of our 12 years of math instruction down to a more intensive 2-3 years. That doesn’t mean we couldn’t instill useful basic arithmetic in lower grades. If we chose a smaller set of core basics, it could be quite practical to retain them over long summers and breaks—at least for the students who stay in our system for the long haul.
I’m also glad you brought up the fact that spaced repetition doesn’t have to involve software. I should have done more to remind readers of this. I weave the spacing and testing effects into the fabric of my course in many ways that have nothing to do with software.
Carefully engineered homework assignments are great if you have motivated students. Take-home SRS could even work for that. Those students are usually fine, though. It’s the apathetic middle I have to fight for, and they won’t do homework regardless of how I try to incentivize it.
Moreover, I don’t feel good about assigning homework to students who would hate to do it. School is already prison for those kids. I don’t want to send prison home with them. As both a child and a parent, I have been too familiar with the toxic effects homework, especially math homework, can have on family relationships. Let kids have a light at the end of the daily tunnel, I say.
Is homework vital to a successful math program? I don’t know. But I’m glad I don’t teach math.
Did you get IRB approval for these human studies on children?
I’m not sure which is more absurd: the IRB approval process or the very idea of high school. I’ve often asked people to consider a thought experiment where everyone on Earth suddenly forgets that our educational system as we know it ever existed. Would we really reinvent it just like it is now? Hearing how it worked, would we scream in terror and cancel anyone who had taken part? (Status quo bias much?)
When I was studying stand-up comedy, I actually developed a bit in which I play-acted a researcher proposing high school to an ethics board. It went like this:
RESEARCHER: “I was thinking we could stick 35 sleep-deprived teenagers in a room for an hour and expose them to academic stimuli. After that, we’ll do some tests on them.”
BOARD: “I see. Tell me more about your subjects.”
RESEARCHER: “Well, they’re minors, obviously.”
BOARD: “Okay…”
RESEARCHER: “And most of them will be enrolled against their will.”
BOARD: “And how long will you need them?”
RESEARCHER: “6 sessions a day for four years.”
BOARD: “Wait, hold on. Sample size? How many kids are we talking about, here?”
RESEARCHER: “All of them.”
BOARD: (mutterings among themselves) “Well, it sounds like everything is in order...”
Are you familiar with Direct Instruction, which is reminiscent of the Mennonite school?
Someone (probably on LW) pointed me to Direct Instruction a few years back, so yes, I’m acquainted with it. Because of the emphasis on staying fully reviewed on all relevant prior knowledge, I saw it as having obvious promise for technical subjects like math, in the hands of the right teacher. I was less convinced it made a good fit elsewhere, perceiving (perhaps unfairly—I didn’t dig too deeply) some big negative trade-offs:
Like with my whole-class Anki, it seems heavily reliant on the teacher’s high-energy snake-charmer charisma. This makes it difficult to sustain for much of a class period and demands a great deal from a teacher who tries to do it all day long, day after day. This also makes it difficult to adopt broadly among teachers with different personalities.
It sounds brittle with regard to roster variance. Specifically, it seems pretty insistent on having everyone in the room up to speed. With careful tracking/grouping of students, this can be achieved, but in practice, kids move into your school partway through the year and aren’t on the same page. Or you only have the one or two teachers for that grade-level math, so the slowest kids are in the same boat as the sharpest. I would think that one or two stragglers would grind the class to a halt, and that this would be statistically inevitable in larger classes. (I don’t know if this makes DI math worse than the status quo, where plenty of students fall behind and get lost, but with less fanfare and hold-up for everyone else.)
Have you ever tried SRS for muscle memory?
No. I’m not seeing how that would work, or how that would be relevant to what I do, but I’m certainly curious. Do you have examples?
Do you think that instinctive drive to listen to experts “talk shop” applies to apathetic students, though?
That’s definitely the right question. If you and another expert leap straight into fluent French, no, I don’t think your apathetic students will try to keep up—especially if they are early beginners. More helpful might be a Franc-lish hybrid conversation where you swap stories of embarrassing errors and insights largely in English while sprinkling in French words and expressions, reenacting parts of colorful encounters from your combined French-speaking experience.
I also think one of the difficulties in modeling language fluency is that the whole point of being fluent is to not need to think about the language, but to simply think in it, so I’m not sure what your vocalized monologue would be about...unless...
Ok, here’s a thought: I and the other motivated folks I learned Spanish with sometimes found ourselves slipping into a Spanglish patois outside of class where we spoke English with Spanish syntax. It felt like silly play at the time, but I now think it was an instinctive intermediate step to thinking in that language.
“It makes rain.” (It’s raining.)
“To me pleases the rain!” (I like rain.)
Perhaps you could try fostering a Franc-lish dialect in your classes by thinking out loud in that style and inviting others to join you in banter, patiently nudging them to get the grammar right instead of just talking like Yoda. From there, substituting actual French with increasing frequency could feel very natural.
You may not have immersive environments, but I imagine you’ll be creating simulated immersion: play-acting situations that give you a chance to think out loud as though you are navigating the moment for real. (Example: Going to the produce section of the store and seeing what looks good, what you could make with it, etc.) How much of that you should do in English, Franc-lish syntactic patois, or French will probably be something you will develop an expert instinct for as you become skilled at reading the room. Along the way, developing an entertaining stage presence for this play-acting would give you a powerful weapon against apathy.
Yes, yes… and you would be randomly involving students in your little improvised plays, assigning them roles, keeping them on their toes, making the non-participants want to get called on.
Yep, it sounds pretty awesome from the comfort of my not-having-to-teach-French perch :)
Oh, wow. Yes. That. Looks like there’s another book I don’t need to write.
The fact that the concept was so fleshed out thirty years ago kind of pisses me off. My teacher training was so the opposite of that (a bunch of student group work nonsense). And I’m not finding apprenticeship familiar to new teachers currently, though strong veterans often seem to have at least a half-baked version they’ve derived from experience. I get a lot of wide-eyed “Yes!” when I share it with them.
Experts talking shop with other experts is one of my favorite finds when I study!
During my dive into stand-up comedy, I came across this video of some top comedians talking shop. Especially from about the 30-minute mark, when they seem less concerned with entertaining their audience, they get into some juicy minutiae of why a joke might work or not. It really expanded my thinking on the subject.
Are such chats more insightful than an expert teacher would be in a lesson on that same topic? Not necessarily. But you might not find a skilled teacher ever teaching a lesson on that exact topic. I think humans are naturally primed to closely observe expert-expert chats for a few reasons:
• Social proof. We instinctively want to be able to talk like the experts do so we can blend in with them. So we listen carefully to how they talk.
• Authenticity. If this is what experts actually talk about, we feel like it must really matter. It’s not just the lesson of the day.
• The overhearing effect. This is a term I’m making up, but I’ve found it to be an important one exploited by storytellers. We naturally want to deduce the context of overheard language, so we listen extra carefully, trying to fill in the blanks. I suspect this is down to humans’ highly evolved appetite for gossip. The fact that the experts aren’t talking to us is essential for exploiting this effect.
Although… I find that an expert talking to himself, seemingly unguarded, seemingly without conscious awareness that he is being overheard… can also trigger the overhearing effect. When I model a skill to my students, I try to verbalize my inner monologue in a way that will be intriguing to overhear and carry that essential whiff of authenticity.
I’m not sure what expert self-talk looks like in foreign language instruction, but I would be interested to find out. (Any ideas?)
But from my time becoming a reasonably fluent Spanish speaker (since lost), I can describe a few language dimensions I found interesting but neglected by all but the nerdiest supplemental books.
Sentence-level inflection patterns vary, and it helps to be aware of them. For instance, the musicality of a typical question sentence is different in American English than in Castilian Spanish. If you can pick up on the melody earlier in the sentence, you can better contextualize what is being asked.
The way speakers of different languages produce what seem, on the surface, to be identical phonemes can be quite different, and understanding this is essential to actually sounding like a native. There can be hours of fun trying to practice a Castilian ‘toh’ sound (as in toma), with its thicker top-front palate tongue contact, vs. the American English equivalent (as in tomato).
Native speakers of language A learning language B often end up predictably adopting many of the same idioms and juicy words from language B into their language A conversations with each other, and they find themselves saying or thinking in those patterns even when their brains are mostly running language A. It could be fun to introduce some of these to novices and make it part of the language A classroom slang—a kind of introduction to thinking in language B.
[10] If you’re a fellow teacher, you know that this is the differentiation problem solving itself.
[9] Do you want to know what I’ve hated most about teaching in person during the Covid-19 pandemic? The way mutual mask-wearing scrams my reactor. With my facial expressions concealed, my deliveries don’t land as consistently. With the students’ expressions concealed, I am deprived of the energy I would gain by getting a reaction out of them. The parts of the job that used to recharge me drain me instead. I don’t have words to describe how awful this feels.
[8] I remember the first time I appreciated this skill. It was when I saw this hilarious exchange between Louis CK and Conan O’Brien, and then saw the same content later as a bit in one of his shows (4:39). It seems embarrassing to have not seen it, but it hadn’t occurred to me that talk-show ‘interviews’ with comedians might sometimes be adaptations of their bits. Seriously, though, Louis CK really comes across as a spontaneously funny guy in that first clip. He elevates the convincingness of spontaneity into another layer of comedic art.
[7] She goes by many names around the world. In the UK, teachers swap scary stories about Bore-a-trix Lestrange, Lady Macbarf, and Nary, Queen of Nots.
[6] When it’s releasing more energy than you’re using to contain it.
[5] This book would be somewhat redundant in a world where we already have David Didau’s What if everything you knew about education was wrong? I crossed paths with this title during a pensive season of my life and appreciated the way it asked questions from first principles, challenging orthodox assumptions without jumping to new conclusions. In particular, Didau had the words to express what I was feeling about forgetting.
[4] Consider how a serial television show uses a “Previously, on [title]” to remind you of plot threads that are going to be relevant to this episode, some of which might be from several episodes back. This is superior to how they used to do it, which was “Last time, on [this show].” The primitive form would fail to remind you of relevant threads from older episodes and needlessly remind you of irrelevant threads from last week. When you review with your students, are you just reviewing the most recent stuff, or are you choosing the stuff that’s about to be relevant again?
From chatting with those peak students during the experiment, I think their experience is more like being in a cafeteria abuzz with the voices of friends and acquaintances. At some point, you’re not even trying to follow every conversation, but are just maintaining some vague awareness of the conversations that are taking place and jumping in when you feel like it. People can and do think about other things in a noisy cafeteria. Some even read books! The brain can filter out a constant buzz. It’s just wind blowing through the trees.
The upper middle zone where it’s still possible to try to follow everything (and maybe even reply) looked like more of an attention trap, and was where I was more likely to find that handful of students I already knew had a problem. The FOMO is probably more distracting than the notifications themselves.