The hard problem of alignment is going to hit us like a train in 3 to 12 months. At the same time, some specific capabilities breakthroughs that people have been working on for the entire history of ML will finally start working, now that they have a weak AGI to apply them to, and suddenly Critch’s stuff becomes super duper important to understand.
What Critch stuff do you have in mind?
Modal Fixpoint Cooperation without Löb’s Theorem
Löbian emotional processing of emergent cooperation: an example
«Boundaries» Sequence
Really, just skim his work. He’s been thinking well about the hard problems of alignment for a while.