This post is very frightening to me in that I had not viscerally understood just how far off we are from solving FAI. Are we really (probably) in "invent calculus before you can even start solving the problem" territory? Is that hyperbole, or worse? At the same time, the post gives me hope that the problem of FAI is solvable (hopefully tractably!). When I was first shown the problems that arise from just trying something, it felt like no one had any line of attack whatsoever. Now I think I understand better that while no one really knows anything yet, they do have some sort of attack on the problem. I'd feel pretty safe if one could formalise a solution in math without paradox.
This might constitute an artist's complaint, but since I believe your goal is to persuade/argue/show effectively, I think the post suffered from stewing on the exact same material all the way through. Maybe contrast the Beth/Alonso exchange with a Dunce/Alonso interaction? Show directly, not only indirectly, how immediately jumping on a solution is rather silly?
Anyway, my sympathies if conversations like this are part of your daily (or weekly) work. It hurt just to read.
I think Eliezer’s goal was mainly to illustrate the kind of difficulty FAI presents, rather than the size of the difficulty. But the two aren’t totally unrelated; basic conceptual progress and coming up with new formal approaches often require a fair amount of serial time (especially where one insight is needed before you can even start working toward a second insight), and progress is often sporadic compared to more applied, well-understood technical goals.
It would usually be extremely tough to estimate how much work was left if you were actually in the “rocket alignment” hypothetical—e.g., to tell with confidence whether you were 4 years or 20 years away from solving “logical undiscreteness”. In the real world, similarly, I don’t think anyone knows how hard the AI alignment problem is. If we can change the character of the problem from “we’re confused about how to do this in principle” to “we fundamentally get how one could align an AGI in the real world, but we haven’t found code solutions for all the snags that come with implementation”, then it would be much less weird to me if you could predict how much work was still left.
Ouch.