Thanks, that makes sense, thinking about YK’s writings. I’ll add that briefly to the piece. What’s the best reference, if you have a moment?
I think LLMs are already mostly finding natural abstractions. They’ll have some weird cross-talk, like the Golden Gate Bridge being mixed with fog, but humans have that too, maybe to a lesser degree, and we can still communicate pretty well about abstractions, at least if we’re careful.
I’m glad you liked the crux list. I think it’s really important to keep asking ourselves why others have different takes. The topic is too important to do the standard thing and just say “well they don’t get it”.
Eliezer’s List O’Doom probably has a short statement in there somewhere, if you want a quote on his position. Much of his back-and-forth with Quintin is also about rejecting natural abstraction, but I don’t know of a short pithy summary in that corpus. (More generally, it’s pretty clear from my standpoint that there are basically two cruxes between Eliezer and Quintin, because my own models look mostly like Eliezer’s if I flip the natural abstraction bit and mostly like Quintin’s if I flip a particular bit having to do with ease of outer alignment.)
If you want a reference on the natural abstraction hypothesis more generally, I introduced the term in Alignment By Default.