Thank you for the links, Adam. To clarify, the kind of argument I’m really looking for is something like the following three (hypothetical) examples.
Mesa-optimization is the primary threat model of unaligned AGI systems. Over the next few decades there will be a lot of companies building ML systems that create mesa-optimizers. I think that within 5 years of current progress we will understand how ML systems create mesa-optimizers and how to prevent it. Therefore I think the current field is adequate for the problem (80%).
When I look at the research we’re outputting, it seems to me that we are producing research with a speed and flexibility greater than any comparably sized academic department globally, or than the ML industry, and so I am much more hopeful that we’re able to solve our difficult problem before the industry builds an unaligned AGI. I give it a 25% probability, which I suspect is much higher than Eliezer’s.
I basically agree that the alignment problem is hard and unlikely to be solved, but I don’t think we have any alternative to the current sorts of work being done, which are a combination of (a) agent foundations work, (b) designing theoretical training algorithms (as Paul is), or (c) directly aligning narrowly superintelligent models. I am pretty open to Eliezer’s claim that we will fail, but I see no alternative plan to pour resources into.
Whatever you actually think about the field and how it will save the world, say it!
It seems to me that almost all of the arguments you’ve made work whether the field is a failure or not. The debate here has to pass through whether the field is on track or not, and we must not sidestep that conversation.
I want to leave this paragraph as social acknowledgment that you mentioned upthread that you’re tired and taking a break, and I want to give you a bunch of social space to not return to this thread for however long you need! Slow comments are often the best.
Thanks for the examples, that helps a lot.
I’m glad that I posted my inflammatory comment, if only because exchanging with you and Rob made me actually consider the question of “what is our story to success”, instead of just “are we making progress/creating valuable knowledge”. And the way you two have been casting it is way less aversive to me than the way EY tends to frame it. This is definitely something I want to think more about. :)
Appreciated. ;)
Glad to hear. And yeah, that’s the crux of the issue for me.
! Yay! That’s really great to hear. :)