Thanks. Sure, I’m always happy to update on new arguments and evidence. The most likely way I see possibly updating is to realize the gap between current AIs and human intelligence is actually much larger than it currently seems, e.g. 50+ years as Robin seems to think. Then AI alignment research has a larger chance of working.
I also might lower P(doom) if international govs start treating this like the emergency it is and do their best to coordinate to pause. Though unfortunately even that probably only buys a few years of time.
Finally I can imagine somehow updating that alignment is easier than it seems, or less of a problem to begin with. But the fact that all the arguments I’ve heard on that front seem very weak and misguided to me, makes that unlikely.
I think it would be very interesting to see you and @TurnTrout debate with the same depth, preparation, and clarity that you brought to the debate with Robin Hanson.
Edit: Also, tentatively, @Rohin Shah because I find this point he’s written about quite cruxy.
My position is “goal-directedness is an attractor state that is incredibly dangerous and uncontrollable if it’s somewhat beyond human-level in the near future”.
The form of those arguments seems to be like “technically it doesn’t have to be”. But realistically it will be lol. Not sure how much more there will be to say.
Thanks. Sure, I’m always happy to update on new arguments and evidence. The most likely way I see possibly updating is to realize the gap between current AIs and human intelligence is actually much larger than it currently seems, e.g. 50+ years as Robin seems to think. Then AI alignment research has a larger chance of working.
I also might lower P(doom) if international govs start treating this like the emergency it is and do their best to coordinate to pause. Though unfortunately even that probably only buys a few years of time.
Finally I can imagine somehow updating that alignment is easier than it seems, or less of a problem to begin with. But the fact that all the arguments I’ve heard on that front seem very weak and misguided to me, makes that unlikely.
I think it would be very interesting to see you and @TurnTrout debate with the same depth, preparation, and clarity that you brought to the debate with Robin Hanson.
Edit: Also, tentatively, @Rohin Shah because I find this point he’s written about quite cruxy.
I’m happy to have that kind of debate.
My position is “goal-directedness is an attractor state that is incredibly dangerous and uncontrollable if it’s somewhat beyond human-level in the near future”.
The form of those arguments seems to be like “technically it doesn’t have to be”. But realistically it will be lol. Not sure how much more there will be to say.