Agreed: and if this proceeds on the timelines I’m currently expecting, I’m looking forward to discussing all this with AGIs smarter than me, perhaps later this decade.
Quite possibly, some small number of groups will separately create semi-aligned AGIs with different alignment approaches and somewhat different definitions of alignment. I’m hoping the resulting conflict is a vigorous intellectual debate informed by experimental results, not a war.
I share that hope, but I want to do as much as I can now to ensure that outcome. Highly convincing arguments that a given approach is very likely to lead to catastrophic war might actually push people toward a different approach. If such arguments exist, I want to find them and spread them ASAP, and I see no reason to believe they don't exist. Even decent arguments about the risks might steer people away from the riskiest approaches, or prompt solutions faster.
More specifics on the other thread.