Starting with the assumption of utilitarianism, I believe you’re correct. I think the folks working on this stuff assign a low probability to “kill all humans” being Friendly. But I’m pretty sure people aren’t supposed to speculate about the output of CEV.
Probably the proportion of ‘kill all humans’ AIs that are friendly is low. But perhaps the proportion of FAIs that ‘kill all humans’ is large.
That depends on your definition of Friendly, which in turn depends on your values.
Maybe the probability you estimate for that to happen is high, but “proportion” doesn’t make sense, since an FAI is defined as an agent acting on a specific preference, so FAIs have to agree on what to do.
OK, I’m new to this.