I’m not sure what more can be said about “AGIs that cause worse-than-extinction outcomes are clustered around FAIs in design space”. It’s obvious, isn’t it?
I guess I could write about some FAI approaches being more likely to cause worse-than-extinction outcomes than others. For example, FAIs that are closely related to uploading or try to automatically extract values from humans seem riskier in this regard than FAIs where the values are coded directly and manually. But this also seems obvious and I’m not sure what I can usefully say beyond a couple of sentences.
FWIW, it is not obvious to me that superhuman environment-optimizers (e.g. AGIs) that obtain their target values from humans via an automatic process (e.g. uploading or value extraction) are more likely to cause worse-than-extinction outcomes than those whose values are specified manually (e.g. coded directly).