Let’s say we decided that we’d mostly given up on fully aligning AGI, and had instead decided to find a lower bound for the value of the future universe, given that someone would create AGI anyway. Let’s also assume this lower bound was something like “Here we have a human in a high-valence state. Just tile the universe with copies of this volume (the one where the human resides), from this point in time to that other point in time.” I understand that this is not a satisfactory solution, but bear with me.
How much easier would the problem become? It seems easier than building a pivotal-act AGI.
Things I know that will still make this hard:
Inner alignment
Ontological crises
Wireheading
Things we don’t have to solve:
Corrigibility
Low impact (although if we had a solution for low impact, we might just try to tack it onto the resulting agent and find out whether it works)
Value learning
You may get massive s-risk for comparatively little potential benefit with this. On many people’s values, the future you describe may not be particularly good anyway, and there’s an increased risk of something going wrong, because you’d be attempting a desperate effort with something you don’t fully understand.
Ah, I forgot to add that this is a potential s-risk. Yeah.
Although I disagree that that future would be close to zero in value. My values tell me it would be at least a millionth as good as the optimal future, and at least a million times more valuable than a completely consciousness-less universe.
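To make that comparison slightly more concrete (my notation, not part of the original comment): writing $V(\cdot)$ for the value assigned to a future, the claim is roughly

$$V(\text{tiled}) \;\ge\; 10^{-6}\,V(\text{optimal}) \qquad\text{and}\qquad V(\text{tiled}) \;\ge\; 10^{6}\,V(\text{empty}),$$

where the second inequality only carries content if the consciousness-less baseline is treated as a small positive quantity rather than exactly zero. On the first inequality alone, guaranteeing the tiled outcome would secure at least one millionth of the attainable value, rather than approximately none of it.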