It seems to me that doing that without “losing important aspects of what H values” would lead to something human-like anyway (though maybe not an exact imitation of H), because of complexity of value. Basically after the first step you get human-like entities running on computers. Then they can prevent AI risk and carefully figure out what to do next, same as a team of uploads. So the first step looks strategically similar to uploading, and solving stability for further steps might be unnecessary.
The resulting agent is supposed to be trying to help H get what it wants, but won't generally encode most of H's values directly (it will only encode them indirectly as "what the operator wants").
I agree that Ajeya's description in that paragraph is problematic (though I think the descriptions in the body of the post were mostly fine), and I will probably correct it.
Then I’m not sure I understand how the scheme works. If all questions about values are punted to the single living human at the top, won’t that be a bottleneck for any complex plan?