One concern I have with the advisor idea (which probably doesn’t apply to Eliezer’s reinterpretation, if I understand that correctly, which I might not) is that it’s not clear that extrapolated advisors in a parliament would actually act in the interests of the original agent. For example, they might be selfish and choose something like prolonging their own existence by debating for as long as possible. Or each might trivially argue for the life that would lead the agent to resemble them as closely as possible, on the theory that this would give their own existence more measure (which probably wouldn’t be too bad if the extrapolations are well-chosen, but is likely not the best outcome). Or they might decide that this agent isn’t really them in the first place, and so just try to make the agent’s life as amusing as possible.
A more general statement of the problem: there’s no guarantee that the extrapolation of the agent would optimize anything beneficial to the original agent. In fact, most of the work of coming up with good advice (or good outcomes, as the case may be) is probably being done by the extrapolation/idealization process, if it is being done at all.