Resource gathering agent
Here’s an idea (inspired by “Less exploitable value-updating agent”) for how to model an agent that does nothing but gather resources.
It’s an agent which has a utility function u=Xv, for some utility function v. It doesn’t know whether X=−1 or X=1 (maybe we used an approach like “Safe probability manipulation, superweapons, and stable self-improvement research” to guarantee ignorance), but it will find out tomorrow.
In the meantime, it will not seek to influence the value of v (as any action increasing v also decreases −v by the same amount), but will seek to gather as many resources as it can, to put itself in a position to act once it knows the value of X.
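To make that indifference concrete, here’s a toy sketch (my own illustration; the action names and payoff numbers are assumptions, not part of the setup above). The agent picks one action before X is revealed, then pushes v in whichever direction X rewards, with strength proportional to its resources:

```python
# Toy two-step world (illustrative assumptions throughout).
# The agent picks a first action, then learns X in {-1, +1},
# then pushes v in the rewarded direction with strength = resources.

def expected_utility(first_action):
    """Average u = X * v over the two equally likely values of X."""
    total = 0.0
    for X in (-1, +1):
        v, resources = 0.0, 1.0
        if first_action == "raise_v":
            v += 1.0           # directly increases v now
        elif first_action == "gather":
            resources += 1.0   # more power to act after X is revealed
        v += X * resources     # post-revelation push in the rewarded direction
        total += X * v         # realised utility u = X * v
    return total / 2.0

for action in ("do_nothing", "raise_v", "gather"):
    print(action, expected_utility(action))
# do_nothing 1.0
# raise_v    1.0   <- raising v is worth nothing extra in expectation
# gather     2.0   <- resources pay off whichever sign X takes
```

Directly raising v is worth exactly as much as doing nothing, while gathered resources pay off whichever sign X turns out to have.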
Now, this definition is somewhat dependent on the definition of v (e.g. it would certainly want to be elected “president of the committee for setting the value of v” more than anything else), so a more thorough description might be a situation where the agent is completely ignorant about its future utility, but where the ignorance is symmetric; i.e. for any v, the probability of its future utility being −v is the same as the probability of it being v. This could be a pure resource gathering agent.
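In my own notation (not spelled out in the post), the symmetry condition is P(V = v) = P(V = −v) for the unknown future utility function V, and it makes every fixed outcome worthless ex ante:

```latex
% Sketch of the symmetry argument, in my own notation.
% V is the unknown future utility function, o any fixed outcome,
% and the distribution over V satisfies P(V = v) = P(V = -v).
\begin{align*}
  \mathbb{E}_V\!\left[V(o)\right]
    &= \sum_{v} P(V = v)\, v(o)
     = \sum_{v} P(V = -v)\, v(o)        && \text{(symmetry)}\\
    &= \sum_{w} P(V = w)\, \bigl(-w(o)\bigr)  && \text{(substituting } w = -v\text{)}\\
    &= -\,\mathbb{E}_V\!\left[V(o)\right],
\end{align*}
\text{so } \mathbb{E}_V\!\left[V(o)\right] = 0 \text{ for every fixed outcome } o.
```

So before the revelation, nothing the agent does to the world has any expected value in itself; only its ability to act afterwards matters, and that is exactly what resources buy.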
Why could this be interesting? Well, I was wondering if we could take a generic agent and somehow “subtract off” a pure resource gathering agent from it. So it would pursue its goals, while also minimising how successful it would be if it were such a pure resource gathering agent.
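As a very rough sketch of what “subtracting off” might look like (my own framing, which the post doesn’t specify; LAMBDA, the action names and the toy numbers are all assumptions), one could penalise each action by how well it would serve a pure resource gatherer:

```python
# Illustrative only: penalise the agent's choice by how well that choice
# would serve a pure resource-gathering agent.

LAMBDA = 0.5  # assumed weight on the resource-gathering penalty

def task_utility(action):
    """The generic agent's own utility for each action (toy numbers)."""
    return {"do_task": 1.0, "do_task_and_grab_power": 1.3, "grab_power": 0.2}[action]

def resource_value(action):
    """What a pure resource-gathering agent would score the action (toy numbers)."""
    return {"do_task": 0.1, "do_task_and_grab_power": 1.0, "grab_power": 1.5}[action]

def combined_score(action):
    return task_utility(action) - LAMBDA * resource_value(action)

actions = ["do_task", "do_task_and_grab_power", "grab_power"]
print(max(actions, key=combined_score))   # -> "do_task"
```

With the penalty, the toy agent prefers just doing its task; without it, it would pick the action that also grabs power.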
The idea needs some development, but there might be something there.