This agent cannot PROVE its own action, except possibly at the maximum proof length, as it would create a contradiction. The algorithm is too unpredictable for that.
An external agent with more understanding COULD prove agent()’s action from this.
The formal statement you’ve written is unclear; certainly, proving that the agent performs the action is not as straightforward as just acting on a moral argument (the agent would need to prove that [it proves the statement before deciding on the action], not just that the statement is true), but I’m not sure there are general barriers against being able to do so (I might be missing something).
Formal statements tend to be unclear when they’re really multi-line proofs in which one’s attempts to place line breaks somehow failed.
The problem is that proving that the agent performs an action also proves that the agent does not perform any other action. From this, the agent can vacuously prove very high utility conditional on one of those other actions, and an agent which proves very high utility given some action takes that action rather than any other.
So: an agent that proves that it takes an action, as long as it has enough time left to extend the proof a bit further, does not take that action.
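For concreteness, here is a minimal runnable sketch of this kind of agent (my own illustration, not code from the thread): the hand-coded theorem sets stand in for a bounded proof search, and agent() is the usual rule of taking the action with the highest utility it can prove conditional on that action.

```python
# A minimal, runnable sketch of the argument above (my own illustration, not
# code from the thread).  The hand-coded theorem sets stand in for a bounded
# proof search over statements of the form "agent()==a implies world()==u".

BIG = 10**26  # stands in for the astronomically large utility in the thread

def provable_conditionals(agent_has_proved_its_own_action):
    """Pairs (action, utility) the agent can prove: agent()==action => world()==utility."""
    theorems = {
        (1, 10),  # agent()==1 => world()==10   (a genuine conditional)
        (2, 5),   # agent()==2 => world()==5    (a genuine conditional)
    }
    if agent_has_proved_its_own_action:
        # Having proved agent()==1, the agent also proves ~(agent()==2), so
        # "agent()==2 => world()==BIG" becomes provable vacuously.
        theorems.add((2, BIG))
    return theorems

def agent(theorems):
    # Decision rule: take the action with the highest provable conditional utility.
    best_action, best_utility = None, float("-inf")
    for action, utility in theorems:
        if utility > best_utility:
            best_action, best_utility = action, utility
    return best_action

print(agent(provable_conditionals(False)))  # -> 1: acting on the genuine proofs
print(agent(provable_conditionals(True)))   # -> 2: the spurious proof flips the
                                            #    choice, contradicting agent()==1
```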
That’s basically the reason consequences appear consistent.
If the agent can change its mind, then naturally it can’t prove that it won’t. I was assuming a situation where the agent is reasoning further after it has already decided on the action, so that further moral arguments are not acted on.
In fact, one is not able to deduce agent()==1 from that. Doing so would create a contradiction, because we would prove

~(agent()==2)
agent()==2 ⇒ world()==100000000000000000000000000
agent()==2
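Spelling out that chain with the justification for each step (my own annotation; N stands for the huge utility above):

```latex
\begin{align*}
&\vdash \neg(\mathtt{agent()==2})
  && \text{from the assumed proof of } \mathtt{agent()==1} \\
&\vdash \mathtt{agent()==2} \rightarrow \mathtt{world()==N}
  && \text{vacuously, since the antecedent has just been refuted} \\
&\vdash \mathtt{agent()==2}
  && \text{the decision rule takes the action with the highest proven utility}
\end{align*}
```

The first and third lines contradict each other, which is why the agent cannot finish a proof of its own action within its proof-length bound.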
Oh, I see. I thought proving things about agent() and deciding what to do could be the same thing, but I was wrong.