I’ve been reading about the difficult problem of building an intelligent agent A that can prove that a more intelligent version of itself, A’, will behave according to A’s values. It made me start wondering: what does it mean when a person “proves” something to themselves or to others? Is it the mental state change that’s important? Or the external manipulation of symbols?
Proof, in this case, means that, using only a restricted set of rules, you can rewrite a set of initial assumptions into the desired conclusion. The rules are required to preserve, every time they are applied, the truth status of the assertions they act on. If the derivation is correct and both agents accept the same logic for describing the environment, then the mental state change should follow as a consequence of the strict symbol manipulation. Note that ‘two agents’ might mean ‘the same agent before and after the derivation’.
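To make the “restricted set of rules” idea concrete, here is a minimal sketch in Python of a proof checker whose only rule is modus ponens. The formula encoding, the function name, and the choice of a single inference rule are illustrative assumptions on my part, not part of any particular agent design:

```python
# A minimal proof checker for propositional logic with one
# truth-preserving rule, modus ponens. Atoms are strings; an
# implication P -> Q is encoded as the tuple ("->", P, Q).

def check_proof(assumptions, steps):
    """Verify that each step is either an assumption, an earlier
    result, or follows from earlier lines by modus ponens."""
    proved = list(assumptions)
    for formula in steps:
        if formula in proved:
            continue  # restating an assumption or an earlier result
        # Modus ponens: from P and ("->", P, Q), conclude Q.
        derivable = any(
            line == ("->", premise, formula)
            for line in proved
            for premise in proved
        )
        if not derivable:
            raise ValueError(f"Step {formula!r} does not follow")
        proved.append(formula)
    return proved

# Example derivation: from A, A -> B, and B -> C, conclude C.
assumptions = ["A", ("->", "A", "B"), ("->", "B", "C")]
proved = check_proof(assumptions, ["B", "C"])
assert "C" in proved
```

The point of the sketch is that the checker never consults meaning: it only verifies that each line was produced by an allowed rule. If every rule application preserves truth, the conclusion inherits the truth of the assumptions, which is exactly the strict symbol manipulation described above.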