If A’s proof system is sound, that directly implies that A won’t generate “spurious” proofs.
Nope. In the example problem from the post, A’s proof system will happily prove statements like “A()==1 implies U()==100”. After all, A() finishes in finite time and returns 2, so any proof system worth its salt will eventually prove that A()==1 is false by simulating A, and a false statement implies anything. The challenge is to not let the agent stumble upon spurious proofs before it returns a value.
To make this post more obvious to people like me and JGWeissman, it would be a good idea to include this in the original post! I came out of this post thinking “Why did I just read that?” and couldn’t figure it out until this response.
Sorry, the omission was due to illusion of transparency on my part. I’d spent so much time thinking about avoiding spurious counterfactuals that I couldn’t imagine other people not seeing that as the most important problem :-)
If the program doesn’t terminate, the proof system may be unable to prove that. But if the program terminates in finitely many steps, any good enough proof system allows for a proof that just simulates all the steps one by one. Of course the program itself won’t reach that particular proof because the proof is too long.
Nope. In the example problem from the post, A’s proof system will happily prove statements like “A()==1 implies U()==100”. After all, A() finishes in finite time and returns 2, so any proof system worth its salt will eventually prove that A()==1 is false by simulating A, and a false statement implies anything. The challenge is to not let the agent stumble upon spurious proofs before it returns a value.
To make this post more obvious to people like me and JGWeissman, it would be a good idea to include this in the original post! I came out of this post thinking “Why did I just read that?” and couldn’t figure it out until this response.
OK, thanks! Included this reasoning in the post.
ahh, good, i did re-read stuff and i was like—did i skip some major piece of it? am i going insane lately?
Now it makes a lot more sense.
Sorry, the omission was due to illusion of transparency on my part. I’d spent so much time thinking about avoiding spurious counterfactuals that I couldn’t imagine other people not seeing that as the most important problem :-)
Not if A includes the proof search! No sound proof system will be able to say anything interesting about programs that use the system itself.
If the program doesn’t terminate, the proof system may be unable to prove that. But if the program terminates in finitely many steps, any good enough proof system allows for a proof that just simulates all the steps one by one. Of course the program itself won’t reach that particular proof because the proof is too long.