So the person in the thought experiment doesn’t expect to agree with a book’s conclusion, before reading it.
No, he expects that if he reads the book, his posterior belief in the proposition is likely going to be high. But his current prior belief in the truth of the proposition is low.
Also, as I made clear in my update, the AI is not perfect, merely very good. I only need it to be good enough for the whole episode to go through, i.e. good enough that you cannot argue that a rational person would never believe in Z after reading the book and that my story is therefore implausible.
So, in other words, the person is expecting to be persuaded by something other than the truth, perhaps on the basis that the last N times he read one of these books, it changed his mind.
In that case, it is no different from the person stepping into a brain-modification booth and having his mind altered directly, because a rational person simply would not be conned by this process. He would notice that he now believes in the existence of the flying spaghetti monster, that he has just read a book on the flying spaghetti monster prepared by a superintelligent AI which he had asked to produce ultra-persuasive but entirely biased collections of evidence for him, and that he did not formerly believe in the flying spaghetti monster. On this basis he would conclude that his belief probably has no basis in reality, i.e. is inaccurate, and stop believing in it with such high probability.
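A minimal sketch of the Bayesian bookkeeping behind this intuition, with purely illustrative numbers (a 1% prior in the proposition Z, and a book engineered to persuade whether or not Z is true):

```python
# Toy Bayesian update (illustrative numbers, not from the original discussion):
# how much should "I was persuaded by the book" move a rational reader's belief in Z,
# given that he knows the book was engineered to persuade him regardless of Z's truth?

prior_z = 0.01                    # he currently finds Z very implausible
p_persuaded_given_z = 0.99        # the book persuades him if Z is true
p_persuaded_given_not_z = 0.99    # ...and just as reliably if Z is false

evidence = (p_persuaded_given_z * prior_z
            + p_persuaded_given_not_z * (1 - prior_z))
posterior_z = p_persuaded_given_z * prior_z / evidence

print(posterior_z)  # ~0.01: a likelihood ratio of 1 leaves the prior untouched
```

Since being persuaded is, by his own lights, about equally likely whether or not Z is true, the likelihood ratio is roughly 1 and the honest posterior stays near the prior; the only way his belief ends up high anyway is if whatever the book does to him is not a Bayesian update at all.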
If we are to accept that the AI is good enough to prevent this happening—a necessary premise of the thought experiment—then it must be preventing the person from being rational in this way, perhaps by including statements in the book that in some extraordinary way reprogram his mind via some backdoor vulnerability. Let’s say, perhaps, that the person is an android created by the AI for its own amusement, one which responds to certain phrases with massive anomalous changes in its brain wiring. That is simply the only way I can accept the premises that:
a) the person applies Bayes’s theorem properly (if this is not true, then he is simply not “mentally consistent” as you said)
b) he is aware that the books are designed to persuade him with high probability
c) he believes that the propositions to be proven in the books are untrue in general
d) he believes with high probability that the books will persuade him
which, unless I am very much mistaken, are equivalent to your statements of the problem.
If reading the book is not essentially equivalent to knowingly submitting to brain modification in order to believe something, then one of the above is untrue, i.e. the premises are inconsistent and the thought experiment can tell us nothing.
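To make the tension explicit, here is the conservation-of-expected-evidence bound written out (the figures 0.01 and 0.9 are purely illustrative, not taken from the original exchange). For a proper Bayesian the prior must equal the expected posterior:

```latex
% Conservation of expected evidence: the prior equals the expectation of the posterior.
P(Z) \;=\; \mathbb{E}\left[ P(Z \mid \text{book}) \right]
% Premise (c): the prior is low, say P(Z) = 0.01 (illustrative).
% Premise (d): with probability q, the book leaves him with P(Z \mid \text{book}) \ge 0.9.
% Since the posterior is non-negative,
0.01 \;=\; \mathbb{E}\left[ P(Z \mid \text{book}) \right] \;\ge\; 0.9\,q
\quad\Longrightarrow\quad q \;\le\; \tfrac{1}{90} \approx 0.011
% So (a), (c) and (d) cannot all hold for a genuine Bayesian update.
```

In other words, a consistent Bayesian with a low prior cannot also expect, with high probability, to end up nearly certain; if he does end up there, whatever the book did to him was not an update on evidence.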
Remember that you are trying to prove that “one can really intentionally deceive oneself and be in a mentally consistent (although weird) state”. I accept that there is nothing mentally inconsistent about submitting to have one’s beliefs changed by brain surgery in one’s sleep. But while “intentionally deceiving oneself” is just three words that could be applied to any referent, I don’t think that your apparent referent is what Eliezer was talking about in the post you linked to. So you haven’t refuted him.
“Intentionally deceiving oneself” in his discussion means “deciding that one should believe something, and then (using the mundane tools available to us now, like reading books, chanting mantras, going to church, meditating, etc.) forcing oneself to believe it”. This may be possible in the trivial sense that 0 is not a probability, but in a practical sense it is basically “impossible”, and that is all that Eliezer was arguing.
I’m sure Eliezer and anyone else would agree that it is possible to be an ideal Bayesian and step into a booth in order to have oneself modified to believe in the flying spaghetti monster. It does seem to me that in order for the booth to work, it is going to have to turn you into an irrational non-Bayesian, erase all of your memories of these booths or install false beliefs about them, and then implant barriers in your mind to prevent the rest of your brain from changing the new belief. It seems like a very difficult problem to me – but then we are talking about a superintelligent AI! In fact I expect that you’d need to be altered so much that you couldn’t even expect to be approximately the same person after leaving the booth.
Incidentally, this reminds me of a concept discussed in Greg Egan’s book “Quarantine”, which you might find interesting.
EDIT:
On re-reading, I see that the modification process as I described it doesn’t actually uphold the premises of your thought experiment, because only one iteration of book-reading could occur before the person is no longer “mentally consistent”, i.e. rational, and he can’t ever read more than one of the books either (since his beliefs about, or knowledge of, the books themselves have been changed—which is not what he asked of the AI). So in order for the premises to be consistent, the book-programming brain surgery would have to wipe his mind completely and build a set of experiences from scratch, so as to make the Universe seem consistent with evidence for the flying spaghetti monster without having to turn him into a non-Bayesian. The person would have to have evidence that the AI is clever enough that he should believe it will be able to make books that persuade him of anything. And the AI would probably reset his mind to a point at which he believes that he has not actually read any of the books yet.
What if the person realises that this exact scenario might already have happened? If the person was aware of the existence of this AI, and aware that he was in the business of asking it to do things liable to change his mind, I don’t suppose that the line of reasoning I have outlined here would be hard for him to arrive at. This would be likely to undermine his belief in the reality of his life experiences in general, lowering his degree of belief in any particular deity. I suppose the easiest way around this would be for the AI to make him sufficiently unintelligent that he never comes to suspect this, but just barely capable of understanding the idea of a really smart being that can make “books” to persuade him of things (bearing in mind that, according to the premises of the thought experiment, he has to be mentally consistent, i.e. Bayesian, and cannot have arbitrary barriers erected inside his mind).
It seems that this thought experiment has turned out to be an example of the hidden complexity of wishes!