If you were already completely certain that you were about to read a valid argument, and then you read that argument, then the meta stuff would completely cancel it out. If you were almost completely certain that you were about to read a valid argument, and then you read it, then the meta stuff would almost (but not completely) cancel it out. This is why reading the same argument twice in a row does not affect your confidence much more than reading it once does. But the less certain you were about the argument’s validity the first time, the more of an effect going over it again should have.
How can this be true when different arguments have different strengths and you don’t know what the statement is? Here, suppose you believe that you are about to read a completely valid argument in support of conventional arithmetic. Please update your belief now.
Here is the statement: “2+2=4”.
What if it instead was Russell’s Principia Mathematica?
But you had assumed that the book would contain extremely strong arguments in favour of Zoroastrianism. Here strong means that P(Zoroastrianism is correct | argument is valid) is big, not that P(argument is valid) is big after reading the argument. (At least this is how I interpret your setting.) Both “all arguments in Principia Mathematica are correct” and “2+2=4” have high probabilities of being true, but P(arithmetic is correct | all arguments in Principia are correct) is much higher than P(arithmetic is correct | 2+2=4).
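(A toy Bayes calculation, with all the numbers invented purely for illustration, of the distinction being drawn here: a piece of evidence can be almost certainly valid and still carry very little weight for the hypothesis, because what matters is how much more likely the evidence is if the hypothesis holds.)

```python
# A toy Bayes calculation with invented numbers, illustrating that what matters
# is P(hypothesis | evidence), not how probable the evidence itself is.

def posterior(p_h, p_e_given_h, p_e_given_not_h):
    """P(H | E) via Bayes' theorem."""
    p_e = p_e_given_h * p_h + p_e_given_not_h * (1 - p_h)
    return p_e_given_h * p_h / p_e

p_h = 0.5  # placeholder prior for the hypothesis being argued for

# Evidence that is almost equally likely either way (like a single trivially
# true statement) barely moves the posterior, even though it is near-certain:
print(posterior(p_h, p_e_given_h=0.99, p_e_given_not_h=0.95))   # ~0.51

# Evidence that is far more likely if the hypothesis holds moves it a lot:
print(posterior(p_h, p_e_given_h=0.99, p_e_given_not_h=0.001))  # ~0.999
```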
We are running into meta issues that are really hard to wrap your head around. You believe that the book is likely to convince you, but it’s not absolutely guaranteed to. Whether it will do so surely depends on the actual arguments used. You’d expect, a priori, that if it argues for an X which is more likely, its arguments would also be more convincing. But until you actually see the arguments, you don’t know that they will convince you. It depends on what they actually are. In your formulation, what happens if you read the book and the arguments do not convince you? Also, what if the arguments fail to convince you only because you expected the book to be extremely convincing: is this different from the case where the same arguments, taken without this meta-knowledge, fail to convince you?
I think I address some of these questions in another reply, but anyway, I will try a detailed description:
Let’s denote the following propositions:
Z = “Zoroastrianism is true.”
B = Some particular, previously unknown, statement included in the book. It is supposed to be evidence for Z. Let this be in the form of propositions so that I am able to assign it a probability (e.g. B shouldn’t be a Pascal-wagerish extortion).
C(r) = “B is compelling to such an extent that it shifts the odds for Z by ratio r”. That is, C(r) = “P(B|Z) = r*P(B|not Z)”.
F = Unknown evidence against Z.
D(r) = “F shifts odds against Z by ratio r.”
Before reading the book
p(Z) is low
I may have a probability distribution for “B = S” (that is, “the convincing argument contained in the book is S”) over the set of all possible S; but if I have it, it is implicit, in the sense that I have an algorithm which assigns p(B = S) to any given S, but I haven’t gone through the whole huge set of all possible S; otherwise the evidence in the book wouldn’t be new to me in any meaningful sense
I have p(S|Z) and p(S|not Z) for all S, implicitly as in the previous case
I can’t calculate the distribution p(C(r)) from p(B = S), p(S|Z) and p(S|not Z), since that would require explicitly calculating p(B = S) for every S, which is out of reach; however
I have obtained p(C(r)) by other means (knowledge of how the book is constructed), and p(C(r)) has most of its mass at pretty high values of r
by the same means I have obtained p(D(r)), which is concentrated at equally high or even higher values of r
Can I update the prior p(Z)? If I knew for certain that C(1,000) is true, I should take it into account and multiply the odds for Z by 1,000. If I knew that D(10,000) is true, I should analogously divide the odds by 10,000. Having probability distributions instead of certainty changes little: calculate the expected value* E(r) for both C and D and use that. If the values for C and D are similar or differ only a little (which is probably what we assume), then the updates approximately cancel out.
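To make the cancellation concrete, here is a minimal sketch with invented placeholder distributions for p(C(r)) and p(D(r)); nothing in it is derived from the actual setting, and using a geometric mean for E(r) follows the footnote below only as one candidate functional.

```python
import numpy as np

# A sketch, under assumed placeholder distributions, of how the expected shift
# from C (evidence for Z) and from D (evidence against Z) can roughly cancel
# before the book is read.

rng = np.random.default_rng(0)

prior_odds_Z = 0.01  # p(Z) is low, so the odds are low (placeholder value)

# Placeholder priors over the shift ratio r, concentrated at high values,
# with D at the same or slightly higher values than C, as assumed in the text.
r_C = rng.lognormal(mean=np.log(1000), sigma=0.5, size=100_000)
r_D = rng.lognormal(mean=np.log(1200), sigma=0.5, size=100_000)

# Expected shift E(r) for each, here taken as a geometric mean
# (i.e. averaging log r); see the footnote for the caveat on this choice.
E_C = np.exp(np.mean(np.log(r_C)))
E_D = np.exp(np.mean(np.log(r_D)))

posterior_odds_Z = prior_odds_Z * E_C / E_D
print(E_C, E_D, posterior_odds_Z)  # the two factors nearly cancel
```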
Now I read the book and find out what B is. This necessarily replaces my prior p(C(r)) by δ(r − R), where R = P(B|Z)/P(B|not Z). It can be that R is higher than the expected value E(r) calculated from p(C(r)); it can be lower too. If it is higher (the evidence in the book is more convincing than expected), I will update upwards. If it is lower, I will update downwards. The odds change by R/E(r). If my expectations of the convincingness of the arguments were correct, that is, R = E(r), then actually learning what B is does not change my probability distribution.
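And a small continuation of the same sketch, with R again just a made-up number, showing that the direction of the final update depends only on how R compares with the prior expectation E(r):

```python
# Continuing the sketch above: after reading the book, p(C(r)) collapses to a
# point mass at R = P(B|Z) / P(B|not Z), and the odds move by the factor R / E(r).

def update_after_reading(odds_before_reading, R, expected_r):
    """Multiply the pre-reading odds by R / E(r); a factor > 1 means updating upwards."""
    return odds_before_reading * (R / expected_r)

E_r = 1000.0        # prior expectation of the shift ratio (placeholder)
odds_before = 0.01  # odds for Z after the pre-reading (cancelling) update

print(update_after_reading(odds_before, R=5000.0, expected_r=E_r))  # more convincing than expected: up
print(update_after_reading(odds_before, R=200.0, expected_r=E_r))   # less convincing than expected: down
print(update_after_reading(odds_before, R=1000.0, expected_r=E_r))  # exactly as expected: unchanged
```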
A potentially confounding factor is that our prior E(r) is usually very close to 1; if somebody tells me that he has a very convincing argument, I don’t really expect a convincing argument, because convincing arguments are rare and people regularly overestimate the breadth of the audience who would find their favourite argument compelling. Therefore I normally need to hear the argument before updating significantly. But in this scenario it is stipulated that we already know in advance that the argument is convincing.
*) I suppose that E(r) here should be the geometric mean of p(C(r)), but it could be another functional; it’s too late here to think about it.
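(As a quick illustration of how much the choice of functional can matter, here is a comparison of the arithmetic and geometric means of r on the same kind of placeholder distribution as above; nothing here settles which one is the right choice.)

```python
import numpy as np

# Compare two candidate functionals for E(r) on a placeholder distribution;
# the footnote deliberately leaves open which one, if either, is correct.
rng = np.random.default_rng(0)
r_samples = rng.lognormal(mean=np.log(1000), sigma=0.5, size=100_000)

arithmetic_mean = np.mean(r_samples)
geometric_mean = np.exp(np.mean(np.log(r_samples)))
print(arithmetic_mean, geometric_mean)  # they differ noticeably for a skewed p(C(r))
```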
I am not sure I completely follow, but I think the point is that you will in fact update the probability up if a new argument is more convincing than you expect. Since the AI can estimate what you expect of it better than you can estimate how convincing the AI will make its argument, it will be able to make every argument more convincing than you expect.
I think you are adding further specifications to the original setting. Your original description assumed that the AI is a very clever arguer who constructs very persuasive deceptive arguments. Now you assume that the AI actively tries to make the arguments more persuasive than you expect. You can stipulate for argument’s sake that the AI can always make a more convincing argument than you expect, but 1) it’s not clear whether that is even possible in realistic circumstances, and 2) it obscures the (interesting and novel) original problem (“is evidence of evidence equally valuable as the evidence itself?”) with a rather standard Newcomb-like mind-reading paradox.