Speaking of Chapter 1, it seems essential to point out another issue that may be unclear on a superficial reading.
The author introduces the notion of a reasoning “robot” that maintains a consistent set of “plausibility” values (probabilities) according to a small set of rules.
To a modern reader, this may give the impression that the author is proposing a practical algorithm, or an implementation of an artificial intelligence, that uses Bayesian inference as its reasoning process.
I think this misses the point completely. First: it is clear that consistently maintaining such a system of probability values, even for a set of simple Boolean formulas, amounts to solving SAT problems and is therefore computationally infeasible in general.
Rather, the author’s purpose in introducing the “robot” was to avoid the misconception that the plausibility desiderata are subjective, inaccurate notions that depend on hidden features of the human mind. By detaching the inference rules from the human mind and attributing them to an idealized “robot”, the author argues that these axioms and their consequences can and should be studied mathematically, independently of all other features and aspects of human thinking and rationality.
So the objective here was not to build an intelligence, but rather to study an abstract and computationally unconstrained version of intelligence obeying the above principles alone.
Such an AI will never be realized in practice (due to inherent complexity limitations, and here I am not just speaking about P != NP!). Still, if we can prove what this theoretical AI would have to do in certain specific situations, then we can learn important lessons about the above principles, or even guide our own decisions by the insights gained from that study.
I agree that Jaynes is using the robot as a literary device to get a point across.
If I understood you correctly, it seems you’re sneaking in an additional claim that a Bayesian AI is theoretically impossible due to computational concerns. That should be discussed separately, but the obvious counterargument is that while, say, exact inference in Bayes nets has been proved intractable, approximate inference does well on good-sized problems, and approximate does not mean it’s not Bayesian.
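To make the “approximate but still Bayesian” point concrete, here is a minimal sketch of likelihood weighting on a toy two-node network; the structure, names and numbers are invented for illustration and are not taken from the book or from this discussion.

```python
import random

# Toy Bayes net: Rain -> WetGrass (all numbers are made-up illustrations).
P_RAIN = 0.2
P_WET_GIVEN_RAIN = {True: 0.9, False: 0.1}

def likelihood_weighting(n_samples=100_000, evidence_wet=True):
    """Estimate P(Rain | WetGrass = evidence_wet) by likelihood weighting."""
    weighted_rain, total_weight = 0.0, 0.0
    for _ in range(n_samples):
        rain = random.random() < P_RAIN               # sample the non-evidence variable
        p_wet = P_WET_GIVEN_RAIN[rain]
        weight = p_wet if evidence_wet else 1 - p_wet  # weight by likelihood of the evidence
        total_weight += weight
        if rain:
            weighted_rain += weight
    return weighted_rain / total_weight

print(likelihood_weighting())  # ~0.69, i.e. 0.2*0.9 / (0.2*0.9 + 0.8*0.1)
```

The estimate converges on the exact posterior without ever enumerating the joint distribution, which is the sense in which “approximate” can still be perfectly Bayesian.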
Sorry, I never meant to imply that an AI built on Bayesian principles is impossible or even a bad idea. (Using Bayesian inference is probably a fundamentally good idea.)
I just tried to point out that easy-looking principles don’t necessarily translate into practical implementations in a straightforward manner.
What then do you make of Jaynes’s observation in the Comments: “Our present model of the robot is quite literally real, because today it is almost universally true that any nontrivial probability evaluation is performed by a computer”?
In my reading it means that there are already actual implementations of all the probability-inference operations that the author considers in the book.
This was probably already true in the ’60s. It does not mean that the robot as a whole is feasible in terms of resources.
An analogy: it is not hard to implement all the (non-probabilistic) logical derivation rules. It is also straightforward to use them to enumerate all provable mathematical theorems (e.g. within ZFC). However, this does not imply that we have a practical (i.e. efficient) general-purpose theorem prover. It gives an algorithm that eventually proves every provable theorem, but its run-time makes the approach practically useless.
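As a toy, runnable stand-in for that analogy (propositional tautologies in place of ZFC theorems, since a real proof enumerator would not fit in a comment), one can enumerate formulas by size and keep the ones true under every assignment: complete in the sense that every tautology eventually appears, and practically useless for exactly the reason above.

```python
from itertools import product

VARS = ("p", "q")

def formulas(max_size):
    """Enumerate propositional formulas over VARS up to a given size, smallest first."""
    by_size = {1: [(v,) for v in VARS]}                # atoms
    for size in range(2, max_size + 1):
        new = [("not", sub) for sub in by_size[size - 1]]
        for left in range(1, size - 1):                # binary connectives
            for a in by_size[left]:
                for b in by_size[size - 1 - left]:
                    new.append(("or", a, b))
                    new.append(("and", a, b))
        by_size[size] = new
    return [f for size in sorted(by_size) for f in by_size[size]]

def evaluate(f, env):
    if len(f) == 1:                                    # atom
        return env[f[0]]
    if f[0] == "not":
        return not evaluate(f[1], env)
    a, b = evaluate(f[1], env), evaluate(f[2], env)
    return (a or b) if f[0] == "or" else (a and b)

def is_tautology(f):
    return all(evaluate(f, dict(zip(VARS, vals)))
               for vals in product([False, True], repeat=len(VARS)))

# Complete but hopeless: every tautology appears eventually,
# but the number of candidate formulas explodes with size.
tautologies = [f for f in formulas(5) if is_tautology(f)]
print(len(tautologies), tautologies[0])
```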
I assume you mean it in the sense that deciding satisfiability of arbitrary propositions (over uncertain variables; certainly true/false ones can be simplified out) is NP-complete. By “uncertain” I of course mean that 0 < p(v) < 1.
Actually, solving SAT problems is just the simplest case. Even so, if you also have certain variables (with plausibility 0 or 1), it is still NP-complete; you can’t just simplify them out in polynomial time. [EDIT: This is wrong, as Jonathan pointed out.]
In the extreme case, since we also have the rule that the “robot” has to use all the available information to the fullest extent, the “robot” must be insanely powerful. For example, if the calculation of some plausibility value depends on the correctness of an algorithm (known to the “robot” with very high probability), then it will have to be able to solve the halting problem in general.
Even if you constrain your probability values so that nothing is ever certain or impossible, you can always choose values small (or large) enough that computing the probabilities can be used to solve the discrete version of the problem.
For example, in the simplest case: if you just have a set of propositions (let us say in conjunctive normal form), the consistency desideratum implies that the “robot” can solve SAT problems, even if the starting plausibility values for the literals lie in the open interval (0, 1).
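Here is a hedged sketch of that reduction (a brute-force toy of my own, not anything from the book): give each variable an independent plausibility strictly between 0 and 1; the exact probability that a CNF formula holds is then positive precisely when the formula is satisfiable, so a robot that can always compute such probabilities exactly can decide SAT.

```python
from itertools import product

def cnf_probability(clauses, p):
    """Exact P(formula true) for a CNF over independent variables.
    clauses: list of clauses, each a list of signed ints (3 means x3, -3 means not x3).
    p: dict mapping variable index -> P(variable is true), all strictly in (0, 1).
    Brute force over all 2^n assignments -- which is exactly the infeasibility in question."""
    variables = sorted({abs(lit) for clause in clauses for lit in clause})
    total = 0.0
    for values in product([False, True], repeat=len(variables)):
        assignment = dict(zip(variables, values))
        if all(any(assignment[abs(l)] == (l > 0) for l in clause) for clause in clauses):
            weight = 1.0
            for v in variables:
                weight *= p[v] if assignment[v] else 1 - p[v]
            total += weight
    return total

clauses = [[1, 2], [-1, 2], [1, -2]]          # satisfied only by x1 = x2 = True
p = {1: 0.5, 2: 0.5}
print(cnf_probability(clauses, p) > 0)        # True iff the CNF is satisfiable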
I think you misunderstood. The robot has a real number p(v) for every v. Let’s grant an absolute min and max of 0 and 1. My point was simply that when p(v) = 0 or p(v) = 1, v can be simplified out of any proposition that uses it.
I understand why computing the probability of a proposition implies answering whether it’s satisfiable.
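For what it’s worth, here is a small toy sketch of the simplification I have in mind (my own illustration, not anything from the book): with clauses as lists of signed integers, a literal whose variable has p(v) = 1 or p(v) = 0 either satisfies its clause outright or can simply be dropped from it, in a single linear pass.

```python
def simplify_cnf(clauses, certain):
    """Remove variables with p(v) = 0 or p(v) = 1 from a CNF.
    clauses: list of clauses, each a list of signed ints (3 means x3, -3 means not x3).
    certain: dict mapping variable index -> True (p=1) or False (p=0).
    Returns the simplified clause list; an empty clause means the CNF is unsatisfiable."""
    simplified = []
    for clause in clauses:
        kept = []
        satisfied = False
        for lit in clause:
            var = abs(lit)
            if var in certain:
                if certain[var] == (lit > 0):   # literal is certainly true
                    satisfied = True            # so the whole clause goes away
                    break
                # otherwise the literal is certainly false: just drop it
            else:
                kept.append(lit)
        if not satisfied:
            simplified.append(kept)
    return simplified

print(simplify_cnf([[1, 2], [-1, 3]], {1: True}))   # -> [[3]]
```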
Sorry for the confusion; I was being superficial. Of course, you are correct that those values can be simplified out.