Eliezer,
Ah, so you are a constructivist, perhaps even an intuitionist? Even so, the point of such theorems is that they can happen in a long transient within finite constraints, with the biggie here being the non-connectedness of the support. One can get stuck in a cycle going nowhere for a long time, just as in such phenomena as transient chaos. With a suitably large, but finite, dimensionality and a disconnected support, one can wander in a wilderness with not much serious convergence for a very long time.
I find the idea of a “prior learning” to be a bit weird. It is an agent who learns, although the prior the agent walks in with will certainly play a role in the ability of the agent to learn. But the problem of inertia that I raised has more to do with the nature of agents than with their priors.
Getting to the raison d’être of this blog, the question here is whether bias arises from the nature of the prior an agent brings to a decision or analytical process, or whether the agent’s open-mindedness, its willingness to adjust posteriors in the face of evidence, is more important. Presumably both are playing at least some role.
Hal,
You are being a bad boy. In his earlier discussion Eliezer made it clear that he did not approve of this terminology of “updating priors.” One has posterior probability distributions. The prior is what one starts with. However, Eliezer has also been a bit confusing with his occasional use of such language as a “prior learning.” I repeat, agents learn, not priors, although in his view of the post-human computerized future, maybe it will be computerized priors that do the learning.
The only way one is going to get “wrong learning” at least somewhat asymptotically is if the dimensionality is high and the support is disconnected. Eliezer is right that if one starts off with a prior that is far enough off, one might well have “wrong learning,” at least for a while. But, unless the conditions I just listed hold, eventually the learning will move in the right direction and head towards the correct answer, or probability distribution. At least, that is what Bayes’ Theorem asserts.
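To make the role of the support concrete, here is a minimal toy sketch, not anything Eliezer or Hal proposed. It is one-dimensional, so it leaves out the high-dimensionality part entirely and shows only the simplest version of the support problem: a prior whose support has a gap that happens to contain the truth. The coin with true bias 0.7, the grid of hypotheses, the two priors, and the name `posterior_mean` are all assumptions made up for illustration.

```python
import math

def posterior_mean(grid, prior, heads, tails):
    """Posterior mean of a coin's bias over a discrete grid of hypotheses."""
    # Hypotheses with zero prior mass can never gain mass, so drop them.
    support = [(t, p) for t, p in zip(grid, prior) if p > 0]
    # Work in log space so large samples do not underflow.
    log_post = [
        math.log(p) + heads * math.log(t) + tails * math.log(1 - t)
        for t, p in support
    ]
    peak = max(log_post)
    weights = [math.exp(lp - peak) for lp in log_post]
    return sum(t * w for (t, _), w in zip(support, weights)) / sum(weights)

# Hypothetical setup: candidate biases 0.01, 0.02, ..., 0.99.
grid = [i / 100 for i in range(1, 100)]

# Prior A: far off (mass piled up near 0.1) but positive everywhere.
prior_a = [math.exp(-50 * (t - 0.1) ** 2) for t in grid]

# Prior B: disconnected support, with zero mass on the gap (0.3, 0.9)
# that contains the assumed true bias of 0.7.
prior_b = [1.0 if (t <= 0.3 or t >= 0.9) else 0.0 for t in grid]

# Idealized data: exactly 70% heads at every sample size.
for n in (10, 100, 1000, 10000):
    heads = 7 * n // 10
    tails = n - heads
    mean_a = posterior_mean(grid, prior_a, heads, tails)
    mean_b = posterior_mean(grid, prior_b, heads, tails)
    print(n, round(mean_a, 3), round(mean_b, 3))
```

Under these assumptions the agent with prior A starts out badly wrong but drifts toward 0.7 as the evidence piles up, while the agent with prior B settles on the edge of its support nearest the truth and stays there however much data arrives.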
OTOH, the reference to “deep Bayesianism” raises another issue, that of fundamental subjectivism. There is this deep divide among Bayesians between the ones that are ultimately classical frequentists but who argue that Bayesian methods are a superior way of getting to the true objective distribution, and the deep subjectivist Bayesians. For the latter, there are no ultimately “true” probability distributions. We are always estimating something derived out of our subjective priors as updated by more recent information, wherever those priors came from.
Also, saying a prior should be the known probability distribution, say of cancer victims, assumes that this probability is somehow known. The prior is always subject to how much information the assumer of a prior has when they begin their process of estimation.