I suppose the Bayesian answer to that is that a probability distribution is a description of one’s knowledge, and that in principle every state of knowledge, including total ignorance, can be represented as a prior distribution. In practice, one may not know how to do that. Fundamentalist Bayesians say this is a weakness in our knowledge (we simply don’t yet know how to write down the right prior), while everyone else, from weak Bayesians to Sunday Bayesians, crypto-frequentists, and ardent frequentists, says it’s a weakness of Bayesian reasoning. Not being a statistician, I don’t need to take a view, although I incline against arguments that deduce impossibility from ignorance.
I don’t have any strong disagreements there. But consider: if we can learn well even without assuming any distribution or prior, isn’t that worth exploring? The fact that there is an alternative to Bayesianism—one that we can prove works (in some well-defined settings), and isn’t just naive frequentism—is pretty fascinating, isn’t it?
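To make “prove works (in some well-defined settings)” concrete: for a finite hypothesis class in the realizable setting, the textbook PAC bound says roughly (1/ε)(ln|H| + ln(1/δ)) i.i.d. examples suffice, for every distribution over examples and with no prior over hypotheses. A minimal sketch (the function name and the example numbers are mine, purely illustrative):

```python
import math

def pac_sample_bound(hypothesis_count: int, epsilon: float, delta: float) -> int:
    """Textbook PAC bound, finite hypothesis class, realizable case:
    with m >= (1/epsilon) * (ln|H| + ln(1/delta)) i.i.d. examples, any
    hypothesis consistent with the sample has true error <= epsilon with
    probability >= 1 - delta -- for EVERY distribution over examples,
    and with no prior over the hypotheses."""
    return math.ceil((math.log(hypothesis_count) + math.log(1.0 / delta)) / epsilon)

# e.g. 2^20 hypotheses, 1% error, 99% confidence: under two thousand samples
print(pac_sample_bound(2**20, epsilon=0.01, delta=0.01))  # 1847
```

The point is that the guarantee quantifies over all distributions at once, which is exactly the sense in which no prior is assumed.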
What are you contrasting with learning?
I’m contrasting randomized vs. deterministic algorithms, which Eliezer discussed in your linked article, with Bayesian vs. PAC learning models. The randomized vs. deterministic question shouldn’t really be considered learning, unless you want to call things like primality testing “learning”.
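For concreteness, here is the sort of randomized algorithm I mean, a standard Miller–Rabin primality test (my own illustrative sketch, not anything from the linked article). It comes with a provable per-round error bound, but nothing is being learned from data; the randomness is purely algorithmic:

```python
import random

def is_probable_prime(n: int, rounds: int = 40) -> bool:
    """Miller-Rabin: a randomized algorithm with a provable error bound
    (a composite survives each round with probability < 1/4), but no
    distribution is being learned -- calling this 'learning' would
    stretch the word past usefulness."""
    if n < 2:
        return False
    for p in (2, 3, 5, 7, 11, 13):
        if n % p == 0:
            return n == p
    # write n - 1 as d * 2^s with d odd
    d, s = n - 1, 0
    while d % 2 == 0:
        d //= 2
        s += 1
    for _ in range(rounds):
        a = random.randrange(2, n - 1)
        x = pow(a, d, n)
        if x in (1, n - 1):
            continue
        for _ in range(s - 1):
            x = pow(x, 2, n)
            if x == n - 1:
                break
        else:
            return False  # a is a witness that n is composite
    return True

print(is_probable_prime(2**61 - 1))  # True: a Mersenne prime
```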