Read up about Solomonoff induction, by the sound of it. I gave you one link, and here is another one. You will need to use a computable approximation.

I’m familiar with Solomonoff induction. I don’t think it can be used to do what you want it to do (though I’m open to being convinced otherwise), which is why I’m asking you to spell out in detail how you think the highly formal mathematical machinery could be applied in principle to a real-world case like this one. In particular, I’m trying to ascertain how exactly you bridge the gap—in a general way—between the purely syntactic algorithmic complexity of a sequence of English letters and the relative probability of the statement that sequence semantically represents.
There is no “gap-bridging”. Solomonoff induction gives you the probability of a sequence of symbols. In practice it is typically applied to the sense-data streams of agents (such as those produced by a camera or microphone) to estimate their probabilities. Solomonoff induction knows nothing of semantics—it just works on symbol sequences.
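For concreteness, here is the standard definition (nothing in it is specific to this example): fix a universal prefix machine U; the prior probability Solomonoff induction assigns to a finite symbol sequence x is

\[ M(x) = \sum_{p \,:\, U(p) = x\ast} 2^{-|p|}, \]

the total weight of all programs p whose output begins with x. The definition mentions only the symbols of x, never what, if anything, x is about.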
Solomonoff induction knows nothing of semantics—it just works on symbol sequences.
Yes, and that’s the source of the problem I was attempting to get at. Solomonoff induction works on sequences of symbols. Julia Roberts being in New York is not a sequence of symbols, although “Julia Roberts being in New York” is. The correct “epistemic” prior probability of the former is not simply synonymous with the “algorithmic” probability of generating the latter, at least not in the way that “bachelor” is synonymous with “unmarried male.” The question therefore is how the two are related, and it seems like the relationship you’re proposing is that they’re equal. But that’s a really bad rule, I think, because we don’t want the probability of Julia Roberts’ location to vary with our language or orthography. So we need something more sophisticated, which is what I’m asking you for.
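A toy illustration of the worry (a sketch only: compressed length under zlib is a crude, computable proxy for Kolmogorov complexity, and the two “orthographies” below are invented for the example; nothing here is an implementation of Solomonoff induction):

```python
# Sketch: zlib compression length as a crude, computable proxy for
# Kolmogorov complexity K. The "orthographies" are invented for
# illustration purposes only.
import zlib

def proxy_complexity_bits(s: str) -> int:
    """Compressed length in bits -- a rough upper-bound proxy for K(s)."""
    return 8 * len(zlib.compress(s.encode("utf-8"), 9))

claim_english = "Julia Roberts is in New York"
# A gerrymandered orthography that abbreviates "New York" to one glyph:
claim_gerrymandered = claim_english.replace("New York", "Q")

for label, text in [("English", claim_english),
                    ("gerrymandered", claim_gerrymandered)]:
    k = proxy_complexity_bits(text)
    print(f"{label}: ~{k} bits, implied prior ~ 2**-{k}")
```

The implied prior shifts with the spelling, even though the proposition expressed is the same.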
You started out with:

I don’t know of any plausible, objective, truly general methods of calculating priors. Solomonoff induction or whatever isn’t going to help very much.
IMO, Solomonoff induction is pretty plausible, objective and general—though it does inevitably depend on a choice of language, and has the minor issue of being uncomputable. Your objections appear to be expecting too much of it. The point of my first reply was not so much to point at Solomonoff induction, but rather to emphasize all the subsequent updates pertaining to the issue—which in a case like this would swamp the prior.
I definitely think it’s plausible in some cases, particularly certain mathematical ones. However, I don’t see any reason whatsoever to imagine that our meagre updates swamp the prior for something like Julia Roberts’ location across most/all languages.
Well, the Solomonoff prior is pretty hopeless in this case. On the other hand, many people know what language Julia Roberts speaks, where she is from, and how big and celebrity-friendly the candidate cities are. Experience gives us most of the information on this issue, I feel.
Is there some reason to suspect there isn’t some crazy, gerrymandered orthography such that those facts don’t swamp the priors? Or that, in general, for any two incompatible claims X and Y together with our evidence E, there aren’t two finitely specified orthographies which 1. differ in the relative algorithmic prior probabilities of the translations of X and Y into the orthographies and 2. have this difference survive conditionalizing on E? Because if so, we’re still stuck with a really nasty relativism if Solomonoff is the last word on priors.
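One natural way to make the challenge precise: writing P_1 and P_2 for the algorithmic priors induced by the two orthographies, the question is whether

\[ \frac{P_1(X)}{P_1(Y)} > 1 > \frac{P_2(X)}{P_2(Y)} \quad\text{and}\quad \frac{P_1(X \mid E)}{P_1(Y \mid E)} > 1 > \frac{P_2(X \mid E)}{P_2(Y \mid E)}, \]

i.e. whether the two encodings can disagree about which claim is the more probable, with the disagreement surviving conditionalization on E.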
Is there some reason to suspect there isn’t some crazy, gerrymandered orthography such that those facts don’t swamp the priors?
There are certainly pathological reference machines—but that is only an issue if people use them.
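For the record, such machines are easy to construct (a standard textbook trick; U, V and x_0 are placeholders here): given any universal machine U and any target string x_0, define a machine V by

\[ V(1) = x_0, \qquad V(0p) = U(p). \]

V is still universal, since it simulates U at the cost of one prefixed bit, yet K_V(x_0) \le 1, so V’s prior gives x_0 weight at least 2^{-1} no matter how complex x_0 looks to U.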
Because if so, we’re still stuck with a really nasty relativism if Solomonoff is the last word on priors.
Well, I already agreed that Solomonoff induction depends on a choice of language. There are not too many arguments over this, though—people can usually agree on some simple reference machine.
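The standard result behind this back-and-forth is the invariance theorem: for any two universal machines U and V there is a constant c_{U,V}, not depending on x, such that

\[ |K_U(x) - K_V(x)| \le c_{U,V} \quad \text{for all } x, \]

so any two reference machines assign priors agreeing up to a fixed multiplicative factor of at most 2^{c_{U,V}}. The catch, as the construction above illustrates, is that nothing bounds c_{U,V} itself.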
It seems like you’re saying that, pragmatically speaking, it’s not a problem if we all settle on the same set of formalisms. But I don’t see how that’s relevant to my point, which is that there are no real objective constraints on the formalism we use, and what’s more, any given formalism could lead to virtually any prior between 0 and 1 for any proposition. So, as I said earlier, Solomonoff doesn’t help very much in objectively guiding our priors. We could just dispense with this Solomonoff business entirely and say, “The problem of priors isn’t an issue if we all just arbitrarily choose the same priors!”
Sure there are. Use a sufficiently far-out reference machine and things go haywire, and you no longer get a useful implementation of Occam’s razor.

Not really: in many cases, if the proposition and language are selected, everyone agrees on the result.

Solomonoff induction is just a formalisation of Occam’s razor, which, IMO, is very useful for selecting priors.
Key word there being “useful.” “Useful” doesn’t translate to “objectively correct.” Lots of totally arbitrarily set priors are useful, I’m sure, so if that’s your standard, then this whole discussion is again redundant. Anyway, the fact that Occam’s razor-as-we-intuit-it falls out of one arbitrary configuration of the parameters (reference machine, language and orthography) of the theory isn’t in itself evidence that the theory is amazingly useful, or even particularly true. It could just be evidence that the theory is particularly vulnerable to gerrymandering, and could theoretically be configured to support virtually anything. There is, I believe, a certain polynomial inequality that characterizes the set of primes. But that turns out not to be so interesting, since every recursively enumerable set of integers corresponds to a similar such equation.
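For reference, the result alluded to: Jones, Sato, Wada and Wiens (1976) exhibit a polynomial in 26 variables whose positive values, as the variables range over the nonnegative integers, are exactly the primes. By the MRDP theorem this is unremarkable in just the sense claimed:

\[ S \subseteq \mathbb{N} \text{ is recursively enumerable} \iff S = \{\, x : \exists \vec{y}\ \, p(x, \vec{y}) = 0 \,\} \text{ for some polynomial } p \text{ with integer coefficients.} \]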