This isn’t meant to be an unanswerable question; I suspect it’s answerable. I’m asking whether you have any particular ideas about the mechanics.
Oh, OK. (This was in fact my main confusion about the question.)
One thing is clear to me: in the case that I’ve made a closed-world assumption about my domain (which is what the least-set description of natural numbers is), I should somehow set up the probabilities so that a hypothesis which proves the existence of a domain element with a specific property does not get an a priori chunk of probability mass just for being a hypothesis of a specific length. Ideally, we want to set this up so that it follows “something like” the halting distribution which I mentioned in my previous reply (but of course that’s not computable).
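To state the problem schematically (my notation, just a sketch): a standard length-based prior assigns something like

$$P(h) \;\propto\; 2^{-\ell(h)},$$

where $\ell(h)$ is the length of the hypothesis $h$, so a sentence like exists x. false_at(x) gets a fixed chunk of mass from its length alone, regardless of how implausible every individual witness has become.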
One idea is that such a hypothesis only gets probability mass from specific domain elements with that property. So, for example, if we want to measure the belief that the Goldbach conjecture is false, the probability mass we assign to that must come from the mass we assign to claims like false_at(1000), false_at(1002), et cetera. As I eliminate possibilities like this, the probability of exists x. false_at(x) falls.
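Here’s a toy numerical sketch of the kind of bookkeeping I mean (the geometric weighting and the names instance_prior / prob_exists are placeholders of mine, not a real proposal; the “right” weighting would be something like the halting-style distribution above, which isn’t computable):

```python
# Toy model: "exists x. false_at(x)" gets no mass of its own; its probability
# is whatever mass remains on the specific instances false_at(n) that have
# not yet been refuted by checking Goldbach at n.

def instance_prior(k, decay=0.999, total=0.5):
    """Prior mass on 'Goldbach fails at the k-th even number'.
    Geometric weighting chosen only so the masses sum to roughly `total`."""
    return total * (1 - decay) * decay ** k

def prob_exists(refuted, n_terms=10_000):
    """P(exists x. false_at(x)) = total mass of instances not yet eliminated."""
    return sum(instance_prior(k) for k in range(n_terms) if k not in refuted)

refuted = set()
print(prob_exists(refuted))    # prior belief that Goldbach fails somewhere (~0.5)
refuted.update(range(500))     # verify Goldbach at the first 500 even numbers
print(prob_exists(refuted))    # belief falls (~0.30), with no proof either way
```

The point is just that the existential’s probability is bookkept entirely in terms of the instances, so eliminating instances is the only way it moves.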
This can be formalized by saying that for closed-world domains, as we form a theory, we must have an example before we introduce an existential statement. (I’m assuming that we want to model the probability of a theory as some process which generates theories at random, as in my recent AGI paper.) We might even want to say that we need a specific example before we will introduce any quantifiers. This doesn’t strictly rule out any theories, since we can just write down examples that we don’t yet know to be false before introducing the quantified statements, but it modifies the probabilities associated with them.
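As a very crude illustration of that generation process, here’s a toy sampler (the predicates, probabilities, and cutoffs are arbitrary placeholders of mine):

```python
import random

def generate_theory(max_steps=20, p_stop=0.2, p_quantify=0.3):
    """Toy theory sampler for a closed-world domain: ground instances like
    false_at(n) can be written down freely, but the quantified statement
    'exists x. false_at(x)' may only be appended once the theory already
    contains at least one such instance. No theory is ruled out outright;
    theories asserting the existential are just reweighted by the cost of
    first committing to a specific (not-yet-refuted) witness."""
    theory = []
    have_example = False
    while len(theory) < max_steps and random.random() > p_stop:
        if have_example and random.random() < p_quantify:
            if "exists x. false_at(x)" not in theory:
                theory.append("exists x. false_at(x)")
        else:
            n = 2 * random.randint(2, 10_000)   # pick some even number >= 4
            theory.append(f"false_at({n})")
            have_example = True
    return theory

print(generate_theory())
```

On this picture, the probability of any theory asserting exists x. false_at(x) is bounded by the probability of first generating some specific instance, rather than coming from the quantified sentence’s length alone.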
However, I’m not really confident in that particular trick (I’m not sure it behaves correctly for nested quantifiers, et cetera). I would be more confident if there were an obvious way to generalize this to work nicely with both open-world and closed-world cases (and mixed cases). But I know that it is not possible to generalize “correctly” to full 2nd-order logic (for any definition of “correctly” that I’ve thought of so far).