It just occurred to me that you could sidestep the question and instead define P_lim(x) directly. For example, you could say something like
P_lim(x) = P(PA+x+y is consistent | PA+y is consistent)
if y is randomly chosen from all statements in PA using a simple length-based prior. This way P_lim(x)=1 if PA proves X, and P_lim(x)=0 if PA disproves X, as expected. Then you could view your original definition as a sort of approximation scheme for this one.
This way of assigning probabilities as “degrees of consistency” doesn’t seem to contain much more information than PA about the “true integers”, but if we just need a prior, something like it could work.
It doesn’t seem to me like your prior needs to encode information about the “true integers” any more than it needs to encode information about the world. It just needs to be flexible enough to see the structure that is there.
Maybe this is what you mean by “if we just need a prior,” but it seems like an important point.
While we’re at it, it’s not even clear our prior needs to capture PA. Maybe you could use something much weaker, perhaps even a theory so weak it could assert its own (cut free) consistency, and then just learn the structure of PA by observing some computations (or however you gain info about logical facts).
It just occurred to me that you could sidestep the question and instead define P_lim(x) directly. For example, you could say something like
P_lim(x) = P(PA+x+y is consistent | PA+y is consistent)
if y is randomly chosen from all statements in PA using a simple length-based prior. This way P_lim(x)=1 if PA proves X, and P_lim(x)=0 if PA disproves X, as expected. Then you could view your original definition as a sort of approximation scheme for this one.
This way of assigning probabilities as “degrees of consistency” doesn’t seem to contain much more information than PA about the “true integers”, but if we just need a prior, something like it could work.
It doesn’t seem to me like your prior needs to encode information about the “true integers” any more than it needs to encode information about the world. It just needs to be flexible enough to see the structure that is there.
Maybe this is what you mean by “if we just need a prior,” but it seems like an important point.
While we’re at it, it’s not even clear our prior needs to capture PA. Maybe you could use something much weaker, perhaps even a theory so weak it could assert its own (cut free) consistency, and then just learn the structure of PA by observing some computations (or however you gain info about logical facts).