I’m not sure what you don’t understand, so I’ll explain a few things in that area and hope I hit the right one:
I give sentences their English names in the example to make it understandable. Here are two ways you could give more detail on the example scenario, each of which is consistent:
“It’s raining” is just the English name for a complicated construct in a database query language, used here to keep the example understandable. It’s connected to the epistemology module because the machine stores its knowledge in that database.
Alternatively, you are the interpreter and I’m the speaker. In that case, English is the interpreter language, and “It’s raining” is literally how you interpret observation1. It’s connected to your epistemology module’s rainfall indicator… somehow? By your knowledge of English? In that example, observation1 might be “Bunthut sent the string ‘Es regnet’” (German for “It’s raining”).
Sentences in the interpreter language are connected to the epistemology engine simply by supposition. The interpreter language is how the interpreter internally expresses its beliefs; otherwise it’s not the interpreter language.
“It’s raining” as a sentence of the interpreter language can’t be taken to mean “2+2=4”, because the interpreter language doesn’t need to be interpreted; the interpreter already understands it. “It’s raining” as a string sent by the speaker can be taken to mean “2+2=4”. It really depends on the prior: if you start out with a prior that’s too wrong, you’ll end up with nonsense interpretations.
I don’t mean the internal language of the interpreter, I mean the external language, the human literally saying “it’s raining.” It seems like there’s some mystery process that connects observations to hypotheses about what some mysterious other party “really means”—but if this process ever connects the observations to propositions that are always true, it seems like that gets most favored by the update rule, and so “it’s raining” (spoken aloud) meaning 2+2=4 (in internal representation) seems like an attractor.
It seems like there’s some mystery process that connects observations to hypotheses about what some mysterious other party “really means”
The hypotheses do that. I said:
We start out with a prior over hypotheses about meaning. Such a hypothesis generates a probability distribution over all propositions of the form “[Observation] means [proposition]” for each observation (including the possibility that the observation means nothing).
Why do you think this doesn’t answer your question?
but if this process always (sic) connects the observations to propositions that are always true, it seems like that gets most favored by the update rule, and so “it’s raining” (spoken aloud) meaning 2+2=4 (in internal representation) seems like an attractor.
The update rule doesn’t necessarily favor interpretations that make the speaker right. It favors interpretations that make the speaker’s meta-statements about meaning right; in the example case the speaker claims to mean true things, so the two coincide.

Still, does the problem not recur on a higher level? For example, a hypothesis that never interpreted the speaker as making such meta-statements would have him never be wrong about that. Wouldn’t it dominate all hypotheses that do ascribe meta-statements? No, because hypotheses aren’t rated individually. If I just took one hypothesis, got its interpretations for all the observations, saw how likely that total interpretation was to make the speaker wrong about meta-statements, and updated based on that, then your problem would occur. But actually, the process for updating a hypothesis also depends on how likely you consider the other hypotheses:
To update on an observation history, first we compute, for each observation in it, our summed prior distribution over what it means. Then, for each hypothesis in the prior, for each observation, take the hypothesis’s distribution over that observation’s meaning, combine it with the prior distribution over all the other observations, and calculate the probability that the speaker’s statements about what he meant were right. After you’ve done that for all observations, multiply the results to get the score of that hypothesis. Multiply each hypothesis’s score by its prior and renormalize.
So if your prior gives most of its weight to hypotheses that interpret you mostly correctly, then the hypothesis that you never make meta-statements will also be judged by its consistency with those mostly correct interpretations.
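To make that procedure concrete, here is a minimal Python sketch. The interface is my own assumption, not anything fixed by the setup above: a hypothesis is any object with a meaning_dist(obs) method returning a dict from candidate meanings (with None for “means nothing”) to probabilities, and meta_statements_right is a caller-supplied check for whether, under a given joint assignment of meanings, the speaker’s statements about what he meant come out right. It enumerates the full joint distribution, so it’s only meant for toy-sized examples.

```python
from collections import defaultdict
from itertools import product


def mixture_over_meanings(prior, observations):
    """For each observation, the prior-weighted (summed) mixture over what it means."""
    mixture = [defaultdict(float) for _ in observations]
    for hyp, p_hyp in prior.items():
        for i, obs in enumerate(observations):
            for meaning, p in hyp.meaning_dist(obs).items():
                mixture[i][meaning] += p_hyp * p
    return mixture


def score_hypothesis(hyp, observations, mixture, meta_statements_right):
    """Product over observations of the probability that the speaker's statements
    about what he meant come out right, using hyp's distribution for the current
    observation and the prior mixture for all the other observations."""
    score = 1.0
    for i, obs in enumerate(observations):
        own = hyp.meaning_dist(obs)  # this observation's meaning, per hyp
        others = [mixture[j] for j in range(len(observations)) if j != i]
        p_right = 0.0
        # Enumerate the joint assignment of meanings (exponential; toy-sized only).
        # meanings[0] is the current observation's meaning, the rest follow in order.
        for combo in product(own.items(), *(m.items() for m in others)):
            p_combo = 1.0
            meanings = []
            for meaning, p in combo:
                p_combo *= p
                meanings.append(meaning)
            if meta_statements_right(meanings):
                p_right += p_combo
        score *= p_right
    return score


def update(prior, observations, meta_statements_right):
    """Score each hypothesis, multiply by its prior weight, and renormalize."""
    mixture = mixture_over_meanings(prior, observations)
    weighted = {
        hyp: p_hyp * score_hypothesis(hyp, observations, mixture, meta_statements_right)
        for hyp, p_hyp in prior.items()
    }
    total = sum(weighted.values())
    return {hyp: w / total for hyp, w in weighted.items()} if total > 0 else dict(prior)
```

The part that matters for the argument is in score_hypothesis: each hypothesis is scored against the prior mixture over the other observations, not against its own total interpretation, which is why the “never ascribe meta-statements” hypothesis doesn’t automatically win.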