Consider proposing the most naïve formula for logical correlation[1].
Let a program $p$ be a tuple of code for a Turing machine, intermediate tape states after each command execution, and output, all in binary.
That is, $p=(c,t,o)$, with $c\in\{0,1\}^+$, $t\in(\{0,1\}^+)^+$ and $o\in\{0,1\}^+$.
Let $l=|t|$ be the number of steps that $p$ takes to halt.
Then a formula for the logical correlation 合[2] of two halting programs $p_1=(c_1,t_1,o_1)$, $p_2=(c_2,t_2,o_2)$, a tape-state discount factor $\gamma$[3], and a string-distance metric $d:\{0,1\}^+\times\{0,1\}^+\rightarrow\mathbb{N}$ could be

$$合(p_1,p_2,\gamma)=d(o_1,o_2)-\frac{1}{2}+\sum_{k=0}^{\min(l_1,l_2)}\gamma^k\cdot d(t_1(l_1-k),t_2(l_2-k))$$
The lower 合, the higher the logical correlation between $p_1$ and $p_2$. The minimal value is $-0.5$.
If $d(o_1,o_2)<d(o_1,o_3)$, then it’s also the case that $合(p_1,p_2,\gamma)<合(p_1,p_3,\gamma)$.
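As a minimal sketch, here is the formula in Python, assuming a padded Hamming distance as a stand-in for the unspecified metric $d$, and aligning both traces at their halting step (with the sum’s endpoint clipped to stay within 0-indexed bounds):

```python
from itertools import zip_longest

def hamming(a: str, b: str) -> int:
    # Stand-in for the string-distance metric d: positionwise mismatches,
    # padding the shorter string with a sentinel character.
    return sum(x != y for x, y in zip_longest(a, b, fillvalue="_"))

def logical_correlation(p1, p2, gamma: float, d=hamming) -> float:
    # 合(p1, p2, γ): lower values mean stronger logical correlation.
    # A program is a tuple (code, trace, output); the trace is the list
    # of tape states after each command execution, all as bitstrings.
    _c1, t1, o1 = p1
    _c2, t2, o2 = p2
    l1, l2 = len(t1), len(t2)
    # Align both traces at their halting step and walk backwards,
    # discounting earlier tape states by γ^k.
    trace_term = sum(
        gamma ** k * d(t1[l1 - 1 - k], t2[l2 - 1 - k])
        for k in range(min(l1, l2))
    )
    return d(o1, o2) - 0.5 + trace_term

# Two identical (made-up) programs attain the minimal value -0.5:
p = ("0110", ["1", "10", "101"], "101")
assert logical_correlation(p, p, gamma=0.9) == -0.5
```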
One might also want to be able to deal with the fact that programs have different trace lengths, and penalize that, e.g. by amending the formula:

$$合'(p_1,p_2,\gamma)=合(p_1,p_2,\gamma)+2^{|l_1-l_2|}$$
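Continuing the sketch above, the amendment just adds the trace-length penalty on top:

```python
def logical_correlation_prime(p1, p2, gamma: float, d=hamming) -> float:
    # 合′: 合 plus a penalty that grows with the trace-length mismatch.
    l1, l2 = len(p1[1]), len(p2[1])
    return logical_correlation(p1, p2, gamma, d) + 2 ** abs(l1 - l2)
```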
I’m a bit unhappy that the code $c$ doesn’t factor into the logical correlation, and ideally one would want to be able to compute the logical correlation without having to run the program.
How does this relate to data=code?
Actually not explained in detail anywhere, as far as I can tell. I’m going to leave out all motivation here.
Suggested by GPT-4. Stands for joining, combining, uniting. Also “to suit; to fit”, “to have sexual intercourse”, “to fight, to have a confrontation with”, or “to be equivalent to, to add up”.
Which is needed because tape states close to the output are more important than tape states early on.
What do you want to use the logical correlation measure for?
I don’t have a concrete use for it yet.
> Ideally one would want to be able to compute the logical correlation without having to run the program.

I think this isn’t possible in the general case. Consider two programs, one of which is “compute the sha256 digest of all 30 byte sequences and halt if the result is 9a56f6b41455314ff1973c72046b0821a56ca879e9d95628d390f8b560a4d803” and the other of which is “compute the md5 digest of all 30 byte sequences and halt if the result is 055a787f2fb4d00c9faf4dd34a233320”.

Any method that was able to compute the logical correlation between those would also be a program which, at a minimum, reverses all cryptographic hash functions.
How close would this rank a program $p$ with a universal Turing machine simulating $p$? My sense is not very close, because the “same” computation steps on each program don’t align.
My “most naïve formula for logical correlation” would be something like: put a probability distribution on binary string inputs, treat $p_1$ and $p_2$ as random variables $\{0,1\}^*\rightarrow\{0,1\}^*\cup\{\bot\}$, and compute their mutual information.
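A sketch of that proposal in Python, under illustrative assumptions: a uniform distribution over all inputs up to a fixed length, and runner functions that return None (standing in for $\bot$) when a program does not halt within some step budget:

```python
from collections import Counter
from itertools import product
from math import log2

def mutual_information(run1, run2, max_len: int = 8) -> float:
    # run1, run2: map an input bitstring to an output bitstring, or to
    # None (standing in for ⊥, e.g. when a step budget is exceeded).
    # Uses a uniform distribution over all inputs of length <= max_len.
    inputs = ["".join(bits) for n in range(max_len + 1)
              for bits in product("01", repeat=n)]
    joint = Counter((run1(x), run2(x)) for x in inputs)
    total = sum(joint.values())
    marg1, marg2 = Counter(), Counter()
    for (y1, y2), c in joint.items():
        marg1[y1] += c
        marg2[y2] += c
    # I(p1; p2) = sum over output pairs of p(y1,y2) * log2(p(y1,y2) / (p(y1) p(y2)))
    return sum(
        (c / total) * log2(c * total / (marg1[y1] * marg2[y2]))
        for (y1, y2), c in joint.items()
    )
```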
The strongest logical correlation is $-0.5$; the lower, the better.
For $p$ and $\text{sim}(p)$, the logical correlation would be $-\varepsilon$, assuming that $p$ and $\text{sim}(p)$ have the same output. This is a pretty strong logical correlation.
This is because equal output guarantees a logical correlation of at most 0, and one can then improve the logical correlation by also having similar traces. If the outputs have string distance 1, then the smallest logical correlation can be only $0.5$.
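As a quick check of these bounds against the formula above (using $d(x,x)=0$, as for any metric): with identical traces and outputs,

$$合(p,p,\gamma)=0-\frac{1}{2}+\sum_{k}\gamma^k\cdot 0=-\frac{1}{2},$$

while with outputs at string distance 1, the trace sum is at best $0$, so

$$合(p_1,p_2,\gamma)\geq 1-\frac{1}{2}+0=\frac{1}{2}.$$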
Could you say more about the motivation here?
Whenever people have written/talked about ECL (evidential cooperation in large worlds), a common thing I’ve read/heard is that “of course, this depends on us finding some way of saying that one decision algorithm is similar/dissimilar to another one, since we’re not going to encounter the case of perfect copies very often”. This was at least the case when I last asked Oesterheld about this, but I haven’t read Treutlein 2023 closely enough yet to figure out whether he has a satisfying solution.
The fact that we didn’t have a characterization of logical correlation bugged me and was in the back of my mind, since it felt like a problem one could make progress on. Today in the shower I was thinking about this, and the post above is what came of it.
(I also have the suspicion that having a notion of “these two programs produce the same/similar outputs in a similar way” might be handy in general.)
If you want to use it for ECL, then it’s not clear to me why internal computational states would matter.
My reason for caring about internal computational states is: In the twin prisoners’ dilemma[1], I cooperate because we’re the same algorithm. If we modify the twin to have a slightly longer right index-finger nail, I would still cooperate, even though they’re a different algorithm: little enough has been changed about the algorithm that the internal states are still similar enough.
But it could be that I’m in a prisoner’s dilemma with some program p⋆ that, given some inputs, returns the same outputs as I do, but for completely different “reasons”—that is, the internal states are very different, and a slight change in input would cause the output to be radically different. My logical correlation with p⋆ is pretty small, because, even though it gives the same output, it gives that output for very different reasons, so I don’t have much control over its outputs by controlling my own computations.
At least, that’s how I understand it.
Is this actually ECL, or just acausal trade?