In Pr[U] ≈ 2^-K(U) / Pr_{π∈ξ}[U(⌈G⌉,π) ≤ U(⌈G⌉,G*)]: Shouldn’t the inequality sign be the other way around? I am assuming that we want to maximize U, not minimize it.
As currently written, a good agent G with utility function U would score higher than most random policies, so Pr_{π∈ξ}[U(⌈G⌉,π) ≤ U(⌈G⌉,G*)] would be close to 1, and Pr[U] would therefore be rather small.
If the sign should indeed be the other way around, then a similar problem might be present in the definition of g(G|U), since presumably you want g to be high for more agenty programs G.
after thinking about it and asking vanessa, it seems that you’re correct; thanks for noticing. the mistake comes from the fact that i express things in terms of utility functions and vanessa expresses things in terms of loss functions, and they are reversed. the post should be fixed now.
note that in the g(G|U) definition, i believe it is also ≥, because -log is decreasing and flips the direction of the inequality.
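a quick numerical sketch of the point above (the uniform "utility" values and the good agent's score of 0.95 are illustrative stand-ins, not from the post): if a good agent scores higher than most random policies, the ≤-probability is close to 1, so dividing by it barely raises Pr[U]; with ≥, the probability is small and the agenty U gets boosted as intended.

```python
import random

random.seed(0)

# Illustrative stand-ins: random policies draw a utility uniformly
# from [0, 1], while a "good" agent G* scores 0.95.
n = 100_000
random_utilities = [random.random() for _ in range(n)]
u_good = 0.95

# With "<=": a good agent beats most random policies, so this
# probability is close to 1, and 2^-K(U) / p_le stays small.
p_le = sum(u <= u_good for u in random_utilities) / n

# With ">=": only ~5% of random policies do as well as G*, so
# dividing by p_ge makes Pr[U] larger for agenty G, as intended.
p_ge = sum(u >= u_good for u in random_utilities) / n

print(p_le)  # close to 1
print(p_ge)  # close to 0.05
```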