Hi Stuart,
I’m working my way through your ‘Research Agenda v0.9’ post, and am therefore going through various older posts to understand things. I wonder if I could ask some questions about the definition you propose here?
First, that X be contained in R^N for some N seems not so relevant; can I just assume X, Y and Z are some manifolds (C^k for some 0 ≤ k ≤ ∞)? And we are given some partial order ≺ on X, so that we can refer to ‘being a better world’?
Then, as I understand it, your definition says the following:
Fix X, ≺ and Z. Let Y be a manifold and y+, y− ∈ Y. Given a local homeomorphism +: Y×Z → X, we say that y+ is partially preferred to y− if for all z ∈ Z, we have y− + z ≺ y+ + z.
I’m not sure which inequalities should be strict, but this seems non-essential for now. On the other hand, the dependence of this definition on the choice of Y seems somewhat subtle and interesting. I will try to illustrate this in what follows.
First, let us make a new definition. Fix X, ≺ and Z as before. Let Y′ = {y+, y−}, a two-element set equipped with the discrete topology, and let +′: Y′×Z → X be an immersion of C^k-manifolds. We say that y+ is weakly partially preferred to y− if for all z ∈ Z, we have y− +′ z ≺ y+ +′ z.
First, it is clear that partial preference implies weak partial preference. More formally:
Claim 1: Fix X, ≺ and Z. Suppose we have a manifold Y, points y+, y− ∈ Y, and a local homeomorphism +: Y×Z → X such that y+ is partially preferred to y−. Setting Y′ = {y+, y−} with the subspace topology from Y (i.e. discrete), and taking +′ to be the restriction of + from Y×Z to Y′×Z, we have that y+ is weakly partially preferred to y−.
Proof: obvious. $\qed$
However, the converse can fail if Z is not contractible. First, let’s prove that the concepts are equivalent for Z contractible:
Claim 2: Fix X, ≺ and Z, and assume that Z is contractible. Suppose we have a two-element set Y′={y+,y−} and a map +′:Y′×Z→X making y+ weakly partially preferred to y−. Then there exist a manifold Y, an injection Y′→Y, and a local homeomorphism +:Y×Z→X whose restriction to Y′×Z is +′, making y+ partially preferred to y−.
Proof: Let’s assume for simplicity of notation that X is equidimensional, say of dimension d_X, and write d_Z for the dimension of Z. Let Y be the disjoint union of two open balls of dimension d_X − d_Z, with Y′ → Y the inclusion of the centres of the balls. Then take an ϵ-neighbourhood of Z in X; it is diffeomorphic to Y×Z, since Z is contractible and so every vector bundle over it, in particular the normal bundle to Z in X, is trivialisable (cf. https://math.stackexchange.com/questions/857784/product-neighborhood-theorem-with-boundary). $\qed$
If we want examples where weak partial preference and partial preference don’t coincide, we should look for an example where Z is not contractible, and its normal bundle in X is not trivialisable.
Example 3: Let X be the disjoint union of two Möbius bands, and let Z be a circle. Note that including Z along the centre of either band gives a submanifold whose tubular neighbourhood is not a product. Assume that ≺ is such that one component of X is preferred to the other (and ≺ is indifferent within each connected component). Then take Y′ = {y+, y−}, and +′: Y′×Z → X to be the inclusion of the two circles along the centres of the two Möbius bands, such that {y+}×Z ends up in the preferred band. This yields a situation where y+ is weakly partially preferred to y−, but the conclusion of Claim 2 fails, i.e. this cannot be extended to a partial preference for y+ over y−.
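For anyone who wants to see the obstruction in Example 3 concretely, here is a small numerical sketch. It assumes the standard embedding of the Möbius band in R^3, which is a choice made only for this illustration, and the function names (`mobius_normal`, `cylinder_normal`, `holonomy_sign`) are likewise made up for it. The idea is to carry a unit vector normal to the centre circle (but tangent to the band) once round the circle, keeping it continuous, and check whether it comes back to itself or to its negative.

```python
import math

def mobius_normal(u):
    # Assumed parametrisation of the band:
    #   x(u, v) = ((1 + (v/2) cos(u/2)) cos u,
    #              (1 + (v/2) cos(u/2)) sin u,
    #              (v/2) sin(u/2)).
    # This is the unit vector along d/dv at the centre circle v = 0.
    return (math.cos(u / 2) * math.cos(u),
            math.cos(u / 2) * math.sin(u),
            math.sin(u / 2))

def cylinder_normal(u):
    # For the cylinder x(u, v) = (cos u, sin u, v), d/dv is constant.
    return (0.0, 0.0, 1.0)

def dot(a, b):
    return sum(x * y for x, y in zip(a, b))

def holonomy_sign(normal, steps=1000):
    """Transport a unit normal once round the circle.

    Returns +1 if it comes back to itself, -1 if it comes back negated.
    """
    start = normal(0.0)
    current = start
    for i in range(1, steps + 1):
        n = normal(2 * math.pi * i / steps)
        # The normal line has two unit vectors; pick the one that
        # continues the previous choice, so the transport is continuous.
        if dot(n, current) < 0:
            n = tuple(-x for x in n)
        current = n
    return 1 if dot(current, start) > 0 else -1

print(holonomy_sign(mobius_normal))    # sign for the Möbius band
print(holonomy_sign(cylinder_normal))  # sign for the cylinder
```

A sign of -1 means any continuous normal field along the circle must vanish somewhere, so the normal bundle is not trivialisable and no neighbourhood of the centre circle is a product of the circle with an interval. For the cylinder the sign is +1, and the product neighbourhood exists.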
What conclusion should we draw from this? To me, it suggests that the notion of partial preference is not yet quite as one would want. In the setting of Example 3, where X consists of two Möbius strips, one of which is preferred to the other, surely landing in the preferred strip should be preferred to landing in the un-preferred strip? And yet the ‘local homeomorphism from a product’ condition gets in the way. This example is obviously quite artificial, and maybe analogous things cannot occur in reality. But I’m not so happy with this as an answer, since our approaches to AI safety should be (so far as possible) robust against the flaws in our understanding of physics.
Apologies for the overly-long comment, and for the imperfect LaTeX (I’ve not used this type of form much before).
Hey there! Thanks for your long comment—but, alas, this model of partial preferences is obsolete :-(
See: https://www.lesswrong.com/posts/BydQtwfN97pFwEWtW/toy-model-piece-1-partial-preferences-revisited
Because of other problems with this, I’ve replaced it with the much more general concept of a preorder. This can express all the things we want to express, but is a lot less intuitive for how humans model things. I may come up with some alternative definition at some point (less general than a preorder, but more general than this post).
Thanks for the comment in any case.
Never mind—I had fun thinking about this :-).