3. Ahh okay, thanks, I have a better picture of what you mean by a basis of possibility space now. I still doubt that utility interacts nicely with this linear structure, though. The utility function is linear in lotteries, but that is distinct from being linear in possibilities. If I understand your idea on that step correctly, you want to find a basis of possibility-space, not of lottery-space. (A basis of lottery-space is easy to find: just take all the trivial lotteries, i.e. those where some outcome has probability 1.)

To give an example of the contrast: if the utility I get from a life with vanilla ice cream is u_1 and the utility I get from a life with chocolate ice cream is u_2, then the utility of a lottery with a 50% chance of each is indeed 0.5 u_1 + 0.5 u_2. But what I think you need on that step is something different: you want to say something like “the utility of the life where I get both vanilla ice cream and chocolate ice cream is u_1 + u_2”, and this still seems just morally false to me.

I think the mistake you are making in the derivation in your comment is interpreting the numerical coefficients in front of events both as probabilities (of events or lotteries) and as scalar multiplication in the linear space you propose. The former is fine and correct, but I think the latter is not. So in particular, when you write u(2A), in the notation of the source you link, this can only mean “the utility you get from a lottery where the probability of A is 2”, which does not make sense assuming you don’t allow your probabilities to be >1. And even if you do allow probabilities >1, it still won’t give you what you want: if A is a life with vanilla ice cream, then in their notation 2A does not refer to a life with twice the quantity of vanilla ice cream, or whatever.
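To make the contrast concrete, here is my own side-by-side restatement (writing δ_X for the trivial lottery that yields X with probability 1):

$$U\!\left(\tfrac{1}{2}\,\delta_{\text{vanilla}} + \tfrac{1}{2}\,\delta_{\text{chocolate}}\right) = \tfrac{1}{2}u_1 + \tfrac{1}{2}u_2 \qquad \text{(linearity in lotteries, which I'm happy with)}$$

$$u(\text{vanilla and chocolate}) \overset{?}{=} u_1 + u_2 \qquad \text{(linearity in possibilities, the step I doubt)}$$

The left-hand sides live in different spaces: the first is a mixture of lotteries, the second is a single possibility containing both goods.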
4. I think the gradient part of the Hodge decomposition is not (in general) the same as the ranking with the minimal number of incorrectly ordered pairs. Fun stuff.
More on 4: Suppose you have horribly cyclic preferences and you go to a rationality coach to fix this. In particular, your ice cream preferences are vanilla > chocolate > mint > vanilla. Roughly speaking, Hodge is the rationality coach who will tell you to consider the three types of ice cream equally good from now on, whereas Mr. Max Correct Pairs will tell you to reverse one of the three preferences. Which coach is better? If you dislike breaking cycles arbitrarily, you should go with Hodge. If you think losing your preferences is worse than that, go with Max. Also, Hodge has the huge advantage of actually finishing in a reasonable amount of time :)
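To illustrate what the Hodge coach actually computes, here is a minimal toy sketch (my own code, not from any particular library) of the gradient/least-squares part of the decomposition on exactly this three-flavor cycle; the recovered potentials all come out equal, i.e. “consider them equally good”:

```python
import numpy as np

# Nodes: 0 = vanilla, 1 = chocolate, 2 = mint.
# Each edge (i, j) with flow y[e] = 1 means "i is preferred to j by one unit",
# so the data is the pure cycle vanilla > chocolate > mint > vanilla.
edges = [(0, 1), (1, 2), (2, 0)]
y = np.array([1.0, 1.0, 1.0])

# Incidence matrix B with (B @ s)[e] = s[i] - s[j] for edge e = (i, j).
n = 3
B = np.zeros((len(edges), n))
for e, (i, j) in enumerate(edges):
    B[e, i] = 1.0
    B[e, j] = -1.0

# Gradient part of the Hodge decomposition: potentials s minimizing ||B s - y||^2.
s, *_ = np.linalg.lstsq(B, y, rcond=None)

print(s - s.mean())  # ~[0, 0, 0]: all three flavors come out equally good
print(y - B @ s)     # ~[1, 1, 1]: the entire flow was cyclic (the part Hodge discards)
```

By contrast, finding the ranking with the fewest incorrectly ordered pairs is (as far as I know) a minimum feedback arc set problem, which is NP-hard in general, hence the remark about running time.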
Great explanation, I feel substantially less confused now. And thank you for adding two new shoulder advisors to my repertoire :D
Thank you for the thoughtful reply!
3. I agree with your point, especially that u(chocolate ice cream and vanilla ice cream)≠u(chocolate ice cream)+u(vanilla ice cream) should be true.
But I think I can salvage my point by making a further distinction. When I write u(chocolate ice cream) I actually mean u(emb(chocolate ice cream)) where emb is a semantic embedding that takes sentences to vectors. Already at the level of the embedding we probably have emb(chocolate ice cream and vanilla ice cream)≠emb(chocolate ice cream)+emb(vanilla ice cream),
and that’s (potentially) a good thing! Because if we structure our embedding in such a way that emb(chocolate ice cream)+emb(vanilla ice cream) points to something that is actually comparable to the conjunction of the two, then our utility function can just be naively linear in the way I constructed it above, u(emb(chocolate ice cream and vanilla ice cream))=u(emb(chocolate ice cream))+u(emb(vanilla ice cream)). I belieeeeeve that this is what I wanted to gesture at when I said that we need to identify an appropriate basis in an appropriate space (i.e. where emb(chocolate ice cream and vanilla ice cream)=emb(chocolate ice cream)+emb(vanilla ice cream), and whatever else we might want out of the embedding). But I have a large amount of uncertainty around all of this.
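To spell out the chain I have in mind a bit more explicitly (my own notation; u is assumed linear on the embedding space, A = chocolate ice cream, B = vanilla ice cream):

$$u(\mathrm{emb}(A)+\mathrm{emb}(B)) = u(\mathrm{emb}(A))+u(\mathrm{emb}(B)) \qquad \text{(linearity of } u\text{)}$$

$$u(\mathrm{emb}(A \text{ and } B)) = u(\mathrm{emb}(A)+\mathrm{emb}(B)) \qquad \text{(the comparability assumption)}$$

$$\Rightarrow\; u(\mathrm{emb}(A \text{ and } B)) = u(\mathrm{emb}(A))+u(\mathrm{emb}(B))$$

so the two vectors emb(A and B) and emb(A)+emb(B) don’t need to be equal, they only need to receive the same value under u.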
I still disagree / am confused. If it’s indeed the case that emb(chocolate ice cream and vanilla ice cream)≠emb(chocolate ice cream)+emb(vanilla ice cream), then why would we expect u(emb(chocolate ice cream and vanilla ice cream))=u(emb(chocolate ice cream))+u(emb(vanilla ice cream))? (Also, in the parenthetical about identifying an appropriate basis in your comment, it looks like you say the former is an equality.) Furthermore, if the latter equality is true, wouldn’t it imply that the utility we get from [chocolate ice cream and vanilla ice cream] is the sum of the utility from chocolate ice cream and the utility from vanilla ice cream? Isn’t u(emb(X)) supposed to be equal to the utility of X?
My current best attempt to understand/steelman this is to accept emb(chocolate ice cream and vanilla ice cream)≠emb(chocolate ice cream)+emb(vanilla ice cream), to reject u(emb(chocolate ice cream and vanilla ice cream))=u(emb(chocolate ice cream))+u(emb(vanilla ice cream)), and to try to think of the embedding as something slightly strange. I don’t see a reason to think utility would be linear in current semantic embeddings of natural language or of a programming language, nor do I see another appealing way to construct such an embedding. Maybe we could figure out a correct embedding if we had access to lots of data about the agent’s preferences (possibly in addition to some semantic/physical data), but it feels like that might defeat the point of this embedding in the context of this post, namely that it constitutes a step that does not yet depend on preference data. Or alternatively, if we are fine with using preference data at this step, maybe we could find a cool embedding, but in that case it seems very likely that it would also just give us a one-step solution to the entire problem of computing a set of rational preferences for the agent.
A separate attempt to steelman this would be to assume that we have access to a semantic embedding pretrained on preference data from a bunch of other agents, and then to tune the utilities of the basis to best fit the preferences of the agent we are currently dealing with. That seems like a cool idea, although I’m not sure if it has strayed too far from the spirit of the original problem.
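As a concrete (and heavily simplified) sketch of what that tuning step might look like: assume a fixed, pretrained emb, posit that the agent’s utility is linear in it, u(x) = emb(x) · w, and fit w from the agent’s pairwise choices with a Bradley-Terry-style logistic model. Everything below (the dimensions, the data, even the linearity assumption) is made up for illustration:

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical setup: a fixed, pretrained embedding (random vectors standing in for it)
# and pairwise choices made by the agent whose utilities we want to tune.
d, n_options = 8, 20
emb = rng.normal(size=(n_options, d))
true_w = rng.normal(size=d)  # the agent's "real" utility weights, unknown to us

pairs = [(i, j) for i, j in zip(rng.integers(n_options, size=300),
                                rng.integers(n_options, size=300)) if i != j]
# Simulated choices: P(i chosen over j) = sigmoid((emb[i] - emb[j]) . true_w).
chose_first = np.array([rng.random() < 1 / (1 + np.exp(-(emb[i] - emb[j]) @ true_w))
                        for i, j in pairs], dtype=float)

# Bradley-Terry-style fit: find w maximizing the likelihood of the observed choices,
# with utility assumed linear in the fixed embedding, u(x) = emb(x) . w.
X = np.array([emb[i] - emb[j] for i, j in pairs])
w = np.zeros(d)
for _ in range(1000):
    p = 1 / (1 + np.exp(-X @ w))
    w += 0.1 * X.T @ (chose_first - p) / len(pairs)  # gradient ascent on log-likelihood

# Fitted utilities should correlate with the "true" ones if the model family was right.
print(np.corrcoef(emb @ w, emb @ true_w)[0, 1])
```

(In this sketch the embedding is just a stand-in; in the version above it would be pretrained on other agents’ preference data, and only w would be tuned on the current agent’s choices.)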