This was very thought-provoking, but unfortunately I still think this runs head-on into the realization that, a priori and in full generality, we can’t differentiate between safe and unsafe updates. Indeed, why would we expect that no one will punish us for updating on “our own beliefs” or “which beliefs I endorse”? After all, that’s just one more part of reality (without a clear boundary separating it).
I’m comfortable explicitly assuming this isn’t the case for nice clean decision-theoretic results, so long as it looks like the resulting decision theory also handles this possibility ‘somewhat sanely’.
It sounds like you are correctly explaining that our choice of prior will be, in some important sense, arbitrary: we can’t know the correct one in advance; we always have to rely on extrapolating contingent past observations. But then it seems like your reaction is still to hope that we can have our cake and eat it too: “I will remain uncertain about which beliefs I endorse, and only later will I update on the fact that I am in this or that reality. If I’m in the Infinite Counterlogical Mugging… then I will just eventually change my prior because I noticed I’m in the bad world!” But again, why would we think this update is safe? That’s just not being updateless, and it loses out on the strategic gains from not updating.
My thinking is more that we should accept the offer finitely many times or some fraction of the times, so that we reap some of the gains from updatelessness while also ‘not sacrificing too much’ in particular branches.
That is: in this case at least it seems like there’s concrete reason to believe we can have some cake and eat some too.
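To make “some fraction of the times” a bit more concrete, here’s a minimal sketch. Everything in it is a made-up illustration on my part (a fair “logical coin”, a cost of 100, a reward of 10,000, 1,000 repetitions), following the usual counterfactual-mugging template rather than anything specified above; it only shows the shape of the trade-off.

```python
# Toy iterated counterfactual-mugging-style calculation (illustration only;
# the payoffs, the 50/50 "coin", and the round count are made-up assumptions).

def iterated_mugging(n_rounds, accept_fraction, cost=100, reward=10_000, p_pay=0.5):
    """Policy: accept the offer in the first accept_fraction of rounds, refuse afterwards.

    Returns (ex_ante_expected_value, worst_branch_loss).
    """
    accepted = int(n_rounds * accept_fraction)

    # Evaluated before learning which branch we're in, each accepted round is
    # worth p_pay * (-cost) + (1 - p_pay) * reward in expectation.
    ex_ante_ev = accepted * (p_pay * (-cost) + (1 - p_pay) * reward)

    # In the unlucky branch where every round turns out to be a "pay" round,
    # the realized loss is capped by how many rounds the policy still accepts.
    worst_branch_loss = accepted * cost
    return ex_ante_ev, worst_branch_loss

for frac in (0.0, 0.1, 0.5, 1.0):
    ev, loss = iterated_mugging(n_rounds=1_000, accept_fraction=frac)
    print(f"accept {frac:.0%} of rounds: ex ante EV = {ev:,.0f}, worst-branch loss = {loss:,.0f}")
```

Accepting a fraction of the rounds keeps that same fraction of the ex ante gains from updatelessness while putting a hard cap on what is sacrificed in the branch where the coin always comes up against you.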
Since a solution doesn’t exist in full generality, I think we should pivot to more concrete work related to the “content” (our particular human priors and our particular environment) instead of the “formalism”.
This content-work seems primarily aimed at discovering and navigating actual problems similar to the decision-theoretic examples I’m using in my arguments. In contrast, I’m more interested in gaining insights about what sorts of AI designs humans should implement. IE, the specific decision problem I’m interested in doing work to help navigate is the tiling problem.
That is: in this case at least it seems like there’s concrete reason to believe we can have some cake and eat some too.
I disagree with this framing. Sure, if you have 5 different cakes, you can eat some and have some. But for any particular cake, you can’t do both. Similarly, if you face 5 (or infinitely many) identical decision problems, you can choose to be updateful in some of them (thus obtaining useful Value of Information, which increases your utility in some worlds), and updateless in others (thus obtaining useful strategic coherence, which increases your utility in other worlds). The fundamental dichotomy remains just as sharp, and it’s misleading to imply we can surmount it. It’s great to discuss, given this dichotomy, which trade-offs we humans are more comfortable making. But I’ve felt this was obscured in many relevant conversations.
This content-work seems primarily aimed at discovering and navigating actual problems similar to the decision-theoretic examples I’m using in my arguments. In contrast, I’m more interested in gaining insights about what sorts of AI designs humans should implement. IE, the specific decision problem I’m interested in doing work to help navigate is the tiling problem.
My point is that the theoretical work you are shooting for is so general that it’s closer to “what sorts of AI designs (priors and decision theories) should always be implemented”, rather than “what sorts of AI designs should humans in particular, in this particular environment, implement”. And I think we won’t gain insights on the former, because there are no general solutions, due to fundamental trade-offs (“no-free-lunches”). I think we could gain many insights on the latter, but the methods better suited to that are less formal/theoretical and way messier: “eye-balling” and iterating.
I disagree with this framing. Sure, if you have 5 different cakes, you can eat some and have some. But for any particular cake, you can’t do both. Similarly, if you face 5 (or infinitely many) identical decision problems, you can choose to be updateful in some of them (thus obtaining useful Value of Information, which increases your utility in some worlds), and updateless in others (thus obtaining useful strategic coherence, which increases your utility in other worlds). The fundamental dichotomy remains just as sharp, and it’s misleading to imply we can surmount it. It’s great to discuss, given this dichotomy, which trade-offs we humans are more comfortable making. But I’ve felt this was obscured in many relevant conversations.
I don’t get your disagreement. If your view is that you can’t eat one cake and keep it too, and my view is that you can eat some cakes and keep other cakes, isn’t the obvious conclusion that these two views are compatible?
I would also argue that you can slice up a cake and keep some slices but eat others (this corresponds to mixed strategies), but this feels like splitting hairs rather than getting at some big important thing. My view is mainly about iterated situations (more than one cake).
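(A minimal sketch of the slicing point, reusing the same made-up stakes as in my earlier sketch, a fair coin with a cost of 100 and a reward of 10,000: a mixed strategy that pays with probability q just interpolates between the two pure policies within a single problem.)

```python
# Single problem, mixed strategy: pay with probability q.
# Same made-up numbers as in the iterated sketch above (illustration only).
def mixed_policy(q, cost=100, reward=10_000, p_pay=0.5):
    ex_ante_ev = q * (p_pay * (-cost) + (1 - p_pay) * reward)  # = 4,950 * q
    pay_branch_loss = q * cost  # expected cost paid, given you land in the paying branch
    return ex_ante_ev, pay_branch_loss

print(mixed_policy(0.25))  # (1237.5, 25.0): a quarter of the ex ante gains, a quarter of the exposure
```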
Maybe your disagreement would be better stated in a way that didn’t lean on the cake analogy?
My point is that the theoretical work you are shooting for is so general that it’s closer to “what sorts of AI designs (priors and decision theories) should always be implemented”, rather than “what sorts of AI designs should humans in particular, in this particular environment, implement”. And I think we won’t gain insights on the former, because there are no general solutions, due to fundamental trade-offs (“no-free-lunches”). I think we could gain many insights on the latter, but the methods better suited to that are less formal/theoretical and way messier: “eye-balling” and iterating.
Well, one way to continue this debate would be to discuss the concrete promise of the pseudo-formalisms discussed in the post. I think there are some promising-seeming directions.
Another way to continue the debate would be to discuss theoretically whether theoretical work can be useful.
It sort of seems like your point is that theoretical work always needs to be predicated on simplifying assumptions. I agree with this, but I don’t think it makes theoretical work useless. My belief is that we should continue working to make the assumptions more and more realistic, but the ‘essential picture’ is often preserved under this operation. (EG, Newtonian gravity and general relativity make most of the same predictions in practice. The Kolmogorov axioms vindicated a lot of earlier work on probability theory.)