Thank you for engaging, Eliezer.
I completely agree with your point: an agent being updateless doesn’t mean it won’t learn new information. In fact, it might well decide to “make my future action A depend on future information X”, if the updateless prior finds that optimal. In other situations, where the updateless prior deems such dependence net-negative (maybe because other agents could exploit it), it won’t.
This point is already noted in the post (see e.g. footnote 4), though without going deep into it, since the post is meant for the layman (it is addressed more thoroughly in, for example, section 4.4 of my report). Also, for illustrative purposes, in two places I have (perhaps unfairly) caricatured an updateless agent as being “scared” of learning more information. What this really means (as hopefully clear from earlier parts of the post) is “the updateless prior assessed whether it seemed net-positive to let future actions depend on future information, and decided no (for almost all actions)”.
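For concreteness, here is a minimal sketch of that assessment (the payoffs and names are hypothetical toy values I made up, not anything from the post or report): an updateless agent picks a whole policy, a mapping from possible future observations to actions, by its expected value under the prior, and conditioning on X is just one candidate policy among others.

```python
# Minimal toy sketch (hypothetical numbers): updateless choice is over whole
# policies, evaluated ex ante under the prior, not over post-update actions.

from itertools import product

observations = ["x0", "x1"]          # possible future information X
actions = ["A", "B"]                 # possible future actions

prior = {"x0": 0.5, "x1": 0.5}       # prior over which observation arrives

# Hypothetical payoff of taking each action given each observation.
payoff = {
    ("x0", "A"): 3, ("x0", "B"): 0,
    ("x1", "A"): 0, ("x1", "B"): 2,
}

# A policy assigns one action to every observation; letting the action
# depend on X is simply one of these policies.
policies = [dict(zip(observations, acts))
            for acts in product(actions, repeat=len(observations))]

def ex_ante_value(policy):
    return sum(prior[x] * payoff[(x, policy[x])] for x in observations)

best = max(policies, key=ex_ante_value)
print(best, ex_ante_value(best))
# With these numbers the best policy does condition on X; with payoffs that
# punish responsiveness (e.g. an exploiter rewarded by predictable
# dependence), the same procedure returns a constant policy instead.
```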
The problem I present is not “being scared of information”, but the trade-off between “letting your future action depend on future information X” and “not doing so” (and, in more detail, how exactly the action should depend on that information). More dependence lets you correctly best-respond in some situations, but can also get you exploited in others. The problem is that there is no universal (belief-independent) rule for deciding when to allow dependence: different updateless priors will decide differently. And they need to do so in advance of letting their deliberation depend on their interactions (since they don’t yet know whether that would be net-positive).
Due to this prior-dependence, if different updateless agents have different beliefs, they might play very different policies and miscoordinate. This is analogous to different agents demanding different notions of fairness (more here). I have read no convincing arguments for why most superintelligences would converge on beliefs (or notions of fairness) that successfully coordinate on Pareto optimality (especially in the face of the problem of trapped priors, i.e. commitment races), and would be grateful if you could point me to any.
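As a toy illustration of this prior-dependence (again with hypothetical numbers of my own, not from the post): the very same ex ante optimization, run under two priors that differ only in how likely exploitation seems, locks in different and mutually incompatible policies.

```python
# Minimal toy sketch (hypothetical numbers): different updateless priors about
# whether responsiveness gets exploited select different policies ex ante.

def policy_values(p_exploiter):
    # "responsive": let the action depend on the other agent's behavior.
    # Good against a cooperative partner, bad against an exploiter who
    # manipulates that dependence.
    responsive = (1 - p_exploiter) * 4 + p_exploiter * (-3)
    # "committed": fix the action in advance, forgoing best-responses but
    # leaving nothing to exploit.
    committed = (1 - p_exploiter) * 2 + p_exploiter * 1
    return {"responsive": responsive, "committed": committed}

for p in (0.1, 0.6):                 # two agents, two updateless priors
    values = policy_values(p)
    best = max(values, key=values.get)
    print(f"P(exploiter)={p}: choose {best} ({values})")
# With P(exploiter)=0.1 the responsive policy wins; with 0.6 the committed
# one does -- so agents with different priors commit to different policies
# and can miscoordinate.
```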
I interpret you as expressing a strong normative intuition in favor of ex ante optimization. I share this primitive intuition, and it remains true that, if you have some prior and simply want to maximize its expected value, updatelessness is exactly what you need. But I think we have also discovered pro tanto reasons against updatelessness, such as updateless agents probably performing worse on average (in complex environments) due to trapped priors and increased miscoordination.