If you have possible targets T1,…,TN, then as an alternative to maximin, you could also maximize log(T1)+⋯+log(TN), or replace log with your favorite function with sufficiently sharply diminishing returns. This has the advantage that the AI will continue to take Pareto improvements rather than often feeling completely neutral about them.
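Here's a minimal sketch of that contrast, with made-up utility numbers: maximin is indifferent between an action and a Pareto improvement of it whenever the improvement doesn't raise the worst-case target, whereas the sum-of-logs score strictly prefers the improvement. (Any strictly increasing per-target function gives you the Pareto property; the sharply diminishing returns are what keep the score from sacrificing one target to run up another.)

```python
import math

# Made-up utilities: how well each action scores on each of three
# candidate targets T1, T2, T3 (all assumed positive so log is defined).
action_a = [0.5, 0.5, 0.5]  # baseline
action_b = [0.9, 0.5, 0.5]  # Pareto improvement over action_a (better on T1)

def maximin_score(utilities):
    # Worst-case utility across the candidate targets.
    return min(utilities)

def log_sum_score(utilities):
    # Sum of log-utilities (equivalently, N times the log of the geometric mean).
    return sum(math.log(u) for u in utilities)

print(maximin_score(action_a), maximin_score(action_b))  # 0.5 0.5 -> indifferent
print(log_sum_score(action_a), log_sum_score(action_b))  # ~-2.08 ~-1.49 -> prefers b
```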
(That’s just a fun idea that I think I got indirectly from Stuart Armstrong … but I’m skeptical that this would really be relevant: I suspect that if you have any uncertainty about the target at all, it would rapidly turn into a combinatorial explosion of possibilities, such that it would be infeasible for the AI to keep track of how an action scores on all gazillion of them.)
(In normal planning problems the number of plans to evaluate is already exponential in the number of actions. So that doesn't seem to be a major obstacle if your agent is already capable of planning.)
Geometric rationality ftw!