The initial part all looks correct. However, something got lost here:
It’s kind of interesting that that popped out, because the kind of agent that performs well in an environment where it’s been trading forever, is one that sets up trades for its future self!
Because while it’s true that long-term trading will give a high L, remember that for myopia we might see the agent as optimizing L∗, and L∗ also subtracts off max_m L(m; x, x, …). This is an issue for the long-term trader, because it will also increase the value of L for traders other than itself, probably just as much as it does for itself, so the subtraction cancels out its gains and there is no pull toward a long-term time horizon. As a result, a pure long-term trader will actually score low on L∗.
On the other hand, a modified version of the long-term trader which sets up “traps” that cause financial loss if it deviates from its strategy would not provide value to anyone who does not also follow its strategy, and therefore it would score high on L∗. There are almost certainly other agents that score high on L∗ too, though.
the long-term trader will also increase the value of L for traders other than itself, probably just as much as it does for itself
Hmm, like what? I agree that the short-term trader s does a bit better than the long-term trader l in the l, l, … environment, because s can sacrifice the long term for immediate gain. But s does poorly in the s, s, … environment, so I think L∗(s) < L∗(l). It’s analogous to CC having a higher payoff than DD in the prisoner’s dilemma (the prisoners being current and future selves).
I like the traps example; it shows that L∗ is pretty weird and we’d want to think carefully before using it in practice!
EDIT: Actually, I’m not sure I follow the traps example. What’s an example of a trading strategy that “does not provide value to anyone who does not also follow its strategy”? That seems pretty hard to do! I mean, you could sell all your stock and then deliberately crash the market or something. Most strategies would suffer, but a strategy that shorted the market would beat you by a lot!
Hmm, like what? I agree that the short-term trader s does a bit better than the long-term trader l in the l, l, … environment, because s can sacrifice the long term for immediate gain. But s does poorly in the s, s, … environment, so I think L∗(s) < L∗(l). It’s analogous to CC having a higher payoff than DD in the prisoner’s dilemma (the prisoners being current and future selves).
It’s true that L(s; s, s, …) is low, but you have to remember to subtract off max_m L(m; s, s, …). Since every trader will do badly in the environment generated by the short-term trader, the short-term trader’s poor performance in its own environment cancels out. Essentially, L∗ asks, “To what degree can someone else exploit your environment better than you can?”
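To make the cancellation concrete, here’s a toy computation of L∗ using the prisoner’s-dilemma analogy from your comment. The specific payoff numbers and the two-trader setup are my own assumption (just the standard PD values), not anything established in this discussion:

```python
# Toy check of the cancellation argument. L[(m, x)] is trader m's profit
# in the environment generated by trader x, using standard PD payoffs
# (an assumption for illustration only).
L = {
    ("l", "l"): 3,  # long-term trader in its own environment: mutual cooperation
    ("s", "l"): 5,  # short-term trader exploits the long-term environment
    ("l", "s"): 0,  # long-term trader in the crashed short-term environment
    ("s", "s"): 1,  # short-term trader in its own environment: mutual defection
}
traders = ["l", "s"]

def L_star(m):
    """L*(m) = L(m; m, m, ...) - max over m' of L(m'; m, m, ...)."""
    return L[(m, m)] - max(L[(mp, m)] for mp in traders)

print(L_star("l"))  # 3 - 5 = -2: the long-term environment is exploitable by s
print(L_star("s"))  # 1 - 1 =  0: s's own poor score cancels against the max
```

With these numbers the short-term trader actually comes out ahead on L∗, since nobody can exploit its environment better than it does itself, while the long-term trader’s environment is exploitable.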
I like the traps example; it shows that L∗ is pretty weird and we’d want to think carefully before using it in practice!
EDIT: Actually, I’m not sure I follow the traps example. What’s an example of a trading strategy that “does not provide value to anyone who does not also follow its strategy”? That seems pretty hard to do! I mean, you could sell all your stock and then deliberately crash the market or something. Most strategies would suffer, but a strategy that shorted the market would beat you by a lot!
If you’re limited to trading stocks, yeah, the traps example is probably very hard or impossible to pull off. What I had in mind is an AI with more options than that.