It seems to me like the core problem here is that basic RL doesn’t distinguish between the environment and the agents within it: it doesn’t have separate ways of reasoning about rain being associated with clouds and water balloons being associated with Calvin. Does it seem to you like there’s something deeper going on?
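(To make concrete what I mean by “doesn’t distinguish,” here is a minimal, illustrative sketch of tabular Q-learning; the states, actions, and reward numbers are made up for this example. The update rule treats “clouds are overhead” and “Calvin is nearby” identically, as state features correlated with negative reward; nothing in the algorithm represents the fact that Calvin, unlike the clouds, would change his behavior in response to my policy.)

```python
import random
from collections import defaultdict

# Minimal tabular Q-learning sketch (illustrative only: states, actions,
# and reward values are invented for this example).
# The update rule below is the same whether the negative reward comes from
# a weather process (clouds -> rain) or from another agent
# (Calvin -> water balloon); both are just transition statistics.

ACTIONS = ["go_outside", "stay_inside"]
Q = defaultdict(float)          # Q[(state, action)] -> estimated value
ALPHA, GAMMA = 0.1, 0.9

def step(state, action):
    """Toy dynamics: rain follows clouds, water balloons follow Calvin."""
    if action == "stay_inside":
        return "inside", 0.0
    if state == "clouds":
        return "rained_on", -1.0
    if state == "calvin_nearby":
        return "hit_by_balloon", -1.0
    return "outside", 0.5

def update(state, action, reward, next_state):
    """One Q-learning update; identical regardless of what caused the reward."""
    best_next = max(Q[(next_state, a)] for a in ACTIONS)
    Q[(state, action)] += ALPHA * (reward + GAMMA * best_next - Q[(state, action)])

for _ in range(5000):
    state = random.choice(["clear", "clouds", "calvin_nearby"])
    action = random.choice(ACTIONS)
    next_state, reward = step(state, action)
    update(state, action, reward, next_state)

# The learner ends up avoiding Calvin exactly the way it avoids clouds:
# by staying inside. It has no representation of the fact that Calvin's
# behavior depends on how it responds, while the weather's does not.
print(sorted(Q.items()))
```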
Why should you treat agents in a special way? It doesn’t seem like “agent” is a natural kind; everything is just atoms, and you should probably treat it that way.
I think the failures here are:
Bad decision theory, not taking into account acausal logical consequences of our decisions.
Lack of foresight, not considering that particular behaviors will ultimately lead to extortion (by letting others know you are the kind of person who can be extorted); a toy sketch of this point follows the list.
Failing to recognize those logical or long-term consequences, despite using an algorithm that would respond appropriately if it recognized them.
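(Here is a toy version of the extortion point, with made-up payoff numbers; it isn’t anything more than a sketch. The only thing it is meant to show is that when the extortionist best-responds to a policy that is known in advance, the “never pay” policy is the one that never gets threatened at all.)

```python
# Toy payoff calculation for the extortion point above (all numbers are
# invented for illustration).
#
# Setup: an extortionist chooses whether to threaten you. If threatened,
# your policy decides whether to pay. If you refuse, the extortionist
# decides whether to carry out the (costly) threat. Crucially, the
# extortionist chooses *knowing your policy* -- that's the "letting others
# know you are the kind of person who can be extorted" part.

PAYMENT = 10        # what you hand over if you give in
HARM = 15           # damage to you if the threat is carried out
THREAT_COST = 1     # extortionist's cost of making a threat
CARRY_OUT_COST = 5  # extortionist's extra cost of carrying it out

def extortionist_best_response(target_pays: bool):
    """Return (threaten, carry_out) maximizing the extortionist's payoff,
    given that your policy is common knowledge."""
    options = [(0.0, (False, False))]               # don't threaten at all
    if target_pays:
        options.append((PAYMENT - THREAT_COST, (True, False)))   # you pay
    else:
        options.append((-THREAT_COST, (True, False)))            # empty threat
        options.append((-THREAT_COST - CARRY_OUT_COST, (True, True)))
    _, best = max(options, key=lambda x: x[0])
    return best

def target_payoff(target_pays: bool) -> float:
    threaten, carry_out = extortionist_best_response(target_pays)
    if not threaten:
        return 0.0
    if target_pays:
        return -PAYMENT
    return -HARM if carry_out else 0.0

print("policy = pay if threatened:", target_payoff(True))   # -10: you get extorted
print("policy = never pay:        ", target_payoff(False))  #   0: no threat is made
```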
It seems like humans get a lot of use out of concepts like “agent” and “extortion,” even though in principle functional decision theory is simpler and wouldn’t need them as primitives. Functional decision theory may just never be computationally tractable outside of radically simplified toy problems.