This is kind of vague, but I have this sense that almost everybody doing RL and related research takes the notion of “agent” for granted, as if it’s some metaphysical primitive*, as opposed to being a (very) leaky abstraction that exists in the world models of humans. But I don’t think the average alignment researcher has much better intuitions about agency, either, to be honest, even though some spend time thinking about things like embedded agency. It’s hard to think meaningfully about the illusoriness of the Cartesian boundary when you still live 99% of your life and think 99% of your thoughts as if you were a Cartesian agent, fully “in control” of your choices, thoughts, and actions.
(*Not that “agent” couldn’t, in fact, be a metaphysical primitive, just that such “agents” are hardly “agents” in the way most people consider humans to “be agents” [and, equally importantly, other things, like thermostats and quarks, to “not be agents”].)