I have previously criticized value learning for needing to locate the human within some kind of prespecified ontology (this criticism is not new). By taking only the agent itself as primitive, perhaps we could get around this (we don’t need any fancy engineering or arbitrary choices to figure out AUs/optimal value from the agent’s perspective).
Wouldn’t you need to locate the abstract concept of AU within the AI’s ontology? Is that easier? Or sorry if I’m misunderstanding.
To the contrary, an AU is naturally calculated from reward, one of the few things that is ontologically fundamental in the paradigm of RL. As mentioned in the last post, the AU of reward function $R$ is $V^*_R$, which gives the maximum possible $R$-return from a given state.
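Spelling that out with the standard definition (writing reward as a function of state for simplicity, with $\gamma$ the discount factor and $\pi$ ranging over the agent's policies):

$$V^*_R(s) \;=\; \max_{\pi}\; \mathbb{E}_{\pi}\!\left[\,\sum_{t=0}^{\infty} \gamma^{t}\, R(s_t) \;\middle|\; s_0 = s\right].$$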
This will become much more obvious in the AUP empirical post.
Sure. Looking forward to that. My current intuition is: Humans have a built-in reward system based on (mumble mumble) dopamine, but the existence of that system doesn’t make it easy for us to understand dopamine, or reward functions in general, or anything like that, nor does it make it easy for us to formulate and pursue goals related to those things. It takes quite a bit of education and beautifully-illustrated blog posts to get us to that point :-D
Note that when I said we could get around this by taking only the agent itself as primitive, I meant we could just consider how the agent's AUs are changing, without locating a human in the environment.
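As a rough illustration of that framing (a minimal sketch only, not the post's actual algorithm; the tabular MDP setup, the state-based "before/after" comparison, and all function names are my own assumptions), here is one way to compute "how much did the agent's AUs change" for a set of auxiliary reward functions:

```python
import numpy as np

def value_iteration(P, R, gamma=0.99, tol=1e-8):
    """Compute the optimal state-value function V*_R for a tabular MDP.

    P: transition tensor, shape (S, A, S); P[s, a, s'] = Pr(s' | s, a).
    R: reward vector, shape (S,) (reward received in a state).
    """
    num_states = P.shape[0]
    V = np.zeros(num_states)
    while True:
        # Q[s, a] = R(s) + gamma * sum_{s'} P(s'|s,a) * V(s')
        Q = R[:, None] + gamma * (P @ V)
        V_new = Q.max(axis=1)
        if np.max(np.abs(V_new - V)) < tol:
            return V_new
        V = V_new

def au_shift(P, aux_rewards, s_before, s_after, gamma=0.99):
    """Average absolute change in attainable utility (V*_R) across a set of
    auxiliary reward functions, between two states of the same MDP."""
    shifts = []
    for R in aux_rewards:
        V = value_iteration(P, R, gamma)
        shifts.append(abs(V[s_after] - V[s_before]))
    return float(np.mean(shifts))
```

Note that nothing in this computation refers to a human or any other object in the agent's ontology; it only uses reward functions and the agent's own value estimates.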
Cool. We’re probably on the same page then.