In this post (original here), Paul Christiano analyzes the ambitious value learning approach.
I find it a little confusing that Rohin’s note refers to the “ambitious value learning approach” while the title of the post refers to the “easy goal inference problem”. I think the note could benefit from clarifying the relationship between these two descriptors.
As it stands, I’m asking myself: are they disagreeing about whether this is easy or hard? Or is “ambitious value learning” the same as “goal inference” (such that there’s no disagreement, and in Rohin’s terminology this would be the “easy version of ambitious value learning”)? Or something else?
The easy goal inference problem is the same thing as ambitious value learning under the assumption of infinite compute and data about human behavior (which is the assumption that we’re considering for most of this sequence).
The previous post was meant to outline the problem; all subsequent posts are about that problem. “Ambitious value learning” is probably the best name for the problem now, but not all posts use the same terminology even though they’re talking about approximately the same thing.
Yeah, I said “goal inference” instead of “value learning”, but I mean the same thing. The “ambitious” part is that we are trying to do much better than humans, which I was taking for granted in this post (it’s six months older than “ambitious vs. narrow value learning”).