While this is the agenda that Stuart talks most about, other work also happens at CHAI
Yes good point—I’ll clarify and link to ARCHES.
The reason I’m excited about CIRL is because it provides a formalization of assistance games in the sequential decision-making setting … There should soon be a paper that more directly explains the case for the formalism
Yeah this is a helpful perspective, and great to hear re upcoming paper. I have definitely spoken to some folks that think of CHAI as the “cooperative inverse reinforcement learning lab” so I wanted to make the point that CIRL != CHAI.
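For readers who haven’t seen it, here’s my rough recollection of that formalization (from memory, so details may differ from the paper): a CIRL game is a two-player Markov game with identical payoffs between a human H and a robot R,

$$M = \langle S, \{A^H, A^R\}, T, \{\Theta, R\}, P_0, \gamma \rangle$$

where both players receive the same reward $R(s, a^H, a^R; \theta)$, but only H observes the reward parameter $\theta \sim P_0$, so R has to infer it from H’s behavior while acting. The agent assumption we’re discussing enters through the model of how H chooses $a^H$ given $\theta$.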
All models are wrong; some are useful
Well keep in mind that we’re using the agent model twice: once in our own understanding of the AI systems we build, and then a second time in the AI system’s understanding of what a human is. We can update the former as needed, but if we want the AI system to be able to update its understanding of what a human is then we need to work out how to make that assumption updateable in the algorithms we deploy.
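To gesture at what I mean by “updateable”, here’s a purely illustrative sketch (hypothetical names, not anyone’s actual code): the assistant’s model of the human is an explicit, swappable component, so changing the agent assumption means plugging in a different model rather than rewriting the inference code.

```python
# Purely illustrative sketch (hypothetical names, not anyone's actual code):
# the assistant's model of the human is an explicit, swappable component,
# rather than an assumption baked into the update rule.
import math
from typing import Dict, Protocol, Tuple


class HumanModel(Protocol):
    def likelihood(self, action: str, state: str, theta: str) -> float:
        """P(human takes `action` in `state` | preference parameter `theta`)."""
        ...


class NoisilyRationalHuman:
    """One possible human model: higher-reward actions are chosen more often."""

    def __init__(self, beta: float, reward: Dict[Tuple[str, str, str], float]):
        self.beta = beta      # rationality parameter
        self.reward = reward  # maps (state, action, theta) -> reward

    def likelihood(self, action: str, state: str, theta: str) -> float:
        # Softmax over the actions available in this state under this theta.
        actions = [a for (s, a, t) in self.reward if s == state and t == theta]
        z = sum(math.exp(self.beta * self.reward[(state, a, theta)]) for a in actions)
        return math.exp(self.beta * self.reward[(state, action, theta)]) / z


class Assistant:
    """Maintains a belief over theta by Bayesian updating on observed human
    actions. Changing the agent assumption means passing in a different
    HumanModel, not rewriting this class."""

    def __init__(self, human_model: HumanModel, prior: Dict[str, float]):
        self.human_model = human_model
        self.belief = dict(prior)

    def observe(self, state: str, action: str) -> None:
        for theta in self.belief:
            self.belief[theta] *= self.human_model.likelihood(action, state, theta)
        total = sum(self.belief.values())
        self.belief = {t: p / total for t, p in self.belief.items()}
```

Of course, for real systems the human model won’t be a tidy pluggable object like this, but this is the shape of the thing I’d want to be updateable in deployed algorithms.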
So when I hear “X is misspecified, so it might misbehave”, I want to hear more about how exactly it will misbehave before I’m convinced I should care.
Very fair request. I will hopefully be writing more on this topic in the specific case of the agent assumption soon.
More generally, it seems like “help X” or “assist X” only means something when you view X as pursuing some goal
Well would you agree that it’s possible to help a country? A country seems pretty far away from being an agent, although perhaps it could be said to have goals. Yet it does seem possible to provide e.g. economic advice or military assistance to a country in a way that helps the country without simply helping each of the separate individuals.
How about helping some primitive organism, such as a jellyfish or amoeba? I guess you could impute goals onto such organisms...
How about helping a tree? It actually seems pretty straightforward to me how to help a tree (bring water and nutrients to it, clean off parasites from the bark, cut away any dead branches), but does an individual tree really have goals?
Now that I’ve read your post on optimization, I’d restate
More generally, it seems like “help X” or “assist X” only means something when you view X as pursuing some goal.
as
More generally, it seems like “help X” or “assist X” only means something when you view X as an optimizing system.
Which I guess was your point in the first place, that we should view things as optimizing systems and not agents. (Whereas when I hear “agent” I usually think of something like what you call an “optimizing system”.)
I think my main point is that “CHAI’s agenda depends strongly on an agent assumption” seems only true of the specific mathematical formalization that currently exists; I would not be surprised if the work could then be generalized to optimizing systems instead of agents / EU maximizers in particular.
Ah, very interesting, yeah I agree this seems plausible, and also this is very encouraging to me!
In all of the “help X” examples you give, I do feel like it’s reasonable to do it by taking an intentional stance towards X, e.g. a tree by default takes in water + nutrients through its roots and produces fruit and seeds, in a way that wouldn’t happen “randomly”, and so “helping a tree” means “causing the tree to succeed more at taking in water + nutrients and producing fruit + seeds”.
In the case of a country, I think I would instead say “whatever a country’s goal is, money / military power will likely help with it, since money + power are instrumental subgoals and the country knows how to use them”. This is mostly a shortcut; ideally I’d figure out what the country’s “goal” is and then assist with that, but that’s very difficult to do because a country is very complex.