[For some of my work for Palisade]
Does anyone know of even very simple examples of AIs exhibiting instrumentally convergent resource acquisition?
Something like “an AI system in a video game learns to seek out the power-ups, because that helps it win.” (Even better would be a version in which you can give the agent one of several distinct in-game goals, and regardless of the goal, it goes and gets the power-ups first.)
It needs to be an example where the instrumental resource is not strictly required for succeeding at the task, while still being extremely helpful.
I haven’t looked into this in detail but I would be quite surprised if Voyager didn’t do any of that?
Although I’m not sure whether an example like that would be exactly what you’re looking for. It seems straightforward that if you train/fine-tune a model on examples of people playing a game that involves leveraging [very helpful but not strictly necessary] resources, you are going to get an AI capable of that.
It would be less trivial if an RL agent did that, especially if it didn’t just stumble into the strategy/association “I need to do X, so let me get Y first” by accident, but rather figured out that Y tends to be helpful for X via some chain of associations.