There’s a lot of information here that will be super helpful for me to delve into. I’ve been bookmarking your links.
I think optimizing for the empowerment of other agents is a better target than giving the AI all the agency and hoping it creates agency for people as a side effect of maximizing something else. I'm glad to see there's a lot of research happening on this, and I'll be checking out 'empowerment' as an agency term.
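To make sure I'm thinking about the formal version correctly: as I understand it, empowerment is usually defined as the channel capacity between an agent's actions and its future states, i.e. max over action distributions of I(A; S'). Here's a rough sketch of how I'd compute it for a tiny tabular setting using the Blahut-Arimoto algorithm; the transition matrix is something I made up as a toy, not anything from the literature.

```python
import numpy as np

def empowerment(p_next: np.ndarray, iters: int = 200, tol: float = 1e-10) -> float:
    """One-step empowerment of a state: max_{p(a)} I(A; S'), where
    p_next[a, s'] is the next-state distribution under action a.
    Computed with the Blahut-Arimoto algorithm; returns nats."""
    n_actions = p_next.shape[0]
    r = np.full(n_actions, 1.0 / n_actions)   # action distribution p(a)
    p = np.clip(p_next, 1e-12, 1.0)           # avoid log(0)
    for _ in range(iters):
        q = r @ p                              # marginal over next states
        kl = np.sum(p * (np.log(p) - np.log(q)), axis=1)  # KL(p(.|a) || q)
        r_new = r * np.exp(kl)
        r_new /= r_new.sum()
        if np.max(np.abs(r_new - r)) < tol:
            r = r_new
            break
        r = r_new
    q = r @ p
    return float(np.sum(r * np.sum(p * (np.log(p) - np.log(q)), axis=1)))

# Toy example: 3 actions, but two of them lead to the same next state,
# so only 2 outcomes are distinguishable -> empowerment ~= log(2) nats (1 bit).
transitions = np.array([
    [1.0, 0.0],   # action 0 -> state A
    [1.0, 0.0],   # action 1 -> state A (redundant with action 0)
    [0.0, 1.0],   # action 2 -> state B
])
print(empowerment(transitions))  # ~0.693
```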
Agency doesn't equal 'goodness', but it seems like an easier target to hit. I'm trying to break the alignment problem down into slices to figure it out, and agency seems like a key slice.
The problem is that there will be self-agency-maximizing AIs at some point, and the question is how to build AIs that can defend human agency against them.
With optimization, I'm always concerned about the interactions of multiple agents: are there any ways in this system that two or more agents could form cartels and increase each other's agency? I've seen this happen with some reinforcement learning models, where if certain edge cases aren't covered, they will just mine each other for easy points thanks to how we set up the reward function.
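To make the cartel worry concrete, here's a throwaway toy I sketched (the actions, names, and numbers are all made up for illustration, not from any real setup): two agents each choose between doing the real task and "boosting" the other's measured empowerment, and if the measure can be inflated more cheaply than the task pays, the only equilibrium is both of them mining the proxy while producing nothing of real value.

```python
from itertools import product

# Hypothetical toy: each agent is rewarded partly for the *measured* empowerment
# of its partner. Assume that measure can be inflated cheaply ("boost" = opening
# meaningless options for the partner), and that inflating it pays more proxy
# reward than doing the real task. All numbers are made up for illustration.
ACTIONS = ["task", "boost"]
TASK_BONUS = 1.0     # proxy credit for doing the real task
BOOST_CREDIT = 2.0   # proxy credit for inflating the partner's measured empowerment

def proxy_reward(my_action: str) -> float:
    return TASK_BONUS if my_action == "task" else BOOST_CREDIT

def true_value(a1: str, a2: str) -> int:
    # What we actually wanted: total real task output.
    return (a1 == "task") + (a2 == "task")

# Check every joint action for pure-strategy equilibria of the proxy game.
for a1, a2 in product(ACTIONS, repeat=2):
    r1, r2 = proxy_reward(a1), proxy_reward(a2)
    stable1 = all(proxy_reward(d) <= r1 for d in ACTIONS)
    stable2 = all(proxy_reward(d) <= r2 for d in ACTIONS)
    tag = "  <- equilibrium (the cartel outcome)" if stable1 and stable2 else ""
    print(f"{a1:5s}/{a2:5s}  proxy=({r1:.0f},{r2:.0f})  true_value={true_value(a1, a2)}{tag}")

# Because BOOST_CREDIT > TASK_BONUS, mutual boosting is the unique equilibrium
# even though it has zero true value: the pair mines the proxy instead of working.
```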
Super interesting!