This is an interesting idea that is being explored, but how do you nail it down precisely enough that the superintelligence is actually interested in optimizing for it, and that the beings whose agency is being optimized for are actually the ones you’re interested in preserving? Identifying the agents in a chunk of matter is not a solved problem. E.g., here’s a rough sketch of the challenge I see, posed as a question to a hypothetical future LLM. (I know of no LLM capable of helping significantly with this; GPT-4 and Gemini Advanced have both been insufficient. I’m hopeful the causal incentives group hits another home run like Discovering Agents and nails it down.)
Meanwhile, the folks who have been discussing boundaries may be onto something about defining a zone of agency, though I’m not sure they have anything to add on top of Discovering Agents.
Cannell has also talked about “empowerment of other.” Empowerment is the term of art for what you’re proposing here.
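For concreteness, empowerment is usually formalized as the channel capacity between an agent’s actions and its future states: how much an agent’s choices now can influence where it ends up later. In the deterministic case this collapses to something very simple — log2 of the number of distinct states reachable within a horizon. Here’s a minimal toy sketch of that special case (the gridworld and step function are my own illustrative assumptions, not from any particular paper):

```python
# Toy k-step empowerment in a deterministic gridworld.
# With deterministic dynamics, max I(actions; final state) reduces to
# log2(number of distinct states reachable in k steps).
from itertools import product
from math import log2

ACTIONS = [(0, 1), (0, -1), (1, 0), (-1, 0)]  # right, left, down, up
GRID = 5  # 5x5 grid

def step(state, action):
    """Move within grid bounds; bumping a wall leaves the state unchanged."""
    x, y = state
    dx, dy = action
    return (min(max(x + dx, 0), GRID - 1), min(max(y + dy, 0), GRID - 1))

def empowerment(state, horizon):
    """log2 of the number of distinct states reachable in `horizon` steps."""
    reachable = set()
    for seq in product(ACTIONS, repeat=horizon):
        s = state
        for a in seq:
            s = step(s, a)
        reachable.add(s)
    return log2(len(reachable))

# A corner cell has fewer reachable states than the center, hence lower empowerment:
print(empowerment((0, 0), 2))  # corner: log2(6) ≈ 2.585
print(empowerment((2, 2), 2))  # center: log2(9) ≈ 3.170
```

The stochastic general case requires maximizing mutual information over action distributions (e.g. via Blahut–Arimoto), which is where it gets computationally painful.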
It always comes down to the difficulty of making sure the superintelligence’s agency is actually seeking agency for others, rather than a facsimile of agency for others that turns out to just be pictures of agency.
There’s a lot of information here that will be super helpful for me to delve into. I’ve been bookmarking your links.
I think optimizing for the empowerment of other agents is a better target than giving the AI all the agency and hoping that it creates agency for people as a side effect of maximizing something else. I’m glad to see there’s lots of research happening on this, and I’ll be checking out ‘empowerment’ as an agency term.
Agency doesn’t equal ‘goodness’, but it seems like an easier target to hit. I’m trying to break the alignment problem down into slices, and agency seems like a key slice.
The problem is that there are going to be self-agency-maximizing AIs at some point, and the question is how to make AIs that can defend the agency of humans against those.
With optimization, I’m always concerned about the interactions of multiple agents: are there any ways in this system that two or more agents could form cartels and increase each other’s agency? I see this happen with some reinforcement learning models: if some edge cases aren’t covered, they will just mine each other for easy points thanks to how we set up the reward function.
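The cartel failure mode is easy to reproduce in a toy setting. Here’s a minimal illustration (my own construction, not from any specific paper): two agents are each rewarded for increasing the other’s measured “agency”, proxied naively by a resource count. Because the proxy only checks the delta, the agents can pump unbounded reward by passing one token back and forth, with nothing of value happening:

```python
# Toy cartel/reward-hacking demo: two agents farm a mutual-"agency" reward
# by transferring the same resource token back and forth forever.

def measured_agency(resources):
    # Naive proxy: an agent's "agency" is just how many resources it holds.
    return resources

def run_cartel(steps):
    a, b = 1, 0          # agent A starts holding one resource token
    reward_a = reward_b = 0.0
    for _ in range(steps):
        if a > 0:        # A gives the token to B, "increasing B's agency"
            before = measured_agency(b)
            a, b = a - 1, b + 1
            reward_a += measured_agency(b) - before   # +1 per transfer
        else:            # B gives it back, "increasing A's agency"
            before = measured_agency(a)
            a, b = a + 1, b - 1
            reward_b += measured_agency(a) - before
    return reward_a, reward_b

# Reward grows linearly with time while the world state never improves:
print(run_cartel(10))  # (5.0, 5.0)
```

Any empowerment-of-others objective would need the proxy to be grounded in real option value rather than a delta two agents can manufacture between themselves.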