> You claim that Stockfish could best be understood by using the concept of "agency". I don't see how.
I didn’t claim that Stockfish was best understood by using the concept of agency. I claimed agency was one useful model. Consider the first diagram in this post on embedded agency: you can regard Stockfish as Alexi playing a chess video game. By modeling Stockfish as an agent in this situation, you can abstract its internal workings somewhat and predict that it will beat you at chess, even if you can’t predict the exact moves it makes, or why it makes those moves.
> I thought that when people spoke about "agency" and AI, they meant something like "a capacity to set their own final goals"
I think this is not what is meant by agency. Do I have the capacity to set (i.e. change) my own final goals? Maybe so, but I sure don’t want to. My final goals are probably complex and I may be uncertain about what they are, but one day I might be able to extrapolate them into something concrete and coherent.
I would say that an agent is anything which is usefully modeled as having goals at all, regardless of what those goals are. As a system's ability to achieve its goals increases, the agent model becomes more useful and predictive relative to other models based on (for example) the system's internal workings. If I want to know whether Kasparov is going to beat me at chess, a detailed neuroscientific model and a scan of his brain are less useful than modeling him as an agent whose goal is to win the game. Similarly, when predicting what moves MuZero will make, a mechanistic understanding of the neural networks that comprise it is probably less useful than modeling it as a capable Go agent.