I am interested in the economics of collective action problems, and in identifying paths to advancing technology that are ethical, safe and practical.
https://twitter.com/JonasMtzgr
Jonas Metzger
Yeah, I already edited out some verbosity. ChatGPT is just trained to hedge too much currently. Should I take out more?
It seems to have distracted a bit from the purpose of the post: that we can define an unobjectionable way to aggregate utilities and have an LLM follow it, while still being useful for its owner.
Just to clarify, the complete equilibrium strategy alluded to here is:
“Play 99 and, if anyone deviates from any part of the strategy, play 100 to punish them until they give in”
Importantly, this includes deviations from the punishment. If you don’t join the punishment, you’ll get punished. That makes it rational to play 99 and punish deviators.
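To make the incentive argument concrete, here is a minimal sketch of that strategy in Python. Only the dial settings 99 and 100 come from the description above; the number of players, the dial minimum of 30, the payoff (minus the average dial setting), the one-round punishment length and the undiscounted finite horizon are all assumptions made for illustration, and `prescribed` and `simulate` are hypothetical helper names.

```python
# Assumed parameters (illustrative only; just 99 and 100 come from the comment).
N_PLAYERS = 100
DIAL_MIN = 30                 # assumed lowest dial setting
COOPERATE, PUNISH = 99, 100


def prescribed(prev_actions, prev_prescribed):
    """Dial setting the strategy prescribes to every player this round."""
    if prev_actions is None:  # first round: nothing to punish yet
        return COOPERATE
    # Any departure from last round's prescription triggers punishment,
    # including a failure to carry out the punishment itself.
    deviated = any(a != prev_prescribed for a in prev_actions)
    return PUNISH if deviated else COOPERATE


def simulate(deviant_plan, rounds):
    """Total payoff of player 0, who overrides the strategy in the rounds
    listed in deviant_plan (round index -> dial setting) and obeys otherwise.
    Everyone else always obeys. Assumed payoff: minus the average setting."""
    total = 0.0
    prev_actions, prev_prescribed = None, None
    for t in range(rounds):
        target = prescribed(prev_actions, prev_prescribed)
        actions = [deviant_plan.get(t, target)] + [target] * (N_PLAYERS - 1)
        total -= sum(actions) / N_PLAYERS
        prev_actions, prev_prescribed = actions, target
    return total


obey = simulate({}, rounds=4)
once = simulate({0: DIAL_MIN}, rounds=4)                # deviate, then accept punishment
twice = simulate({0: DIAL_MIN, 1: DIAL_MIN}, rounds=4)  # also defy the punishment
print(round(obey, 2), round(once, 2), round(twice, 2))
```

Under these assumptions, obeying yields a total of -396, deviating once yields about -396.31, and also defying the punishment round yields about -396.61, so each extra deviation makes the deviator worse off. The sketch only checks deviations that leave room for the punishment to play out; in a finite game a deviation in the very last round would go unpunished, which is why the argument relies on the game being repeated indefinitely.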
The point of the Folk Theorems is that the Nash equilibrium notion has limited predictive power in repeated games like this, because essentially any feasible payoff that gives each player more than they could guarantee themselves can be sustained by a similar Nash equilibrium, provided players are patient enough. That doesn’t mean it has no predictive power: if everyone in that room keeps playing that strategy, you’d play it too, eventually. That’s a valid prediction.
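For readers who want the statement spelled out, a rough version of the Nash folk theorem for discounted repeated games reads as follows. The notation ($u_i$ for stage payoffs, $\underline{v}_i$ for player $i$'s minmax payoff, $\delta$ for the common discount factor) is my own choice for this sketch, not taken from the post.

```latex
% Minmax payoff: the worst the others can hold player i down to.
\underline{v}_i \;=\; \min_{\alpha_{-i}} \; \max_{a_i} \; u_i(a_i, \alpha_{-i})

% Folk theorem (Nash version, informal):
% for every feasible payoff vector v with v_i > \underline{v}_i for all i,
\exists\, \underline{\delta} < 1 \ \text{such that}\ \forall\, \delta \in (\underline{\delta}, 1):
\quad \text{the repeated game has a Nash equilibrium with average discounted payoffs } v .
```

As an illustration under assumed numbers (100 players, dials from 30 to 100, payoff equal to minus the average setting), the others can hold a player down to $-99.3$ by all playing 100, so the all-99 outcome with payoff $-99$ clears the bar, but so do many far more pleasant outcomes.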
What seems wrong about the described equilibrium is that you would expect players to negotiate away from it. If they can communicate somehow (it’s hard to formalize, but you could even conceive of this communication as happening through their dials), they would likely arrange a Nash equilibrium around a different temperature. Without communication, it seems unlikely that a complex Nash equilibrium like the one above would even arise, as players would likely start out by repeating the so-called stage game, where they just set the dial to whatever temperature they want (how would they even know they will get punished?).
In fact, the notion of renegotiation also calls certain punishments into question (e.g. punishing forever after a single deviation): they become less credible threats, ones that rational players would arguably not be afraid to test. This can be formalized as a so-called equilibrium refinement, which makes stronger predictions than the classic Nash equilibrium notion about the types of strategies rational players may employ; see e.g. the seminal Farrell and Maskin (1989).