Curated, here are my thoughts:
Pros:
Expands on some previous work in a pretty intuitive way, and might be quite relevant to long-term AI alignment
Uses the opportunity to help illustrate a bunch of generally important concepts in decision theory and game theory
Feels like it uses roughly the correct level of formalism for what it is trying to achieve: formal enough to be precise, but not so much that it becomes unreadable.
Cons:
The post does sadly have a bunch of dependencies that aren't fully made clear, and I would love to see a comment on the post with references for where to learn the basic ideas it uses (both the modal combat framework and more fundamental things such as pure and mixed equilibria).
I think motivating why you explored in this direction, and how it might help with AI alignment, would probably have made more people willing to put in the effort to understand this.
I am actually worried that because I posted it, people will think it is more relevant to AI safety than it really is. I think it is a little related, but not strongly.
I do think it is surprising and interesting. I think it is useful for thinking about civilization and civilizational collapse, and about what aliens (or maybe AI or optimization daemons) might look like. My inner Andrew Critch also thinks it is more directly related to AI safety than I do. Also, if I thought multipolar scenarios were more likely, I might think it is more relevant.
Also, it is made out of pieces such that thinking about it was a useful exercise. I am thinking a lot about Nash equilibria and dynamics. I think the fact that Nash equilibria are not exactly a dynamic type of object and are not easy to find is very relevant to understanding embedded agency. Also, I think that modal combat is relevant, because I think that Löbian handshakes point at an important part of reasoning about oneself.
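To make the "not exactly a dynamic type of object" point concrete, here is a small sketch (my own illustration, not from the post) using Matching Pennies: the game has no pure Nash equilibrium, and naive alternating best-response dynamics cycle forever rather than converging to the mixed equilibrium at (1/2, 1/2).

```python
# Matching Pennies, row player's payoffs; the column player gets the
# negation (zero-sum). Actions: 0 = Heads, 1 = Tails.
ROW_PAYOFF = [[1, -1],
              [-1, 1]]

def best_response_row(col_action):
    """Row action maximizing the row payoff against a fixed column action."""
    return max((0, 1), key=lambda a: ROW_PAYOFF[a][col_action])

def best_response_col(row_action):
    """Column action maximizing column payoff (= minimizing row payoff)."""
    return min((0, 1), key=lambda a: ROW_PAYOFF[row_action][a])

# Alternating best-response dynamics starting from (Heads, Heads):
# the play cycles between (0, 1) and (1, 0) and never settles.
row, col = 0, 0
trajectory = [(row, col)]
for _ in range(8):
    row = best_response_row(col)
    col = best_response_col(row)
    trajectory.append((row, col))
print(trajectory)

# No pure profile is stable: at every (row, col) someone wants to deviate,
# so the search below comes up empty. The only equilibrium is mixed.
pure_equilibria = [
    (r, c) for r in (0, 1) for c in (0, 1)
    if best_response_row(c) == r and best_response_col(r) == c
]
print(pure_equilibria)  # []
```

The equilibrium here is a fixed point of the best-response correspondence, not an attractor of the dynamics: the dynamics orbit around it without ever finding it, which is one concrete sense in which equilibria and dynamics come apart.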
I think it is relevant enough that it was worth doing, and such that I would be happy if someone expanded on it, but I am not planning on thinking about it much more because it does feel only tangentially related.
That being said, many times I have explicitly thought that I was thinking about a thing that was not really related to the bigger problems I wanted to be working on, only to later see a stronger connection.