You kind of mix together two notions of irrationality:
(1-2, 4-6) Humans are bad at getting what they want (they’re instrumentally and epistemically irrational)
(3, 7) Humans want complicated things that are hard to locate mathematically (the complexity of value thesis)
I think only the first one is really deserving of the name “irrationality”. I want what I want, and if what I want is a very complicated thing that takes into account my emotions, well, so be it. Humans might be bad at getting what they want; they might be mistaken a lot of the time about what they want and constantly step on their own toes; but there’s no objective reason why they shouldn’t want the complicated things they want.
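To make “hard to locate mathematically” a bit more concrete, here’s a toy sketch (my own illustration, not anyone’s canonical formalization; the feature names, the min-shaped value function, and the effort budget are all made up): if the thing you actually want depends on many features, and the specification you hand to an optimizer forgets even one of them, the optimizer spends nothing on the missing piece and the result is nearly worthless by your real lights.

```python
# Toy illustration of fragile, many-featured value (hypothetical features and
# value functions; nothing here is a real model of human preferences).
import itertools

FEATURES = ["pleasure", "freedom", "friendship", "novelty", "survival"]

def true_value(state):
    # Stand-in for "what humans actually want": every feature matters,
    # and losing any single one loses most of the value.
    return min(state[f] for f in FEATURES)

def proxy_value(state):
    # A "simple" hand-written specification that forgot one feature.
    return min(state[f] for f in FEATURES if f != "novelty")

def optimize(value_fn, budget=2.0, steps=11):
    # Brute-force search over ways to spend a fixed effort budget across
    # features (each feature level in [0, 1], levels summing to <= budget).
    levels = [i / (steps - 1) for i in range(steps)]
    best, best_score = None, float("-inf")
    for combo in itertools.product(levels, repeat=len(FEATURES)):
        if sum(combo) > budget + 1e-9:
            continue
        state = dict(zip(FEATURES, combo))
        score = value_fn(state)
        if score > best_score:
            best, best_score = state, score
    return best

proxy_opt = optimize(proxy_value)
print("proxy optimum:", proxy_opt)              # novelty ends up at 0.0
print("proxy score:", proxy_value(proxy_opt))   # 0.5
print("true value of proxy optimum:", true_value(proxy_opt))                # 0.0
print("true value actually achievable:", true_value(optimize(true_value)))  # 0.4
```

The point isn’t the specific numbers; it’s that the optimizer did exactly what it was told, and “what it was told” was only almost what was wanted.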
Still, when up against a superintelligence, I think that both value being fragile and humans being bad at getting what they want count against humans getting anything they want out of the interaction:
Superintelligences are good at getting what they want (this is really what it means to be a superintelligence)
Superintelligences will have whatever goal they have, and I don’t think there’s any reason why this goal would have anything to do with what humans want (the orthogonality thesis: the goals a superintelligence has are orthogonal to how good it is at achieving them; there’s a toy sketch of this below)
Together, this adds up to: a superintelligence sees humans using resources it could be using for something else (and it would want those resources put toward its own goals, not just toward more of whatever the humans are trying to do with them), and because it’s good at getting what it wants, it gets those resources, which is very unfortunate for the humans.
IMO: “Oh look, undefended atoms!” (Well, not in that format. But maybe you get the picture.)
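And here’s the promised toy sketch of the orthogonality point (again my own made-up illustration: a fixed resource pool, brute-force search standing in for “capability”, and arbitrary utility functions standing in for “whatever goal it happens to have”). The planning code is identical for every goal, and for any goal that doesn’t explicitly mention the humans, the competent plan is the one that takes all the resources.

```python
# A generic "planner" that is equally competent at any goal you plug in.
# The search procedure (capability) and the utility function (goal) are
# separate parts, which is the point of the orthogonality thesis.

def plan(utility, total_resources=100):
    # "Capability" here is just exhaustive search over ways to split a fixed
    # pool of resources between the agent's goal and the humans' uses.
    best_split, best_u = None, float("-inf")
    for to_agent in range(total_resources + 1):
        to_humans = total_resources - to_agent
        u = utility(to_agent, to_humans)
        if u > best_u:
            best_split, best_u = (to_agent, to_humans), u
    return best_split

# Three arbitrary goals. Only the last one mentions the humans at all,
# and nothing about the planner required it to.
goals = {
    "paperclips": lambda a, h: a,              # more resources, more paperclips
    "digits of pi": lambda a, h: a ** 0.5,     # diminishing returns, still wants it all
    "actually cares": lambda a, h: min(a, h),  # the rare goal that leaves humans something
}

for name, utility in goals.items():
    print(name, "->", plan(utility))
# paperclips -> (100, 0)
# digits of pi -> (100, 0)
# actually cares -> (50, 50)
```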