I disagree to an extent. The examples provided seem to me to be examples of “being stupid”, which agents generally have an incentive to do something about, unless they’re too stupid for that to occur to them. That doesn’t mean that their underlying values will drift towards a basin of attraction.
The corrigibility thing is a basin of attraction specifically because a corrigible agent has preferences over itself and its future preferences. Humans do that too sometimes, but the examples provided are not that.
In general, I think you should expect dynamic preferences (cycles, attractors, chaos, etc.) anytime an agent has preferences over its own future preferences and the capability to modify its preferences.
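To make the "cycles, attractors, chaos" point concrete, here is a toy sketch. Everything in it is an assumption made for illustration: the agent's preference is collapsed to a single weight `w` in [0, 1], and the self-modification rule `update_preference` is just the logistic map, chosen because its behaviour is well known, not because it models any real agent. The parameter `r` stands in for how aggressively the agent rewrites itself.

```python
def update_preference(w, r):
    """Hypothetical self-modification rule (the logistic map).

    w is the agent's current preference weight in [0, 1];
    r is an assumed "rewrite strength" parameter.
    """
    return r * w * (1.0 - w)


def trajectory(w0, r, steps):
    """Apply the self-modification rule repeatedly, starting from w0."""
    ws = [w0]
    for _ in range(steps):
        ws.append(update_preference(ws[-1], r))
    return ws


def classify(w0, r, burn_in=1000, tol=1e-6):
    """Roughly label the long-run behaviour of the preference weight."""
    ws = trajectory(w0, r, burn_in + 2)
    a, b, c = ws[-3], ws[-2], ws[-1]
    if abs(c - b) < tol:
        return "fixed point"  # preferences settle into an attractor
    if abs(c - a) < tol:
        return "2-cycle"  # preferences flip back and forth forever
    return "longer cycle or chaos"


if __name__ == "__main__":
    for r in (2.5, 3.2, 3.9):
        print(r, classify(0.3, r))
```

With this rule, `r = 2.5` settles to a fixed point at `w = 0.6` and `r = 3.2` oscillates between two values, while values of `r` near 3.9 fall in the logistic map's chaotic regime. The same agent architecture gives qualitatively different preference dynamics depending on one parameter, which is all the sketch is meant to show.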
That’s great and all, but with all due respect:
Fuck. That. Noise.
Regardless of the odds of success and what the optimal course of action actually is, I would be very hard-pressed to say that I’m trying to “help humanity die with dignity”. Whatever the optimal action is given that goal, on an emotional level adopting it is tantamount to giving up.
Before even getting into the cost/benefit of that attitude, in the worlds where we do make it out alive, I don’t want to look back and see a version of me for whom that became the goal. I also don’t think I would fight nearly as hard if that were my goal. I want a catgirl volcano lair, not “dignity”. So when I try to negotiate with my monkey brain to expend precious calories, the plan had better involve the former, not the latter. I suspect that something similar applies to others.
I don’t want to hear about genre-savviness from the de facto founder of the community that gave us HPMOR!Harry and the Comet King after he wrote this post, because it’s so antithetical to the attitude present in those characters and in posts like this one.
I also don’t want to hear about second-order effects when, as best as I can tell, the attitude present here is likely to push people towards ineffective doomerism, rather than actually dying with dignity.
So instead, I’m gonna think carefully about my next move, come up with a plan, blast some shonen anime OSTs, and get to work. Then, amongst all the counterfactual worlds, there will be a version of me that gets to look back and know that they faced the end of the world, rose to the challenge, and came out the other end having carved utopia out of the bones of Lovecraftian gods.