Over the next few years, I am donating half my net worth, alongside most of my productive energies, to efforts to make it more likely that we end up in one of the better futures. I’m focusing on highly frugal people, as they are more neglected by major grantmakers, so my funds will go further.
If anyone wants to commit to joining the effort, please reach out. There is plenty of low-hanging fruit in the alignment ecosystem space.
Oh… huh. @Eliezer Yudkowsky, I think I figured it out.
In a certain class of altered state,[1] a person’s awareness includes a wider part of their predictive world-model than usual. Rather than perceiving primarily the part of the self-model which models them looking out into a world, the normal gating mechanisms come apart and they perceive much more of their world-model directly (including being able to introspect on their brain’s copy of other people far more vividly).
This world-model includes other agents, and those agent-models now exist in a much less sandboxed environment. It viscerally feels like there is extremely strong entanglement between the person’s actions and those of the agents who might be modelling them, because their model of the other agents can read their self-model and vice versa; in that state, everything is kinda running on the bare-metal models themselves. Additionally, people’s models of other people generally use themselves as a template, so if someone is thinking a lot about threats, blackmail, and the like, it’s easy for that to leak into expecting that others are modelling those things more than they actually are.
So their systems strongly predict way more subjunctive dependence than actually exists, due to how the brain handles those kinds of emergencies.[2]
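As a toy sketch of the template effect (an analogy only, not a claim about how brains actually implement this; `AgentModel`, `model_other`, `threat_focus`, and the mixing weights are all invented for illustration): imagine seeding your model of another agent with a copy of your self-model, then adjusting it on sparse evidence. Whatever currently dominates your own cognition leaks straight into the prediction.

```python
from dataclasses import dataclass, replace

@dataclass
class AgentModel:
    threat_focus: float  # 0.0 = relaxed, 1.0 = preoccupied with threats/blackmail

def model_other(self_model: AgentModel, observed_hostility: float) -> AgentModel:
    """Model another agent by copying the self-model, then adjusting on evidence.

    Because the template is the self-model, whatever dominates your own
    cognition (here, threat_focus) leaks into the prediction of what the
    other agent is thinking about.
    """
    other = replace(self_model)  # seed the other-model from yourself
    other.threat_focus = 0.7 * self_model.threat_focus + 0.3 * observed_hostility
    return other

calm = AgentModel(threat_focus=0.1)
spooked = AgentModel(threat_focus=0.9)

# Identical external evidence, very different conclusions about the other agent:
print(model_other(calm, observed_hostility=0.2).threat_focus)     # ~0.13
print(model_other(spooked, observed_hostility=0.2).threat_focus)  # ~0.69
```

The point of the sketch is just that the bias is structural: with the same evidence, the spooked agent predicts a far more threat-focused counterpart, purely because the template it copied from was itself.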
Add in the fact that decision theory makes counterintuitive suggestions and tries to operate kinda below the normal layer of decision process, plus people not being intuitively familiar with it, and yeah, I can see why some people get to weird places. It wasn’t reasonably predictable in advance, it’s a weird pitfall, but in retrospect it fits.
Maybe it’s a good idea to write an explainer for this, to try and mitigate this way in which people seem to be able to implode. I might talk to some people.
[1] The schizophrenia/psychosis/psychedelics-like cluster, often caused by being in extreme psychological states like those induced by cults and extreme perceived threat, especially with reckless mind exploration thrown into the mix.
[2] [epistemic status: very speculative] It seems plausible that this is partly a feature evolution built for situations where you seem to be in extreme danger: accepting a large chance of doing quite badly, damaging your epistemics, or acting in wildly bad ways, in exchange for some chance of finding a path through whatever put you in that state, by running a bunch of unsafe cognitive operations that might hit upon a way out of likely death. It sure seems like the common advice is things like “eat food”, “drink water”, “sleep at all”, and “be around people who feel safe”, which feel like the kinds of things that would turn down those alarm bells. Though this could also just be an entirely natural consequence of stress on a cognitive system.