It seems to me that at least some parts of this research agenda are relevant for some special cases of “the failure mode of an amoral AI system that doesn’t care about you”.
I still wouldn’t recommend working on those parts, because they seem decidedly less impactful than other options. But as written it does sound like I’m claiming that the agenda is totally useless for anything besides s-risks, which I certainly don’t believe. I’ve changed that second paragraph to:
However, under other ethical systems (under which s-risks are worse than x-risks, but do not completely dwarf x-risks), I expect other technical safety research to be more impactful, because other approaches can more directly target the failure mode of an amoral AI system that doesn’t care about you, which seems both more likely and more amenable to technical safety approaches (to me at least). I could imagine work on this agenda being quite important for _strategy_ research, though I am far from an expert here.