I was reading the story behind the first quotation, entitled “The discovery of x-risk from AGI”, and I noticed something around that quotation that doesn’t make sense to me; I’m curious whether anyone can tell what Eliezer Yudkowsky was thinking. As referenced in a previous version of this post, after the quoted scene the highest Keeper commits suicide. Discussing the impact of this, EY writes,
And in dath ilan you would not set up an incentive where a leader needed to commit true suicide and destroy her own brain in order to get her political proposal taken seriously. That would be trading off a sacred thing against an unsacred thing. It would mean that only true-suicidal people became leaders. It would be terrible terrible system design.
So if anybody did deliberately destroy their own brain in attempt to increase their credibility—then obviously, the only sensible response would be to ignore that, so as not to create hideous system incentives. Any sensible person would reason out that sensible response, expect it, and not try the true-suicide tactic.
The second paragraph is clearly a reference to acausal decision theory: people making a decision based on how they anticipate others will react to expecting that this is how they make decisions, rather than on the direct consequences of the decision itself. I’m not sure it really makes sense (a self-indulgent reminder that nobody knows any systematic method for producing prescriptions from acausal decision theories in the cases where they purportedly differ from causal decision theory in everyday life). Still, it’s fiction; I can suspend my disbelief.
The confusing thing is that in the story the actual result of the suicide is exactly what this passage says shouldn’t be the result. It convinces the Representatives to take the proposal more seriously and implement it. The passage is just used to illustrate how shocking the suicide was, and no additional considerations are given for why the reasoning is incorrect in those circumstances. So it looks like the Representatives, despite being the second-highest-ranked governing body of dath ilan, are explicitly violating the Algorithm that supposedly underlies the entire dath ilan civilization and is taught to every child at least in broad strokes.
There is a bit that reads “and to assent to the proposed change, under that protocol, would be setting up the wrong system incentives in the world that was most probably the case,” which kind of implies that there might be cases where the wrong system incentives are not a serious enough downside to advise against it.
It might rely a lot on the idea that a Keeper would not frivolously commit suicide to no effect. I don’t know what exactly makes people think that Keepers are sane, but the logic by which the external circumstances would point to the person being insane is supposed to put the two in tension. And because “Keepers are sane” is so much more strongly believed, it makes the update toward “things are really bad” go from trivial to significant.