MIRI doesn’t have good reasons to support the claim of almost certain doom
I recently asked Eliezer why he didn’t expect ELK to be helpful, and it seemed that one of his major reasons was that Paul was “wrongly” excited about IDA. It seems that at this point in time, neither Paul nor Eliezer are excited about IDA, but Eliezer got to that conclusion first. That said, their IDA-bearishness may be for fundamentally different reasons; I haven’t tried to figure that out yet.
Have you been taking this into account re: your ELK bullishness? Obviously, this sort of point should be ignored in favor of object-level arguments about ELK, but to be honest, ELK is taking me a while to digest, so for me that has to wait.
It seems that at this point in time, neither Paul nor Eliezer are excited about IDA
I’m still excited about IDA.
I assume this is coming from me saying that you need big additional conceptual progress to have an indefinitely scalable scheme. And I do think that’s more skeptical than my strongest pro-IDA claim here in early 2017:
I think there is a very good chance, perhaps as high as 50%, that this basic strategy can eventually be used to train benign state-of-the-art model-free RL agents. [...] That does not mean that I think the conceptual issues are worked out conclusively, but it does mean that I think we’re at the point where we’d benefit from empirical information about what works in practice
That said:
I think it’s up for grabs whether we’ll end up with something that counts as “this basic strategy.” (I think imitative generalization is the kind of thing I had in mind in that sentence, but many of the ELK schemes we are thinking about definitely aren’t; the line is pretty arbitrary.)
Also note that in that post I’m talking about something that produces a benign agent in practice, and in the other I’m talking about “indefinitely scalable.” Though my probability on “produces a benign agent in practice” is also definitely lower.
Did Eliezer give any details about what exactly was wrong with Paul’s excitement? Might just be an intuition gained from years of experience, but the more details we know the better, I think.
Some scattered thoughts in this direction:
this post
Eliezer has an opaque intuition that weird recursion is hard to get right on the first try. I want to interview him and write this up, but I don’t know if I’m capable of asking the right questions. Probably someone should do it.
Eliezer thinks people tend to be too optimistic in general
I’ve heard other people have an intuition that IDA is unaligned because HCH is unaligned because real human bureaucracies are unaligned
I found this comment where Eliezer has detailed criticism of Paul’s alignment agenda, including problems with “weird recursion”
I’ll add that when I asked John Wentworth why he was IDA-bearish, he mentioned the inefficiency of bureaucracies and told me to read the following post to learn why interfaces and coordination are hard: Interfaces as a Scarce Resource.
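To make the object of these intuitions concrete, here is a minimal toy sketch of the HCH-style recursion that IDA is usually described as approximating: a (simulated) human answers a question and may delegate subquestions to further copies of the same process. This is purely illustrative; the names and structure are hypothetical, not Paul's or MIRI's actual formalism.

```python
# Toy sketch of the HCH-style recursion discussed above. Purely illustrative:
# the function names and structure here are hypothetical assumptions, not any
# published formalism.

from typing import Callable, List

Human = Callable[[str, List[str]], str]   # (question, subanswers) -> answer
Decompose = Callable[[str], List[str]]    # question -> subquestions

def hch(question: str, human: Human, decompose: Decompose, depth: int) -> str:
    """Answer a question by consulting a simulated human who may delegate
    subquestions to further copies of this same process, up to `depth` levels."""
    if depth == 0:
        return human(question, [])        # leaf node: answer with no helpers
    subquestions = decompose(question)
    subanswers = [hch(q, human, decompose, depth - 1) for q in subquestions]
    return human(question, subanswers)    # combine the delegated work

# In IDA, each round would roughly distill a fast model that imitates
# hch(..., depth=k) and then use it in place of the recursive calls at depth
# k+1. The "weird recursion" and "bureaucracy" worries above are about whether
# alignment survives this tower of delegation.
```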