There is some MIRI/Nate/Eliezer frame of the alignment problem that basically no one else has.
This might be true, and if true it might be very important. But, outside view, I think the track record of people/organizations claiming things along the lines of “we and we alone have the correct understanding of X, and your only way to understand X is to seek our wisdom” is pretty bad, and that of people/organizations about whom other people say “they and they alone have the correct understanding, etc.” isn’t much better.
I know that MIRI expresses concern about the dangers of spreading their understanding of things that might possibly be used to advance AI capabilities. But if an important thing they have is a uniquely insightful way of framing the alignment problem then that seems like the sort of thing that (1) is very unlikely to be dangerous to reveal, (2) could be very valuable to share with others, and (3) if so shared would (a) encourage others to take MIRI more seriously, if indeed it turns out that they have uniquely insightful ways of thinking about alignment and (b) provide opportunities to correct errors they’re missing, if in fact what they have is (something like) plausible rhetoric that doesn’t stand up to close critical examination.
I think the 2021 MIRI Conversations and 2022 MIRI Alignment Discussion sequences are an attempt at this. I feel like I have a relatively good handle on their frame after reading those sequences, and I think the ideas contained within are pretty insightful.
Like Zvi, I might be confused about how confused I am, but I don’t think it’s because they’re trying to keep their views secret. Maybe there’s some more specific capabilities-adjacent stuff they’re not sharing, but I suspect the thing the grandparent is getting at is more about a communication difficulty that in practice seems to be overcome mostly by working together directly, as opposed to the interpretation that they’re deliberately not communicating their basic views for secrecy reasons.
(I also found Eliezer’s fiction helpful for internalizing his worldview in general, and IMO it also has some pretty unique insights.)