In terms of tone it seems considerably less bad. I definitely like it more than the other one because it seems to make arguments rather than give social cues. It might be improved by adding links to technical descriptions of the terms you use (e.g. inner alignment (Hubinger’s paper), IRL (maybe a Russell paper on CIRL)). I still don’t think it would work, simply because I would guess Hassabis gets a lot of email from randos who are confused, and the email doesn’t seem to distinguish you from them (this may be totally unfair to you, and I’m not saying whether it’s correct, it’s just what I expect to happen). I also feel nervous about talking about arms races like that, reinforcing a narrative where they’re not only real but the default (this is an awkward thing to think because it sounds like I’m trying to manage Hassabis’s social environment deceptively, and usually I would think that “reinforcing narratives” isn’t a main thing to worry about and that one should instead just say what one thinks, but still, my instincts say to worry about it here, which might be incorrect).