I think this is an excellent response (I’d even say companion piece) to Joe Carlsmith’s also-excellent report on the risk from power-seeking AI. On a brief re-skim I think I agree with everything Nate says, though I’d have a lot more to add and I’d shift the emphasis around a bit. (I did in fact make some of the same points in my own review of Joe’s report.)
Why is it important for there to be a response? Well, the 5% number Joe came to at the end is just way too low. Even if you disagree with me about that, you’ll concede that a big fraction of the rationalist community—including some very well-respected, knowledgeable members—thinks 5% is way too low. So it’s important for their view to be at least partially represented.
Beyond that, I think this post presents some good ideas clearly, and is worth reading in its own right even if you never read Joe’s report. I just randomly scrolled to one of its sections, “Difficulty of alignment,” and right off the bat there are three bullet points that are unfortunately all too plausible & worth visualizing.