I definitely think it’s interesting to understand and control whether a model is pursuing a long-horizon goal (though talking about the “goal” of a model seems quite slippery).
I think that most work on alignment doesn’t need to get into the difficulties of defining or arguing about human values. I’m normally focused more on goals like: “does my AI make statements that it knows to be unambiguously false?” (see ELK).