Noting that I don’t think alignment being “solved” is a binary. As discussed in the post, I think there are a number of measures that could improve our odds of getting early human-level-ish AIs to be aligned “enough,” even assuming no positive surprises on alignment science. This would imply that if lab A is more attentive to alignment and more inclined to invest heavily in even basic measures for aligning its systems than lab B, it could matter which lab develops very capable AI systems first.