This post has made me realise that constitutional design is surprisingly neglected in the AI safety community.
Designing the right constitution won’t save the world by itself, but it’s a potentially easy win that could put us in a better strategic situation down the line.
Yes, I do think constitution design is neglected! People may assume that constitution changes made now won’t stick around, or that they won’t make any difference in the long term. But based on the arguments here, I think that even if the effect is a bit diffuse, you can influence AI behavior on important structural risks by changing the constitution. It’s simple, cheap, and maybe quite effective, especially for failure modes where we don’t have any good shovel-ready technical interventions.
Interesting work.