I’m especially interested in the analogy between AI alignment and democracy. (I guess this goes under “Social Structures and Institutions”.) Democracy is supposed to align a superhuman entity with the will of the people, but there are a lot of failures, closely analogous to well-known AI alignment issues:
- politicians optimize for the approval of low-information voters, rather than truly optimizing for the people’s wellbeing (deceptive alignment; a toy sketch of this proxy-optimization dynamic follows the list)
- politicians, PACs, parties, and permanent bureaucrats are agents with their own goals that don’t align with the populace’s (mesa-optimizers)
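This is not from the original comment, just a minimal sketch of the first failure mode above, under made-up assumptions: a ‘politician’ agent hill-climbs on a distorted ‘approval’ proxy rather than on true wellbeing, and the two measures come apart. All dimensions, weights, and numbers are hypothetical.

```python
import numpy as np

rng = np.random.default_rng(0)

# Made-up numbers: a policy is a 5-dimensional vector.
# Dimension 4 is a "crowd-pleaser" voters reward but which actually hurts.
true_weights  = np.array([1.0, 1.0, 1.0, 1.0, -2.0])   # true wellbeing
proxy_weights = np.array([1.0, 0.0, 0.0, 0.0,  3.0])   # low-information approval

def wellbeing(policy):
    return float(true_weights @ policy)

def approval(policy):
    # Voters see only a distorted slice of the policy, plus noise.
    return float(proxy_weights @ policy + rng.normal(scale=0.1))

# The "politician" hill-climbs on approval and never looks at wellbeing.
policy = np.zeros(5)
for _ in range(500):
    candidate = policy + rng.normal(scale=0.05, size=5)
    if approval(candidate) > approval(policy):
        policy = candidate

print(f"approval:  {approval(policy):7.2f}")
print(f"wellbeing: {wellbeing(policy):7.2f}")
# Approval climbs steadily while wellbeing ends up negative: optimizing the
# proxy is not the same as optimizing the target it was meant to track.
```

Swap ‘approval’ for a learned reward model and the ‘politician’ for a policy under training, and the same skeleton reads as the AI version of the story.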
I think it’s more likely that insights will transfer from the field of AI alignment to the field of government design than vice versa: it’s easier to run experiments on the AI side, and the thinkers there are clearer.
> I’m especially interested in the analogy between AI alignment and democracy.
This is indeed a productive analogy. Sadly, on this forum, the analogy is used in 99% of cases to generate AI alignment failure mode stories, whereas I am much more interested in using it to generate useful ideas about AI safety mechanisms.
You may be interested in my recent paper ‘demanding and designing’, just announced here, where I show how to do this kind of useful idea generation: I transfer some insights about aligning powerful governments and companies to the problem of aligning powerful AI.