Principal-agent problems certainly matter! But despite that, collaboration based on extrinsic rewards (instead of selfless agreement on every detail) has been a huge success story for mankind. Is our task unusually prone to principal-agent problems, compared to other tasks? In my experience, the opposite is true: AI alignment research is unusually easy to evaluate in detail, compared to checking the work of a contractor building your house or a programmer writing code for your company.