Value-symmetry: “Will AI systems in the critical period be equally useful for different values?”
This could fail if, for example, we can build AI systems that are very good at optimizing for easy-to-measure values but significantly worse at optimizing for hard-to-measure ones. It might be easy to build a sovereign AI that maximizes a company's profit, but hard to build one that cares about humans and what they want.
Evan Hubinger has some operationalizations of things like this here and here.