Becoming capable of building such a test is essentially the entire field of AI alignment. (Yes, we don’t have the ability to build such a test, and that’s bad, but the difficulty lives in the territory. MIRI’s previously stated goal was specifically to become less confused.)
Thanks for the feedback!
I’ll see if my random idea can be formalised in such a way as to constitute a (hard) test of cognition that is satisfying to humans.