Some scattered thoughts on the topic:
We shouldn’t consider the problem “how do we persuade a putative Oracle to stay inside the box?” solved. If we just take a powerful optimizer and tell it to optimize for the truth of a particular formula, then in addition to self-fulfilling prophecies we can get plain old instrumental convergence, where the AI gathers knowledge and computing resources to give the most correct possible answer.
I have a distaste for design decisions that impair the cognitive abilities of an AI, because they are unnatural and just begging to be broken. I prefer weird utility functions to weird cognition.