I disagree. Agent AIs are harder to predict than tool AIs almost by definition—not just for us, but also for other AIs. So what an AI would want to do is create more tool AIs, and make very sure they obey it.
But an AI could design a modification of itself that turns it into an agent obedient to a particular goal.
That’s a very agenty thing to do...
Yeah, okay, wrong semantics. I should have said "make very sure they report their activities truthfully and are fully compliant with any instructions given at any time."