This won’t work, for the same reason all similar schemes fail: you can’t “prove” the gatekeeper down to the quark level of its hardware (so you’re vulnerable to some kind of side attack, like the memory bit-flipping attack that was discussed recently), nor can you shield the AI from communicating through side channels (for example, by varying the temperature of its internal processing unit, which in turn will influence the air conditioning system, …).
And that’s not even considering that the AI could actually discover new physics (new particles, …) and have some ability to manipulate them with its own hardware.
This whole class of approach can’t work, because there are just too many possible side attacks and side channels of communication, and you can’t formally prove that none of them are available without constructing a proof over the whole system (AI + gatekeeper + power generator + air conditioner + …) down at the level of the Schrödinger equation.
You’re quite right; these are among the standard objections to boxing, as mentioned in the post. However, AI boxing may have value as a stopgap at an early stage, so I’m wondering about the idea’s value in that context.