In AI safety, a tripwire is a mechanism designed to detect signs of misalignment in an advanced artificial intelligence and shut it down automatically.
In AI safety, a tripwire is a mechanism designed to detect signs of misalignment in an advanced artificial intelligence and shut it down automatically.