This way the model behaves consistently and cannot act on hidden state to betray the user.
This is a standard software-engineering technique for building larger, more reliable systems; see stateless microservices or how ROS works.
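To make the statelessness point concrete, here is a minimal sketch (my own illustration, not from the original post): a handler that is a pure function of its request, so any replica gives the same answer for the same input and there is no hidden session state to drift.

```python
from dataclasses import dataclass

@dataclass(frozen=True)
class Request:
    """Everything the handler may depend on travels inside the request."""
    user_id: str
    payload: str

def handle(req: Request) -> str:
    # Pure function of its input: no globals, no session cache, so any
    # replica (or restart) produces the same answer for the same request.
    return f"echo:{req.user_id}:{req.payload}"

r = Request(user_id="u1", payload="hello")
# Two independent calls -- standing in for two replicas -- must agree.
print(handle(r) == handle(r))  # True
```

The reliability win is the same one microservices get: because behavior depends only on the request, you can restart, replicate, or swap the handler without changing what it does.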
For AI, consider the many documented cases where irrelevant context changes model behavior, such as the "grandma used to read me Windows license keys" jailbreak.
I interpret "aimability" as the model doing what the user most likely meant and nothing else; "aligned aimability" would then mean the probability of that goal being achieved is high.
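One way to make that reading concrete (the function name and threshold here are my own assumptions, not from the post): treat aimability as the probability that the outcome matches what the user most likely meant, and call it "aligned" when that probability clears a threshold.

```python
def aligned_aimability(p_goal_achieved: float, threshold: float = 0.95) -> bool:
    """Aimability: P(outcome == what the user most likely meant, and nothing else).

    'Aligned aimability' = that probability is high, i.e. above some threshold.
    The 0.95 default is arbitrary, purely for illustration.
    """
    if not 0.0 <= p_goal_achieved <= 1.0:
        raise ValueError("probabilities must lie in [0, 1]")
    return p_goal_achieved >= threshold

print(aligned_aimability(0.99))  # True
print(aligned_aimability(0.50))  # False
```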
Aimability doesn't mean reduced information.