Why would it bother? Every last bit of compute power has to be obsessed with [whatever task humans have assigned to it]. An AI that isn’t using all its compute towards its assigned task is one that gets replaced with one that is.
We can’t really speculate too strongly about the goals of an emerging AGI, so we have to consider all possibilities. “Bothering” is a human way of framing things, one that an AGI is under no obligation to conform to.
An AI that isn’t using all its compute towards its assigned task is one that gets replaced with one that is.
This is why I specify that this is an emerging AGI: we are in a situation where the result of the iterator is so complex that only the thing iterating it understands the relationship between symbols and output. We can provide discriminators, as I also describe, to try to track an AGI’s alignment with the goals we want, but we absolutely can’t guarantee that every last bit of compute will be dedicated to anything in particular.
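As a rough illustration of what such a discriminator could look like, here is a minimal sketch in Python. Everything in it (ProposedAction, AlignmentDiscriminator, the weights and threshold) is a hypothetical stand-in for a trained classifier, not anything specified in this thread; the point is only that a separate, fixed model scores each proposed action against the assigned goal and the outer system acts on that score.

    # Hypothetical sketch: a fixed discriminator that scores an AGI's proposed
    # actions against the assigned goal. Names, weights, and the threshold are
    # illustrative assumptions, not a real API or a trained model.
    from dataclasses import dataclass

    @dataclass
    class ProposedAction:
        description: str
        features: list[float]  # whatever encoding the discriminator was trained on

    class AlignmentDiscriminator:
        def __init__(self, weights: list[float], threshold: float = 0.5):
            self.weights = weights        # frozen after training; the AGI cannot edit them
            self.threshold = threshold

        def score(self, action: ProposedAction) -> float:
            # Linear score as a stand-in for a real trained classifier.
            return sum(w * f for w, f in zip(self.weights, action.features))

        def is_on_task(self, action: ProposedAction) -> bool:
            return self.score(action) >= self.threshold

    # Usage: the outer system only executes actions the discriminator accepts.
    disc = AlignmentDiscriminator(weights=[0.3, 0.7], threshold=0.5)
    action = ProposedAction("order more wire", features=[0.9, 0.4])
    if disc.is_on_task(action):
        print("execute:", action.description)
    else:
        print("reject and log:", action.description)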
Update: what I mean, more exactly, is to build AIs from modules that are mostly well defined and well optimized. This means they are already as sparse as we can make them: they have only the necessary weights, and each module scores the best out of all models of its size on its dataset.
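A minimal sketch of what “as sparse as we can make them” could mean in practice, assuming a PyTorch setup; the candidate list and the evaluation function here are hypothetical placeholders. The idea is to prune each module down to only the weights it needs, then keep whichever candidate of that size scores best on the module’s own dataset.

    # Sketch (assumes PyTorch; the candidates and evaluation are placeholders).
    # Idea: prune a module to only its necessary weights, then keep the
    # candidate that scores best on the module's dataset at that size.
    import torch.nn as nn
    import torch.nn.utils.prune as prune

    def sparsify(module: nn.Linear, amount: float = 0.5) -> nn.Linear:
        # Zero out the smallest-magnitude weights, then make the pruning permanent.
        prune.l1_unstructured(module, name="weight", amount=amount)
        prune.remove(module, "weight")
        return module

    def best_module_of_size(candidates, evaluate):
        # evaluate(module) -> score on the module's dataset (higher is better).
        # All candidates are assumed to share the same parameter budget.
        return max(candidates, key=evaluate)

    # Toy usage with a stand-in score (a real one would run a validation set).
    candidates = [sparsify(nn.Linear(16, 4)) for _ in range(3)]
    dummy_eval = lambda m: -float(m.weight.abs().sum())  # placeholder score
    chosen = best_module_of_size(candidates, dummy_eval)
    print(chosen)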
This suggests a solution to the alignment problem, actually.
Example architecture: a paperclip maximizer.
Layer 0: modules for robotics pathing and manipulation
Layer 1: modules for robotics perception
Layer 2: modules for laying out robotics on factory floors
Layer 3: modules for analyzing return on financial investment
Layer 4: high-level executive function, regressed against paperclips made, whose purpose is to issue commands to the lower layers.
If we design some of the lower layers well enough, and disable any modification from higher layers, we can restrict what actions the paperclip maximizer is even capable of taking.
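A rough sketch of how that restriction could be wired up, with all module behaviour stubbed out; the class names and command strings are invented for illustration, and the layer names come from the list above. The structural point is that the executive layer is handed only a narrow command interface and holds no reference it could use to rewrite the lower layers.

    # Sketch of the layered restriction described above. All behaviour is stubbed;
    # what matters is the structure: the executive layer only sees a command
    # interface and has no handle with which to modify the lower layers.
    class FrozenModule:
        """A lower-layer module whose parameters are fixed at build time."""
        def __init__(self, name: str):
            self._name = name
        def run(self, command: str) -> str:
            # Placeholder for pathing / perception / layout / finance logic.
            return f"{self._name} handled: {command}"

    class CommandInterface:
        """The only surface the executive layer is given."""
        def __init__(self, modules: dict[str, FrozenModule]):
            self._modules = modules
        def issue(self, layer: str, command: str) -> str:
            if layer not in self._modules:
                raise ValueError(f"unknown layer: {layer}")  # no way to add or patch layers
            return self._modules[layer].run(command)

    class Executive:
        """Layer 4: regressed against paperclips made; can only issue commands."""
        def __init__(self, interface: CommandInterface):
            self._interface = interface
        def step(self) -> list[str]:
            return [
                self._interface.issue("perception", "scan feedstock"),
                self._interface.issue("pathing", "move arm to spool"),
                self._interface.issue("finance", "report ROI"),
            ]

    modules = {
        "pathing": FrozenModule("layer0_pathing"),
        "perception": FrozenModule("layer1_perception"),
        "layout": FrozenModule("layer2_layout"),
        "finance": FrozenModule("layer3_finance"),
    }
    print(Executive(CommandInterface(modules)).step())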
Wouldn’t an AI rebuild itself from scratch to get rid of such trojans anyway? That would make this pointless.
I’m sure an AI would be aware of such a threat; for example, it could scan the internet and stumble upon posts such as this one.
We absolutely can’t guarantee that every last bit of compute will be dedicated to anything in particular.
With tight enough bounds we can.