Would that not be the case with *any* form of deceptive alignment, though? Surely it (deceptive alignment) wouldn’t pose a risk at all if that were the case? Sorry in advance for my stupidity.
Not really, because it takes time to train the cognitive skills necessary for deception.
You might expect this if your AGI was built with a “capabilities module” and a “goal module” and the capabilities were already present before putting in the goal, but it doesn’t seem like AGI is likely to be built this way.