Yeah right, that is scarier. Looking forward to reading your argument, esp re why we would expect deceptive agents that score well to outnumber aligned agents that score well.
Although in the same sense we could say that a rock “contains” many deceptive agents, since if we viewed the rock as a giant mixture of computations, we would surely find some that implement deceptive agents.