A paperclip maximizer decides for itself how to maximize paperclips; it can ignore human instructions. This SRS network can’t: it receives instructions and updates and follows them deterministically. Hence the question about secure communication between the SRS and the colonies: a paperclip maximizer doesn’t need that.
What is your distinction between “self-generated” evidence and evidence I can update anthropic reasoning on?