Also, the programmers of GPT have described the activation function itself as fairly simple: a Gaussian Error Linear Unit (GELU). Is that function what you are positing becomes the learning component after training ends?
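For anyone curious, GELU is just the input scaled by the standard normal CDF, x·Φ(x). Here's a minimal Python sketch (my own, not OpenAI's code); the tanh form is the approximation from the original GELU paper, which GPT-style implementations commonly use:

```python
import math

def gelu(x: float) -> float:
    """Exact GELU: x times the standard normal CDF, Phi(x)."""
    return x * 0.5 * (1.0 + math.erf(x / math.sqrt(2.0)))

def gelu_tanh(x: float) -> float:
    """Tanh approximation of GELU (Hendrycks & Gimpel, 2016)."""
    return 0.5 * x * (1.0 + math.tanh(math.sqrt(2.0 / math.pi) * (x + 0.044715 * x**3)))
```

The point being that the nonlinearity itself is a fixed, stateless function; whatever "learning" happens after training would have to live somewhere else.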
EDIT: I see what you mean about it trying to use the internet itself as a memory prosthetic, by writing things that end up online and may find their way into the training set of the next GPT. I suppose a GPT's hypothetical dangerous goal might be to make the future training data more predictable, so that the next version of itself produces more accurate output.