If the language model has common sense, we could set it up with a prompt like: “Do the good thing. Don’t do the bad thing.” and then add a smarter AI that would optimize for whatever the language model approves of.
...and then the Earth would get converted to SolidGoldMagikarp.