Okay, I talked more about what conclusions we can draw from LLMs that actually generalize to superhuman AI here, so go check that out:
https://www.lesswrong.com/posts/tDkYdyJSqe3DddtK4/alexander-gietelink-oldenziel-s-shortform#mPaBbsfpwgdvoK2Z2
The really short summary is that human values are less complicated and more dependent on data than people thought, and that we can specify our values fairly easily without it going drastically wrong.
This is not a property of LLMs, but of us.
Is "here" supposed to be a link?
I rewrote the comment to put the link immediately below the first sentence.
The link is at the very bottom of the comment.