For some time, I have planned to make a post calling for more people to actually try to solve the problem of alignment. I haven’t studied Stuart’s work in detail (something to be rectified soon); I always say June Ku’s metaethical.ai is the most advanced scheme we have, but as an unapologetic fan of CEV, I think this talk of value extrapolation is on the right track. I do wonder to what extent a solution to alignment for autonomous superhuman AI can lead (in advance) to spinoffs for narrower and less powerful systems. Superhuman alignment seems to require a determination of the full “human utility function”, or something similar; I suppose the extrapolation part might be relevant for lesser AI even if the full set of human values is not. But we shall learn more as Stuart’s scheme unfolds.
I will add that I am personally interested in contributing to this kind of research (paid work would be most empowering, but absent that, I will keep doing what I can, when I can, until we run out of time). However, my circumstances are a little unusual and might be incompatible with what some organizations require, so for now I’ll just mention my interest.
Thanks. Would you want to send me a message explaining your interest and your unusual circumstances (if relevant)?