Nathan Helm-Burger comments on The case for more Alignment Target Analysis (ATA)

Nathan Helm-Burger 26 Sep 2024 16:54 UTC
7 points
Relevant quote from Zvi: https://www.lesswrong.com/posts/FeqY7NWcFMn8haWCR/ai-83-the-mask-comes-off

Even if you do get to align the ASI you need to decide what you want it to value.
```
Roon: “human values” are not real nor are they nearly enough. asi must be divinely omnibenevolent to be at all acceptable on this planet.

in other words COHERENT EXTRAPOLATED VOLITION

This has stirred some controversy … “human values” are not real insofar as californian universalism isn’t universal and people very much disagree about what is right and just and true even in your own neighborhood.

It is not enough to give asi some known set of values and say just apply this. there is no cultural complex on earth that deserves to be elevated to a permanent stranglehold. if this is all there is we fall woefully short of utopia.
```

I continue to think that CEV won’t work, in the sense that even if you did it successfully and got an answer, I would not endorse that answer on reflection and I would not be happy with the results. I expect it to be worse than (for example) asking Roon to write something down as best he could—I’ll take a semi-fictionalized Californian Universalism over my expectation of CEV if those are the choices, although of course I would prefer my own values to that. I think people optimistic about CEV have a quite poor model of the average human. I do hope I am wrong about that.