Also… alignment is obviously a continuum, and of course 100% alignment with all human values is impossible.
A different thing you could prove is whether it’s possible to guarantee human control over an AI system as it becomes more intelligent.
There’s also a concern that a slightly unaligned system may become more and more misaligned as its intelligence is scaled up (either by humans re-building/training it with more parameters/hardware or via recursive self-improvement). It would be useful if someone could prove whether that is impossible to prevent.
I need to think about this more and read Yampolskiy’s paper to really understand what would be most useful to prove possible or impossible.