There are ways of coming to believe that an AI is aligned that fall between “hoping it is aligned” and “having a formal proof that it is aligned”. For example, we might find sufficiently strong selection theorems, which tell us that certain types of optima tend to be selected, even if we can’t prove theorems with certainty. We also might find a working ELK strategy that gives us interpretability.
These might not be good strategies, but the statement “Therefore no AI built by current methods can be aligned” seems far too strong.