A highly compressed version of what the disagreements are about in my ontology of disagreements about AI safety...
crux about continuity; here GA mostly has the intuition that "things will be discontinuous", and this manifests in many guesses (phase shifts, new ways of representing data, the possibility of demonstrating "overpowering the overseer", …); Paul assumes things will be mostly continuous, with a few exceptions which may be dangerous
this seems similar to typical cruxes between Paul and e.g. Eliezer (also, in my view, this is actually a decent chunk of the disagreement: my model of Eliezer predicts that Eliezer would actually update toward more optimistic views if he believed "we will have more tries to solve the actual problems, and they will show up in a lab setting")
possible crux about x-risk from the broader system (e.g. AI-powered cultural evolution); here it's unclear who stands exactly where in this debate
I don’t think there is any neat public debate on this, but I usually disagree with Eliezer’s and similar “orthodox” views about the relative difficulty & expected neglectedness (I expect narrow single-ML-system “alignment” to be difficult but solvable, and likely solved by default because there are incentives to do so; whole-world alignment / multi-multi alignment to be difficult, with bad results by default)
(there are also many points of agreement)