I quite like this forecast from Andrew Critch on milestones in AI development. My reactions:
On the timeline he suggests, in ~10 years we face choice 6a/b, which implies at least 3 possibilities:
A) we need society-level consensus (which might be force-backed) that humans can/should control agents (or digital entities, more generally) who are, in all (economically/militarily) relevant aspects, superior to us. Assuming they fit within the moral circle as we currently conceive it (@davidchalmers42, @jeffrsebo, Thomas Metzinger, and Nick Bostrom / Carl Shulman have analysed this in various ways), and absent some novel claim about how AIs are different/lesser ethical beings, it is hard to see how this is essentially different from slavery or animal cruelty, something that will presumably be obvious to any AGI worth the name; or
B) we are able to engineer AI motivations to act harmlessly/subserviently in a way that is “better” than (A), which (wild guesses) could be a form of open individualism or an AI-specific conception of identity (e.g. Buddhist/Hindu and some indigenous traditions have more radically inclusive conceptions of identity than the Greco-Judeo-Christian human-centred frame that currently dominates AI ethics); or
C) we have some very solid person- and species-neutral grounds for why humans (and our ecosystem) are worth preserving, drafted in a way that is at least potentially reasonable in the ontology/value system of the most powerful/friendliest AIs.
To flesh out my thoughts on (C), I (reworking a 2023 LessWrong post by Miller, Häggström, Yampolskiy, Miller) wrote up a ‘letter to a future AGI’. I suspect this approach to (C) is fundamentally flawed: we can’t predict an AI’s value system/Weltanschauung/“form-of-life” (depending on your philosophical frame).
Nevertheless, a ‘Hail Mary’ justification from @avturchin is that we can perhaps influence proto-AGIs, which then pass on their representations (of our projections of their successors’ values) to future systems (e.g. via synthetic data or weight transfer).