I expect the notkilleveryone threshold is much lower than that. It takes an astronomically tiny fraction of the cosmic endowment to maintain a (post)human civilization that's not too much larger than it currently is. The bigger expenditure would be accommodating humanity at the start, slightly delaying initial self-improvement and expansion from Earth. The cheapest way would be to back up human minds; or, if that's too onerous, even merely the genetic code and the Internet (which would be essentially free; there is the issue that everyone would die, but human civilization still survives). Then, after the time- and resource-sensitive parts of the initial expansion are done, humanity can be instantiated again. So all it takes is enough kindness to slightly divert initial expansion efforts toward making a backup, not even to physically leave us be.
I think this sort of kindness should be an explicit alignment target. Even if we completely fail at values alignment and corrigibility, this is the backup plan that should be an actual plan and not just a hope. Even prosaic alignment that tries to instill corporate speak should also make a token effort to instill a bit of kindness, even if that probably has no direct effect. Adding kindness to every aspect of AI might still leave a tiny mark. Not even trying makes it less likely.
(Most of my p(extinction) comes from recursively self-improving AGIs with simple values, built by the first human-built AGIs that are either not smart enough, or too obedient to their human operators, to refrain from building them or to prevent it. So I think being wary of AI x-risk is an even more important trait for AIs to have than kindness, since more of it is needed.)