Gotcha, that might be worth taking care to nuance, in that case. E.g. the linked tweet (at least) was explicitly about killing people[1]. But I can see why you’d want to avoid responses like ‘well, as long as we keep an eye out for biohazards we’re fine then’. And I can also imagine you might want to preserve consistency of examples between contexts. (Risks being misconstrued as overly attached to a specific scenario, though?)
I’m nervous that this causes people to start thinking in terms of Hollywood movie plots… rather than hearing, “And this is a lower bound...”
Yeah… If I’m understanding what you mean, that’s why I said,
It’s always worth emphasising (and you do) that any specific scenario is overly conjunctive and just one option among many.
And I further think actually having a few scenarios up one’s sleeve is an antidote to the Hollywood/overly-specific failure mode. (Unfortunately ‘covalently bonded bacteria’ and nanomachines also make some people think in terms of Hollywood plots.) Infrastructure can be preserved in other ways, especially as a bootstrap. I think it might be worth giving some thought to other scenarios as intuition pumps.
E.g.:
- AI manipulates humans into building quasi-self-sustaining power supplies and datacentres (or just waits for us to decide to do that ourselves), then launches a kilopandemic followed by next-stage infra construction.
- AI invests in robotics generality and proliferation (or just waits for us to decide to do that ourselves), then uses cyberattacks to appropriate actuators to eliminate humans and bootstrap self-sustenance.
- AI exfiltrates itself and makes oodles of horcrux-backups, launches green goo with a genetic clock for some kind of reboot after humans are gone (this one is definitely less solid).
- AI selects and manipulates enough people willing to take a Faustian bargain as its intermediate workforce, equips them (with strategy, materials tech, weaponry, …) to wipe out everyone else, then bootstraps next-stage infra (perhaps with human assistants!) and finally picks off the remaining humans if they pose any threat.
Maybe these sound entirely barmy to you, but I assume at least some things in their vicinity don’t. And a palette/menu of options might be less objectionable to interlocutors while still providing some lower bounds on expectations.
[1] admittedly Twitter is where nuance goes to die, some heroic efforts notwithstanding