Eliezer Yudkowsky comments on Why Yudkowsky is wrong about “covalently bonded equivalents of biology”

Eliezer Yudkowsky 6 Dec 2023 20:03 UTC
21 points
8
“Pandemics” aren’t a locally valid substitute step in my own larger argument, because an ASI needs its own manufacturing infrastructure before it makes sense for the ASI to kill the humans currently keeping its computers turned on. So things that kill a bunch of humans are not a valid substitute for being able to, eg, take over and repurpose the existing solar-powered micron-diameter self-replicating factory systems, aka algae, and those repurposed algae being able to build enough computing substrate to go on running the ASI after the humans die.
It’s possible this argument can and should be carried without talking about the level above biology, but I’m nervous that this causes people to start thinking in terms of Hollywood movie plots about defeating pandemics and hunting down the AI’s hidden cave of shoggoths, rather than hearing, “And this is a lower bound but actually in real life you just fall over dead.”
- Lucius Bushnaq 6 Dec 2023 22:21 UTC
  16 points
  2
  Parent
  “Pandemics” aren’t a locally valid substitute step in my own larger argument, because an ASI needs its own manufacturing infrastructure before it makes sense for the ASI to kill the humans currently keeping its computers turned on.
  When people are highly skeptical of the nanotech angle yet insist on a concrete example, I’ve sometimes gone with a pandemic coupled with limited access to medications that temporarily stave off, but don’t cure, that pandemic as a way to force a small workforce of humans preselected to cause few problems to maintain the AI’s hardware and build it the seed of a new infrastructure base while the rest of humanity dies.
  
  I feel like this has so far maybe been more convincing and perceived as “less sci-fi” than Drexler-style nanotech by the people I’ve tried it on (small sample size, n<10).
  
  Generally, I suspect not basing the central example on a position on one side of yet another fierce debate in technology forecasting trumps making things sound less like a movie where the humans might win. The rate of people understanding that something sounding like a movie does not imply the humans have a realistic chance at winning in real life just because they won in the movie seems, in my experience with these conversations so far, to exceed the rate of people getting on board with scenarios that involve any hint of Drexler-style nanotech.
- Leopard 7 Dec 2023 10:58 UTC
  3 points
  −6
  Parent
  After reading Pope and Belrose’s work, a viewpoint of “lots of good aligned ASIs already building nanosystems and better computing infra” has solidified in my mind. And therefore, any accidentally or purposefully created misaligned AIs necessarily wouldn’t have a chance of long-term competitive existence against the existing ASIs. Yet, those misaligned AIs might still be able to destroy the world via nanosystems; as we wouldn’t yet trust the existing AIs with the herculean task of protecting our dear nature against the invasive nanospecies and all such. Byrnes voiced similar concerns in his point 1 against Pope&Belrose.
- Oliver Sourbut 6 Dec 2023 22:24 UTC
  3 points
  2
  Parent
  Gotcha, that might be worth taking care to nuance, in that case. e.g. the linked twitter (at least) was explicitly about killing people^[1]. But I can see why you’d want to avoid responses like ‘well, as long as we keep an eye out for biohazards we’re fine then’. And I can also imagine you might want to preserve consistency of examples between contexts. (Risks being misconstrued as overly-attached to a specific scenario, though?)
  
  I’m nervous that this causes people to start thinking in terms of Hollywood movie plots… rather than hearing, “And this is a lower bound...”
  
  Yeah… If I’m understanding what you mean, that’s why I said,
  
  It’s always worth emphasising (and you do), that any specific scenario is overly conjunctive and just one option among many.
  
  And I further think actually having a few scenarios up the sleeve is an antidote to the Hollywood/overly-specific failure mode. (Unfortunately ‘covalently bonded bacteria’ and nanomachines also make some people think in terms of Hollywood plots.) Infrastructure can be preserved in other ways, especially as a bootstrap. I think it might be worth giving some thought to other scenarios as intuition pumps.
  
  e.g. AI manipulates humans into building quasi-self-sustaining power supplies and datacentres (or just waits for us to decide to do that ourselves), then launches kilopandemic followed by next-stage infra construction. Or, AI invests in robotics generality and proliferation (or just waits for us to decide to do that ourselves), then uses cyberattacks to appropriate actuators to eliminate humans and bootstrap self-sustenance. Or, AI exfiltrates itself and makes oodles of ~~horcruxes~~ backups, launches green goo with genetic clock for some kind of reboot after humans are gone (this one is definitely less solid). Or, AI selects and manipulates enough people willing to take a Faustian bargain as its intermediate workforce, equips them (with strategy, materials tech, weaponry, …) to wipe out everyone else, then bootstraps next-stage infra (perhaps with human assistants!) and finally picks off the remaining humans if they pose any threat.
  
  Maybe these sound entirely barmy to you, but I assume at least some things in their vicinity don’t. And some palette/menu of options might be less objectionable to interlocutors while still providing some lower bounds on expectations.
  ↩︎
  admittedly Twitter is where nuance goes to die, some heroic efforts notwithstanding
- roha 7 Dec 2023 10:41 UTC
  2 points
  0
  Parent
  An attempt to optimize for a minimum of abstractness, picking up what was communicated here:
  1. How could an ASI kill all humans? Setting off several engineered pandemics a month with a moderate increase of infectiousness and lethality compared to historical natural cases.
  2. How could an ASI sustain itself without humans? Conventional robotics with a moderate increase of intelligence in planning and controlling the machinery.
  People coming in contact with that argument will check its plausibility, as they will with a hypothetical nanotech narrative. If so inclined, they will come to the conclusion that we may very well be able to protect ourselves against that scenario, either by prevention or mitigation, to which a follow-up response can be a list of other scenarios at the same level of plausibility, derived from not being dependent on hypothetical scientific and technological leaps. Triggering this kind of x-risk skepticism in people seems less problematic to me than making people think the primary x-risk scenario is far fetched sci-fi and most likely doesn’t hold to scrutiny by domain experts. I don’t understand why communicating a “certain drop dead scenario” with low plausibility seems preferable over a “most likely drop dead scenario” with high plausibility, but I’m open to being convinced that this approach is better suited for the goal of x-risk of ASI being taken seriously by more people. Perhaps I’m missing a part of the grander picture?
  - Eliezer Yudkowsky 7 Dec 2023 17:37 UTC
    11 points
    5
    Parent
    It’s false that currently existing robotic machinery controlled by moderately smart intelligence can pick up the pieces of a world economy after it collapses. One well-directed algae cell could, but not existing robots controlled by moderate intelligence.
    - roha 8 Dec 2023 11:07 UTC
      1 point
      0
      Parent
      The question in point 2 is whether an ASI could sustain itself without humans and without new types of hardware such as Drexler style nanomachinery, which to a significant portion of people (me not included) seems to be too hypothetical to be of actual concern. I currently don’t see why the answer to that question should be a highly certain no, as you seem to suggest. Here are some thoughts:
      The world economy is largely catering to human needs, such as nutrition, shelter, healthcare, personal transport, entertainment and so on. Phenomena like massive food waste and people stuck in bullshit jobs, to name just two, also indicate that it’s not close to optimal in that. An ASI would therefore not have to prevent a world economy from collapsing or pick it up afterwards, which I also don’t think is remotely possible with existing hardware. I think the majority of processes running in the only example of a world economy we have is irrelevant to the self-preservation of an ASI.
      An ASI would presumably need to keep it’s initial compute substrate running long enough to transition into some autocatalytic cycle, be it on the original or a new substrate. (As a side remark, it’s also thinkable that it might go into a reduced or dormant state for a while and let less energy- and compute-demanding processes act on its behalf until conditions have improved on some metric). I do believe that conventional robotics is sufficient to keep the lights on long enough, but to be perfectly honest, that’s conditioned on a lack of knowledge about many specifics, like exact numbers of hardware turnover and energy requirements of data centers capable of running frontier models, the amount and quality of chips currently existing on the planet, the actual complexity of keeping different types of power plants running for a relevant period of time, the many detailed issues of existing power grids, etc. I weakly suspect there is some robustness built into these systems that stems not only from the flexible bodies of human operators or from practical know how that can’t be deduced from the knowledge base of an ASI that might be built.
      The challenge would be rendered more complex for an ASI if it were not running on general-purpose hardware but special-purpose circuitry that’s much harder to maintain and replace. It may additionally be a more complex task if the ASI could not gain access to its own source code (or relevant parts of it), since that presumably would make a migration onto other infrastructure considerably more difficult, though I’m not fully certain that’s actually the case, given that the compiled and operational code may be sufficient for an ASI to deduce weights and other relevant aspects.
      Evolution presumably started from very limited organic chemistry and discovered autocatalytic cycles based on biochemistry, catalytically active macromolecules and compartmentalized cells. That most likely implies that a single cell may be able to repopulate an entire planet that is sufficiently earth-like and give rise to intelligence again after billions of years. That fact alone certainly does not imply that thinking sand needs to build hypothetical nanomachinery to win the battle against entropy over a long period of time. Existing actuators and chips on the planet, the hypothetical absence of humans, and a HLAI or ASI moderately above it may be sufficient in my current opinion.
      - Eliezer Yudkowsky 10 Dec 2023 19:39 UTC
        5 points
        1
        Parent
        I rather expect that existing robotic machinery could be controlled by ASI rather than “moderately smart intelligence” into picking up the pieces of a world economy after it collapses, or that if for some weird reason it was trying to play around with static-cling spaghetti It could pick up the pieces of the economy that way too.
        roha 11 Dec 2023 11:41 UTC
        1 point
        0
        Parent
        It seems to me as if we expect the same thing then: If humanity was largely gone (e.g. by several engineered pandemics) and as a consequence the world economy came to a halt, an ASI would probably be able to sustain itself long enough by controlling existing robotic machinery, i.e. without having to make dramatic leaps in nanotech or other technology first. What I wanted to express with “a moderate increase of intelligence” is that it won’t take an ASI at the level of GPT-142 to do that, but GPT-7 together with current projects in robotics might suffice to get the necessary planning and control of actuators come into existence.
        If that assumption holds, it means an ASI might come to the conclusion that it should end the threat that humanity poses to its own existence and goals long before it is capable of building Drexler nanotech, Dyson spheres, Von Neumann probes or anything else that a large portion of people find much too hypothetical to care about at this point in time.