How are so many of you this certain of doom?
I find many pathways to hostile or simply amoral AI plausible. I also find many potential security problems in supposedly safe approaches plausible. And I find the criticism of existing alignment systems plausible.
But “we are not sure whether the security will work, and we can imagine bad things emerging, and our solutions for that seem insufficient” seems a huge step from claiming a near-*certainty* of AGI emerging evil and killing literally everyone.
I do not understand where this certainty comes from, when dealing with systems that are, by their nature, hypothetical and unknown. A lot of the explanations feel like the conclusion was fixed first, and the reasoning was worked out backwards from it.
E.g. I find it plausible that many (not all) AIs will automatically begin seeking power and resources and self-preservation as one of their goals, to a degree.
I do not find it plausible that this automatically entails killing even the tiniest threat and devouring every last atom. And the argument for the former does not seem to carry over to the latter.
Like, I can build a story that gets to that end result: an AI that wants maximum safety and power above all else, and takes no chances. But I do not see why that would be the only story.
As a person who is, myself, extremely uncertain about doom—I would say that doom-certain voices are disproportionately outspoken compared to uncertain ones, and uncertain ones are in turn outspoken relative to voices generally skeptical of doom. That doesn’t seem too surprising to me, since (1) the founder of the site, and the movement, is an outspoken voice who believes in high P(doom); and (2) the risks are asymmetrical (much better to prepare for doom and not need it, than to need preparation for doom and not have it.)
I think one of the things rationalists try to do is take the numbers seriously from a consequentialist/utilitarian perspective. This means that even if there’s a small chance of doom, you should put vast resources towards preventing it since the expected loss is high.
I think this makes people think that the expectations of doom in the community are much higher than they actually are, because the expected value of preventing doom is so high.
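To make that expected-value arithmetic concrete, here is a toy comparison; every number in it is invented purely for illustration, not anyone's actual estimate from this thread:

```python
# Toy expected-value comparison; all figures are made up for illustration.
p_doom = 0.05            # a "small" subjective probability of doom
loss_if_doom = 8e9       # crude stand-in for the stakes (e.g. everyone alive)
prevention_cost = 1e6    # hypothetical cost of a serious prevention effort, same units

expected_loss = p_doom * loss_if_doom   # 4e8, dwarfing the prevention cost
print(expected_loss > prevention_cost)  # True: fund prevention even at low p_doom
```

Even with a probability most people would call small, the expected loss dominates the prevention cost, which is why the community can sound urgent about prevention regardless of each member's personal P(doom).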
While rationalists would take even small numbers seriously, a lot of rationalists do assign a two-digit percentage chance to doom.
It’s not like an asteroid strike or a Yellowstone eruption, where you have a very low risk that’s still worth taking seriously.
From how the discrepancy between the time/resources allocated to alignment research and to capability research looks to a layperson (to me), the doom scenario is closer to a lottery than to a story. I don’t see why we would hold the winning number. I am 99.999% sure that ASI will be proactive (and all kinds of synonyms of that word). It can mostly be summarised as “fast takeoff” plus “human values are fragile”.
I do find the discrepancy deeply worrying, and have argued before that calling for more safety funding (and potentially engaging in civil disobedience for it) may be one of the most realistic and effectual goals for AI safety activism. I do think it is ludicrous to spend so little on it in comparison.
But again… does this really translate to a proportional probability of doom? I don’t find it completely implausible that getting a capable AI requires more money than aligning an AI. In part because I can imagine lucky sets of coincidences that lead to AIs gaining some alignment through interaction with aligned humans and the consumption of human-aligned training data, but I cannot really imagine lucky sets of coincidences that lead to humans accidentally inventing an artificial intelligence. It seems like the latter needs funding and precision in all worlds, while in the former they would merely be extremely desirable, not 100% necessary, or at least not to the same degree.
(Analogously, humans have succeeded in raising ethical children and taming animals, even when they did not really know what they were doing. By contrast, the human track record in creating artificial life is characterised by a need for extreme precision, lengthy trial and error, and a lot of expense. I find it more plausible that a poor Frankenstein would manage to make his monster friendly by not treating it like garbage than that a Frankenstein would manage to create a working zombie while poor.)
>I do find the discrepancy deeply worrying, and have argued before that calling for more safety funding (and potentially engaging in civil disobedience for it) may be one of the most realistic and effectual goals for AI safety activism.
OpenAI is the result of calling for safety funding.
There’s generally not much confidence that a highly political project where a lot of money is spent in the name of safety research will actually produce safety.
I don’t think aligning AI requires more money than creating a capable AI. The problem is that AI alignment looks like a long-term research project, while AGI capability looks like a much shorter-term development project that merely requires a lot of mostly known resources. So on current trajectories, highly capable AGI will have largely unknown alignment.
This is absolutely not a thing we should leave to chance. Early results from recent pre-AGIs are much more in line with my more pessimistic concerns than with my optimistic hopes. I’m still far from certain of doom, but I still think we as a civilization are batshit insane for pursuing AGI without having extremely solid foundations to ensure that it will be safe and stay safe.
Oh, hard agree on that.
To use Bostrom’s analogy of the sparrows dragging an owl egg into their home to incubate because they think a tame owl would be neat: I think tame owls are totally possible, and I wouldn’t say I can be certain all those sparrows will get snacked. But I would definitely say that the sparrows are being bloody stupid, and that I would be one of the sparrows focussing on trying to condition the owl to be good, rather than overfeeding it so it grows even more quickly to a size where sparrows become snack-sized.
We might be in a somewhat better position, because owls are hard-wired predators (I assume Bostrom deliberately chose them because they are large birds that hunt small animals, notoriously destructive and notoriously hard to tame), while what we dragged home is basically the egg of a completely unknown animal. It could, through sheer coincidence, be friendly (maybe we got a literal black swan? They are huge, but vegan), or, slightly more plausibly, it could at least be more malleable in adopting sparrow customs than an owl would be (parrots are extremely smart, friendly/social, and mostly vegetarian), so we might be luckier. I mean, it currently looks like the weird big bird is mostly behaving, but we worry that it isn’t behaving right for the right reasons, and it may very well stop once it gets larger. And yet everyone is carting home more random eggs and pouring in food faster so they get the biggest bird. This whole situation does give me nightmares.
>But again… does this really translate to a proportional probability of doom?
If you buy a lottery ticket and get all (all n out of n) numbers right, then you get a glorious transhumanist utopia (some people will still get very upset). And if you get a single number wrong, then you get a weirdtopia, or maybe a dystopia. There is an unknown quantity of numbers to guess, and a single ticket costs a billion now (and here enters the discrepancy). Where do I get so many losing tickets from? From Mind Design Space. There is also an alternative view that suggests the space of possibilities is much smaller.
It is not enough to get some alignment, and it seems that we need to get clear on the difference between utility maximisers (ASI and AGI) and behaviour executors (humans and dogs and monkeys). That is what the “AGI is proactive (and synonyms)” part is based on.
So the probability of doom is proportional to the probability of buying a losing ticket (one that doesn’t get all the numbers right).
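A minimal sketch of that lottery picture, assuming (purely for illustration) that utopia requires getting n independent “numbers” right, each with probability p_per_number; the values of n and p below are invented, not estimates from this thread:

```python
# Toy "lottery" model: utopia requires getting all n independent "numbers" right;
# missing any one of them counts as losing. n and p_per_number are illustrative guesses.
def p_doom(n: int, p_per_number: float) -> float:
    p_all_right = p_per_number ** n   # probability that every guess is correct
    return 1 - p_all_right

print(p_doom(n=10, p_per_number=0.90))  # ~0.65: good odds per number still compound badly
print(p_doom(n=3, p_per_number=0.90))   # ~0.27: the "smaller space of possibilities" alternative
```

The point is structural: when success is conjunctive and failure is disjunctive, losing tickets dominate unless the space of numbers to guess is small, which is exactly the alternative mentioned above.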