Here are my quick takes from skimming the post.
In short, the arguments I think are best are A1, B4, C3, C4, C5, C8, C9 and D. I don't find any of them devastating.
A1. Different calls to ‘goal-directedness’ don’t necessarily mean the same concept
I am not sure I parse this one. From the example, I am reading it as “AI systems might be more like imitators than optimizers”, which I find moderately persuasive.
A2. Ambiguously strong forces for goal-directedness need to meet an ambiguously high bar to cause a risk
I am not sure I understand this one either. I am reading it as “there might be no incentive for generality”, which I don't find persuasive; I think there is a strong incentive for generality.
B1. Small differences in utility functions may not be catastrophic
I don't find this persuasive. The evidence from optimization theory, where optimization tends to set variables to extreme values, is suggestive enough that I don't think this is the default.
B2. Differences between AI and human values may be small
B3. Maybe value isn’t fragile
The only example we have of general intelligence (humans) seems to have strayed pretty far from evolutionary incentives, so I find these unpersuasive.
B4. [AI might only care about] short-term goals
I find that somewhat persuasive, or at least not obviously wrong, similar to A1. There is a huge incentive for instilling long-term thinking, though.
C1. Human success isn’t from individual intelligence
I don't find this persuasive. I'm not convinced there is a meaningful difference between “a single AGI” and “a society of AGIs”; a single AGI could be running a billion independent threads of thought and outpace humans.
C2. AI agents may not be radically superior to combinations of humans and non-agentic machines
I don't find this persuasive. It seems unlikely that human-in-the-loop systems will have any advantage over pure machines.
C3. Trust
I find this plausible but not convincing
C4. Headroom
Plausible but not convincing. I don't find any of the particular examples of limited headroom persuasive, and I think the prior should be that there is a lot of headroom.
C5. Intelligence may not be an overwhelming advantage
I find this moderately persuasive, though not entirely convincing.
C6. Unclear that many goals realistically incentivise taking over the universe
I find this unconvincing. I think there are many reasons to expect that taking over the universe is a convergent goal.
C7. Quantity of new cognitive labor is an empirical question, not addressed
I don't find this very persuasive. In particular, I think there is a good chance that once we have AGI we will be in a hardware overhang and able to run tons of AGI-equivalents.
C8. Speed of intelligence growth is ambiguous
I find this plausible
C9. Key concepts are vague
Granted, but not a refutation in itself.
D1. The argument overall proves too much about corporations
I find this somewhat persuasive.