1) True, we don’t have any examples of this in nature. Would we expect them?
Let's say that to improve something, it is necessary and sufficient to understand it and to have some means of modifying it. There are plenty of examples of this; most of the complicated ones involve humans understanding some technology and designing a better version.
At the moment, the only minds able to understand complicated things are human ones, and we haven't seen much human self-improvement because neuroscience is hard.
I think it is fairly clear that there is a large in-practice gap between humans and the theoretical/physical limits on intelligence. Evidence for this includes neuron signals traveling at about a millionth of the speed of light, most of the heuristics-and-biases literature, and humans just sucking at arithmetic.
AIs working on AI research is a positive feedback loop, and probably quite a strong one. It seems that when a new positive feedback loop is introduced, the rate of progress should speed up, not slow down.
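As a concrete illustration of why closing the loop speeds things up, here is a toy model (my own sketch, not anything from the original argument): each step adds progress proportional to the researchers' ability, and the only thing that changes is whether that ability stays fixed (humans doing the research) or tracks the current capability (AIs doing AI research).

```python
# Toy model (illustrative assumption: progress per step is proportional
# to researcher ability). Without feedback, ability is fixed at the
# starting level; with feedback, the improved system does the research.

def simulate(steps=10, k=0.5, start=1.0, feedback=False):
    capability = start
    history = [capability]
    for _ in range(steps):
        researchers = capability if feedback else start
        capability += k * researchers          # progress this step
        history.append(capability)
    return history

print("fixed researchers:", [round(c, 1) for c in simulate()])
print("self-improvement: ", [round(c, 1) for c in simulate(feedback=True)])
# fixed researchers give linear growth; self-improvement gives exponential growth
```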
2) You attribute magical chess-game-winning powers to Stockfish. But how in particular would it win? Would it use its pawns, or advance a knight? The answer is that I don't know which move in chess is best, and I don't know what Stockfish will do. But these two probability distributions are strongly correlated, in the sense that I am confident Stockfish will make one of the best moves.
I don’t know what an ASI will do, and I don’t know where the vulnerabilities are, but again, I think these are correlated. If an SQL injection would work best, the AI will use an SQL injection. If a buffer overflow works better, the AI will use that.
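As a minimal sketch of the sort of vulnerability meant here (the table and input are made up for illustration): the unsafe query splices user input straight into the SQL string, so crafted input can rewrite the query itself; the parameterized version treats the same input as plain data.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE users (name TEXT)")
conn.execute("INSERT INTO users VALUES ('alice')")

attacker_input = "nobody' OR '1'='1"

# Vulnerable: input is concatenated straight into the query string,
# so it becomes: ... WHERE name = 'nobody' OR '1'='1'
unsafe = f"SELECT * FROM users WHERE name = '{attacker_input}'"
print(conn.execute(unsafe).fetchall())  # [('alice',)] -- injection succeeded

# Safe: a parameterized query keeps the input as data, not SQL.
safe = "SELECT * FROM users WHERE name = ?"
print(conn.execute(safe, (attacker_input,)).fetchall())  # [] -- no match
```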
There is an idea here that modern software is complicated enough that most of it is riddled with vulnerabilities. This picture is backed up by the existence of hacks like Stuxnet, where a big team of humans put a lot of resources and brainpower into hacking a particular "highly secure" target and succeeded.
I mean, it might be that P=NP and the AI finds a quick way to factorize the products of large primes. Or it might be that the AI gets really good at phishing instead.
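To give a sense of scale for the factoring option, here is a back-of-envelope calculation (my own illustration; the numbers are typical but the snippet is not from the original comment) of why brute force is hopeless without a mathematical shortcut.

```python
# Back-of-envelope (illustrative numbers): brute-force factoring of an
# RSA-sized semiprime is hopeless, which is why a mathematical shortcut
# (e.g. one implied by P=NP) would be such a big deal.
import math

bits = 2048                       # typical RSA modulus size
candidates = 2 ** (bits // 2)     # trial division only needs to reach sqrt(N)
rate = 10 ** 18                   # wildly optimistic: 10^18 divisions per second
years = candidates / rate / (3600 * 24 * 365)
print(f"roughly 10^{int(math.log10(years))} years of trial division")
```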
Some of the "AI has a magic ability to hack everything" is worst-case thinking. We don't want security to rest on the assumption that the AI can't hack a particular system.
It’ll be able to work up a virus to kill all humans and then hire some lab to make it… are we really sure about this?
(Naturally, if the AI is hiring a lab to make its kill-all-humans virus, it will have done its homework. It sets up a webpage claiming to be a small biotech startup. Claims that the virus is a prototype cancer cure. Writes a plausible-looking paper about a similar substance reducing cancer in rats. …)
I am not confident that it will be able to do that. But there are all sorts of things it could try, from strangelets to a really convincing argument for why we should all kill ourselves. And I expect the AI to be really good at working out which approach would work.
The power of reason is that it is easier to write convincing rational arguments for true things than for false things. I don't think there is a similarly convincing case for the dangers of time travel. (I mean, I wouldn't be surprised if there were some hypno-video that could convince me the earth was flat, but that's beside the point, because such a video wouldn't be anything like a rational argument.)