it is not in fact the case that long term wanting appears in models out of nowhere. but short term wanting can accumulate into long term wanting, and more to the point, people are simply trying to build models with long term wanting on purpose.

Again, the question is why goals would arise without human intervention.

evolution, which is very fast for replicable software. but more importantly, humans will give AIs goals, and from there the point is much more obvious.
“Humans will give the AI goals” doesn’t answer the question as stated. It may or may not answer the underlying concerns.
(Edit: human-given goals are slightly less scary too.)
“evolution, which is very fast for replicable software”
Evolution by random mutation and natural selection is barely applicable here. The question is how goals and deceit would emerge under conditions of artificial selection. Since humans don’t want either, they would have to emerge together.
artificial selection is a subset of natural selection. see also memetic mutation. but why would human-granted goals be significantly less scary? plenty of humans will simply ask for the most destructive thing they can think of, because they can. if they could, people would have built and deployed nukes at home; even with the knowledge as hard to piece together and the tools as hard to obtain as they are, it has been attempted (and of course it didn’t get particularly far).
I do agree that the situation we find ourselves in is not quite as dire as it would be if the only kind of AI that worked at all were AIXI-like. but that offers little reassurance.
I do understand your objection about how goals would arise in the AI. I’m just not engaging deeply with the counterfactual you’re asking about, because on the point you want to disagree on, I simply agree, and I don’t find that it influences my views much.
“artificial selection is a subset of natural selection”
Yes. The question is: why would we artificially select what’s harmful to us? Even though artificial selection is a subset of natural selection, it’s a different route to danger.
“plenty of humans are just going to ask for the most destructive thing they can think of, because they can.”
The most destructive thing you can think of will kill you too.
yeah, the people who would do it are not deterred by the idea that it’ll kill them. maximizing doomsday-weapon strength just for the hell of it is in fact a thing some people try. unless we can defend against it, it will dominate, and it seems to me that current plans for defending against the key paths to superweaponhood are not yet plausible. we must end all vulnerabilities in biology and software; serious ideas for how to do that would be appreciated. otherwise, this is my last reply in this thread.
If everybody has some access to ASI, the crazy people do, and the sane people do as well. The good thing about ASI is that even active warfare need not be destructive: the white hats can hold off the black hats, because it’s all fought with bits.
A low-power actor would need a physical means to kill everybody, like a supervirus. So those are the portals you need to close.