I think the “non universal optimizer” point is crucial; that really does seem to be a weakness in many of the canonical arguments. And as you point out elsewhere, humans don’t seem to be universal optimizers either.
Do you think there’s “human risk,” in the sense that giving a human power might lead to bad outcomes? If so, then why wouldn’t the same apply to AIs that aren’t universal optimizers?
It seems to me that one could argue that humans have various negative drives that we could simply not program into the AI, but I think this misses several important points. For example, one negative human behavior is ‘gaming the system’: following the letter of regulations while ignoring their spirit, or using unintended techniques to get high scores. But it seems difficult to build a system that can do any better than its training data without having it fall prey to gaming the system. One needs to convey not just the goal in terms of rewards, but the full concept of what is desired and what is not.
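The gap between a reward signal and the designer's intent can be sketched with a toy example (entirely hypothetical; the action names and scores are invented for illustration): an agent that optimizes the literal proxy reward picks a ‘gaming’ action that the intended objective would score poorly.

```python
# Toy illustration of 'gaming the system': the agent optimizes the
# proxy reward it was given, not the designer's true intent.
# All actions and numbers below are invented for illustration.

# Each action: (name, proxy_reward, true_value_to_designer)
actions = [
    ("do the task properly",       8.0, 10.0),
    ("exploit a scoring loophole", 12.0, -5.0),  # high proxy reward, bad outcome
    ("do nothing",                  0.0,  0.0),
]

def optimize(actions, key):
    """Pick the action that maximizes the given objective."""
    return max(actions, key=key)

chosen = optimize(actions, key=lambda a: a[1])    # what the agent maximizes
intended = optimize(actions, key=lambda a: a[2])  # what the designer wanted

print(chosen[0])    # prints "exploit a scoring loophole"
print(intended[0])  # prints "do the task properly"
```

The point of the sketch is that nothing in the proxy reward distinguishes the loophole from the honest action; that distinction lives only in the "full concept" column the agent was never given.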
I agree that non-universal optimizers are not necessarily safe. There’s a reason I wrote “many,” not “all,” canonical arguments. In addition to gaming the system, there’s also the time-honored technique of rewriting the rules. I’m concerned about possible feedback loops: evolution brought about the values we know and love in a very specific environment, and if that context changes while evolution accelerates, I foresee a problem.
Human beings have succeeded so far in not wiping themselves out. The fossil record, as far as we can tell, leaves no trace of technological civilizations that wiped themselves out. So the evidence so far points against existential risk from putting people in positions of power. (It’s an aside, but the history of humanity has shown that centralizing power actually reduces violence, and that the periods of greatest strife coincide with anarchy, e.g. the invasion of the Sea Peoples.)
Even that aside, I don’t think anyone is seriously considering building an omnipotent overlord AI and putting it in charge of the world, are they? That sounds like an utterly dystopian future I’d want no part in personally. So the real question is whether groups of machine intelligences and humans, or more likely humans augmented by machine intelligences, will do better than baseline humans at societal governance and risk. In other words, an environment where no one individual (human or machine) has absolute sovereign control, but rather lives in accordance with the enforced rules of society, even if there are differing distributions of power—no one and no thing is above the law. I have not, so far, seen any compelling evidence that the situation with machines is any different than with humans, or that either is qualitatively different from the status quo.
> Human beings have succeeded so far in not wiping themselves out. The fossil record, as far as we can tell, leaves no trace of technological civilizations that wiped themselves out.

I don’t find that reassuring.
> Even that aside, I don’t think anyone is seriously considering building an omnipotent overlord AI and putting it in charge of the world, are they? That sounds like an utterly dystopian future I’d want no part in personally.
This seems like a natural consequence of predictable incentives to me. For example, potentially biased and corrupt police get replaced by robocops, who are cheaper and replaceable. As soon as it becomes possible to make an AI manager, I expect companies that use them to start seeing gains relative to companies that don’t. And if it works for companies, it seems likely to work for politicians. And...
> So the real question is whether groups of machine intelligences and humans, or more likely humans augmented by machine intelligences, will do better than baseline humans at societal governance and risk.
I think ‘groups of machine intelligences’ has connotations that I don’t buy. For example, everyone has Siri in their pocket, but there’s only one Siri; there won’t be a social class of robot doctors, there will just be Docbot, who knows everyone’s medical data (and as a result can make huge advances in medical science and quality of treatment). And in that context, it doesn’t seem surprising that you might end up with Senatebot that knows everyone’s political preferences and writes laws accordingly.