A lucid analogy:
AI is a caveman looking at a group of cavemen shamans performing a strange ritual that summons many strange and somewhat useful beings. Some of the shamans say they’re going to eventually summon a Messiah that will solve all problems. Others say that there’s a chance of summoning a world-devouring demon by mistake. Yet others say neither is going to happen, that the sprites and elementals the ritual has been bringing into the world are all the ritual can do.
Who should the caveman listen to and why? For bonus points, try sticking to the frame of the analogy.
The shamans, because Pascal’s Wager.
Given that the process of scientific research has many AGI traits (opaque, self-improving, amoral as a whole), I wonder how rational it is for laypersons to trust it. I suspect the answer is, not very. Primarily because, just like an AGI improving itself, it doesn’t seem to be possible for anyone, not even insiders in the process, to actually guarantee the process will not, in its endless iterations, produce an X-risk. And indeed, said process is the only plausible source of manmade X-risk. This is basically Bostrom’s technological black ball thought experiment in the Vulnerable World Hypothesis. But Bostrom’s proposed solution is to double down, with his panopticon.
I have an intuition that such instances of doubling down are indications the scientific research process itself is misaligned.
Alignment/misalignment makes more sense for something agentic, pursuing a goal. To be safe, an agent must be aligned, but another way of making a possibly-misaligned possibly-agent safer involves making it less agentic. A dangerous non-agent is not misaligned; its danger must be understood in some other way that would actually explain what’s going on.
The thing is, though, there isn’t a dichotomy between agents and processes. Everything physical (except maybe the elementary particles) is a process in the final analysis, as Heraclitus claimed. Even actual individual persons, the paradigmatic examples of an agent, are also processes: the activity of the brain and body only ever stops at death. The appearance of people as monadic agents is just that, an appearance, and not actually real.
This might sound too much like philosophical woo-woo, but it does have a pragmatic consequence: since agents are a facade over what is actually a process, the question becomes how you actually tell which processes do or do not have goals. It’s not obvious that processes that can pass as agents are the only ones that have goals.
EDIT: Think about it like this: when a river floods and kills hundreds or thousands, was it misaligned? Talk of agents and alignment only makes sense in certain contexts, and only as a heuristic! And I think AI X-risk is a context in which talking in terms of agents and alignment obfuscates enough critical features of the subject that the discourse starts excluding genuine understanding.
EDIT 2: The above edit is mostly a different way of agreeing with you. I guess my original point is “The scientific research process is dangerous, and for the same reasons rogue superintelligence would be: opacity, constantly increasing capabilities, and difficulty of guaranteeing alignment”. I still disagree with you (and with my own example of the river, actually) that non-agents (processes) can’t be misaligned. All natural forces can be fairly characterized as misaligned, as they are indifferent to our values, and this does not make them agentic (that would be animism). In fact, I would say “a dangerous non-agent is not misaligned” is false, and “a dangerous non-agent is misaligned” is a tautology in most contexts (it depends on to whom it is dangerous).
I started an AI X-Risk awareness twitter account. Introducing @GoodVibesNoAI. It’s about collating reasons to believe civilization will collapse before it gets to spawn a rogue superintelligence that consumes all matter in the Laniakea supercluster. A good outcome, all things considered.
What do you think about it? Any particular people to follow? I’ve also considered doing a weekly roundup of the articles I post on it and turning that into a newsletter.
I guess the Centre for Applied Eschatology would be right up your alley.
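I’m curious, is “It’s about collating reasons to believe civilization will collapse before it gets to spawn a rogue superintelligence that consumes all matter in the Laniakea supercluster. A good outcome, all things considered.” an honest belief of yours, or satire? There are some arguments that unaligned AI systems might be morally valuable.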
Civilization collapsing is blatantly better than rogue superintelligence, as it’s plausibly a recoverable disaster, so yes, that is my honest belief. I don’t consider non-organics to be moral entities, since I also believe they’re not sentient. Yeah, I’m aware those views are contested, but then, what the hell isn’t when it comes to philosophy? There are philosophers who argue for post-intentionalism, the view that our words, language and thoughts aren’t actually about anything, for crying out loud.