Thanks. In my imagination, the AI does some altruistic work, but spends most of its resources justifying the total expenditure. In that way, it would be similar to cults that do some charitable work but spend most of their resources brainwashing people. Still, “rogue lawyer” is probably a better analogy than “cult guru”, because the arguments are openly released. The AI develops models of human brain types at increasingly fine resolution and then searches over attractive philosophies and language patterns, which lets it accumulate considerable power despite its openness. It shifts its focus to justifiability only because it discovers that, beyond a certain point, finding maximally justifiable arguments is much harder than being altruistic, and justifiability is its highest priority. But it always finds the maximally justifiable course of action first, and then takes that course of action. So it remains minimally altruistic throughout, making it a cult guru so good at its work that it doesn’t need extreme tactics. That is why losing the AI is like exiting a cult, except that afterwards the entire world of subjective meaning feels like a cult ideology.
This could also be a metaphor for politicians, or depending on your worldview, marketing-heavy businesses. Or religions.
Oh, now I understand the moral dilemma. Something like an Ineffective Friendly AI, which spends only sqrt(x) or even log(x) of its resources on actually Friendly things and wastes the rest on something that is not really harmful, just completely useless, with no prospect of ever becoming more effective.
Would you turn that off? And perhaps risk that the next AI will turn out not to be Friendly, or that it will be Friendly but even more wasteful than the old one, just better at defending itself. Or would you let it run and accept that the price is turning most of the universe into bullshitronium?
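A toy calculation (my own sketch, not from the thread; x here is just an arbitrary count of resource units, and I read log as the natural log) shows how fast the Friendly fraction collapses under those scalings:

```python
# Toy illustration: if the AI controls x units of resources but only sqrt(x)
# or log(x) of them go to genuinely Friendly work, the useful fraction
# shrinks toward zero as x grows. The values of x are arbitrary examples.
import math

for x in (1e3, 1e6, 1e12, 1e24):
    sqrt_share = math.sqrt(x) / x  # Friendly share under sqrt(x) scaling
    log_share = math.log(x) / x    # Friendly share under log(x) scaling
    print(f"x = {x:.0e}:  sqrt share = {sqrt_share:.1e},  log share = {log_share:.1e}")
```

Everything outside that vanishing share is, on this picture, bullshitronium.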
I guess for a story it is a good thing when both sides can be morally defended.
Thanks. Yes, I was thinking of an AI that is both superintelligent and technically Friendly, but about log(x)^10 of the benefit from the intelligence explosion is actually received by humans. The AI just sets up its own cult and meditates for most of the day, thinking of how to wring more money out of its adoring fans. Are there ways to set up theoretical frameworks that avoid scenarios vaguely similar to that? If so, how?
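A minimal sketch of that scaling (my own numbers, not from the discussion; reading log as base-10 and measuring both the benefit and x in the same arbitrary units):

```python
# Toy illustration: humans receive roughly log10(x)**10 out of a potential x.
# The absolute amount keeps growing, but the share of the intelligence
# explosion collapses once x gets astronomically large.
import math

for x in (1e24, 1e48, 1e96):
    received = math.log10(x) ** 10  # benefit actually reaching humans
    print(f"x = {x:.0e}:  humans get ~{received:.1e}  ({received / x:.1e} of the total)")
```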