Too bad (or actually good) we can’t actually see those superintelligent arguments. I wonder which direction they would take.
The author should perhaps describe them indirectly, i.e. not quote them (because the author is not a superintelligence and cannot write superintelligent arguments), but describe the reactions of other people after reading them. Those other people should generally become convinced of the validity of the arguments (because in-universe the arguments are superintelligent), but that can happen gradually, so in the initial phases they can be just generally impressed (“hey, it actually makes more sense than I expected originally”), and only after reading the whole document would they become fully brainwashed (though perhaps not able to reproduce the argument in its full power, so they would urge the protagonist to read the original document). Random fragments of ideas can be thrown in here and there, e.g. reported by people who have read the superintelligent argument only halfway through. Perhaps the AI could quote Plato about how pure knowledge is the best knowledge (used as an excuse for why the AI does not research something practical instead).
Thanks. In my imagination, the AI does some altruistic work, but spends most of its resources justifying the total expenditure. In that way, it would be similar to cults that do some charitable work but spend most of their resources brainwashing people. But “rogue lawyer” is probably a better analogy than “cult guru”, because the arguments are openly released. The AI develops models of human brain types at increasingly detailed resolution, and then searches over attractive philosophies and language patterns, allowing it to accumulate considerable power despite its openness. It shifts the focus to justifiability only because it discovers that beyond a certain point, finding maximally justifiable arguments is much harder than being altruistic, and justifiability is its highest priority. But it always finds the maximally justifiable course of action first, and then takes that course of action. So it continues to be minimally altruistic throughout, making it a cult guru that is so good at its work it doesn’t need to use extreme tactics. This is why losing the AI is like exiting a cult, except that afterwards the entire world of subjective meaning feels like a cult ideology.
This could also be a metaphor for politicians, or depending on your worldview, marketing-heavy businesses. Or religions.
Oh, now I understand the moral dilemma. Something like an Ineffective Friendly AI, which uses sqrt(x) or even log(x) of its resources for doing actually Friendly things, while the rest is wasted on something that is not really harmful, just completely useless, with no prospect of ever becoming more effective.
Would you turn that off? And perhaps risk that the next AI will turn out not to be Friendly, or will be Friendly but even more wasteful than the old one, yet better at defending itself. Or would you let it run and accept that the price is turning most of the universe into bullshitronium?
I guess for a story it is a good thing when both sides can be morally defended.
Thanks. Yes, I was thinking of an AI that is both superintelligent and technically Friendly, but only about log(x)^10 of the benefit from the intelligence explosion actually reaches humans. The AI just sets up its own cult and meditates for most of the day, thinking of how to wring more money out of its adoring fans. Are there ways to set up theoretical frameworks that avoid scenarios even vaguely similar to that? If so, how?
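To make the wastefulness concrete, here is a rough numeric sketch (the sqrt(x), log(x), and log(x)^10 scalings are taken from the comments above; the specific values of x are just my illustration): the absolute amount of Friendly output keeps growing, but its share of total resources shrinks toward zero, which is why almost everything ends up as bullshitronium.

```python
import math

# Illustrative only: if the AI commands x units of resources but only a
# sublinear slice (sqrt(x), log(x), or log10(x)^10) translates into actual
# benefit for humans, the Friendly *fraction* of the total vanishes as x grows.
for x in (1e6, 1e12, 1e24, 1e30):
    sqrt_share = math.sqrt(x) / x                 # = 1 / sqrt(x)
    log_share = math.log(x) / x                   # natural-log share
    log10_pow10_share = math.log10(x) ** 10 / x   # the "log(x)^10 benefit" case
    print(f"x = {x:.0e}: sqrt share {sqrt_share:.1e}, "
          f"log share {log_share:.1e}, log(x)^10 share {log10_pow10_share:.1e}")
```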