No independence or misalignment or even uncertainty of goals can exist in such a picture, and I’ll pre-commit to finding the weakness that brings the whole thing down, just for pure orneriness.
Really. Let’s paint a picture. Let’s imagine a superintelligent AI. The superintelligence has a goal, implicitly defined in the form of a function that takes in the whole quantum wavefunction of the universe and outputs a number. Whether a particular action is good or bad depends on the answers to many factual questions, some of which it is unsure about. When the AI only has a rough idea that cows exist, it is implicitly considering a vast space of possible arrangements of atoms that might comprise cows. The AI needs to find out quite a lot of specific facts about cow neurochemistry before it can determine whether cows have any moral value. And maybe it needs to consider not just the cow’s neurochemistry, but what every intelligent being in the universe would think, if hypothetically they were asked about the cow. Of course, the AI can’t compute this directly, so it is in a state of logical uncertainty as well as physical uncertainty.
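To make that concrete, here is a toy sketch (every name and number is made up for illustration; a real “world state” would be a full description of the universe, not a dictionary). Physical uncertainty means the AI scores things by averaging its utility function over the world configurations it still considers possible, so an unresolved factual question like “do cows matter?” shows up as a split across hypotheses rather than a settled input.

```python
def expected_utility(world_hypotheses, utility):
    """Physical uncertainty: average the utility function over a probability
    distribution of candidate world configurations."""
    return sum(p * utility(world) for p, world in world_hypotheses)

def toy_utility(world):
    # Toy stand-in for the true (uncomputable) function: whether cows have
    # moral value hinges on a fact the AI has not yet pinned down.
    return 1.0 if world["cows_are_sentient"] else 0.1

hypotheses = [
    (0.6, {"cows_are_sentient": True}),
    (0.4, {"cows_are_sentient": False}),
]
print(expected_utility(hypotheses, toy_utility))  # 0.6*1.0 + 0.4*0.1 = 0.64
```

Logical uncertainty is the further problem that even `toy_utility` is, in the real picture, too expensive to evaluate exactly, so the AI has to work with bounded approximations of its own goal.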
The AI supports a utopia full of humans. Those humans have a huge range of different values. Some of them seem to mainly value making art all day. Some are utilitarian. Some follow virtue ethics. Some pursue personal hedonism with wireheading. The population is possibly quite neurodiverse compared to current humanity, except that the AI prevents anyone actively evil from being born.
Note that this prevents improvements as much as it prevents degradation.
If you can actually specify any way, however indirect and meta, to separate improvements from degradation, you can add that to your utility function.
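As a toy illustration of what “add that to your utility function” could look like (the names and the additive form here are my own hypothetical choices, not a claim about how it would actually be done):

```python
def combined_utility(world, base_utility, improvement_score, weight=1.0):
    """base_utility: the original goal.
    improvement_score: whatever indirect or meta criterion you managed to
    specify that separates improvements from degradations, as a number."""
    return base_utility(world) + weight * improvement_score(world)
```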
I can’t follow your example. Does the AI have a goal in terms of the quantum wavefunction of the universe, or a goal in terms of abstractions like “cow neurochemistry”? But either way, is this utopia full of non-aligned, but not “actively evil” humans just another modeled and controlled part of the wavefunction, or are they agents with goals of their own (and if so, how does the AI aggregate those into its own)?
And more importantly for the post, what does any of that have to do with non-causal-path coordination?
The AI has a particular Python program which, if it were given the full quantum wavefunction and unlimited compute, would output a number. There are subroutines in that program that could reasonably be described as looking at “cow neurochemistry”. The AI’s goals may involve such abstractions, but only if its utility function spells out how such a goal is built up out of quarks. Or it may be using totally different abstractions, or no abstractions at all, yet still be looking at something we would recognize as “cow neurochemistry”.
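A hypothetical sketch of that program structure, with placeholder names and bodies of my own invention: the top-level goal is defined only over the raw physical state, and “cow neurochemistry” appears only as a way an outside observer might describe what one of the subroutines is doing.

```python
def locate_cow_like_structures(raw_state):
    """Scan the low-level state for arrangements of atoms in the (vast) class
    we would call 'cows'. Placeholder body."""
    return []

def neurochemical_valence(cow_structure):
    """Estimate, from the low-level description alone, the quantity the
    top-level function cares about. Placeholder body."""
    return 0.0

def utility(raw_state):
    # The goal itself: a number computed from the raw state. "Cow
    # neurochemistry" is never a primitive in the goal, only emergent
    # structure inside the subroutines.
    return sum(neurochemical_valence(c)
               for c in locate_cow_like_structures(raw_state))
```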
But either way, is this utopia full of non-aligned, but not “actively evil” humans just another modeled and controlled part of the wavefunction, or are they agents with goals of their own
Of course they are modeled, and somewhat controlled. And of course they are real agents with goals of their own. Various people are trying to model and control you now. Sure, the models and control are crude compared to what an AI would have, but that doesn’t stop you from being real.
This doesn’t have that much to do with far coordination. I was disagreeing with your view that “locked in goals” implies a drab, chained-up, “ant-like” dystopia.