What I’m doing here is not just restating a criticism; I am also stating, in a very abbreviated form, an alternative.
Paul asked what my problem is with the scenario. Well, originally I thought he was proposing that in real life, we should rise to the challenge of Friendly AI by literally uploading the researchers. Then in your reply, you indicated that this scenario might also just function as a thought experiment, a first approximation to a presently unknown process that will reliably result in Friendly AI.
If someone is seriously suggesting that the path to FAI involves uploading the researchers, I say that they are severely underestimating the difficulty of achieving successful whole-brain emulation, and that they will be throwing away their precious time on the wrong problem (WBE rather than FAI) when they should be tackling FAI directly. A lot of the time, I think people imagine that in such a scenario even WBE isn’t achieved by human effort; it’s just figured out by a not-yet-friendly, general-problem-solving proto-AI. We think we have an idea of how to make AIs that produce causal models, so we just get this proto-AI to make a causal model of someone, and abracadabra, we have the upload we wanted.
What I find pernicious about this mode of thought is that certain problems—the details of how you “extrapolate the values” of a particular thinking system, the details of how a whole-brain emulation is produced—are entirely avoided; it is assumed that we will have access to human simulations smart enough to solve the first problem, and causal modellers powerful enough to solve the second problem (and thereby provide the human simulations that will be used to solve the first problem). In effect, we assume the existence of ultrapowerful functions makeCausalModel() and askAgentToSolveThisProblem(), and then the algorithm for solving FAI is something like
askAgentToSolveThisProblem(makeCausalModel(researcherInstance), problemOfFAI);
that is, one asks the agent produced as a causal model of one of the researchers to “solve this problem”.
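Spelled out slightly more fully, as a purely illustrative Python sketch using the same hypothetical function names, the emptiness of the recipe becomes visible: every hard problem is hidden inside an unimplemented stub.

# A minimal sketch, keeping the hypothetical names from the pseudocode above.
# The point is that all of the actual difficulty lives inside the stubs.

def makeCausalModel(researcherInstance):
    # Stands in for an ultrapowerful causal modeller: given data about a
    # researcher, return an agent-level simulation of them. Nothing here
    # says how whole-brain emulation is actually achieved.
    raise NotImplementedError("all of WBE is hidden in this function")

def askAgentToSolveThisProblem(agent, problem):
    # Stands in for the other ultrapowerful function: hand an arbitrary
    # problem to a simulated agent and wait for its answer. Nothing here
    # says how the agent is run, supervised, or trusted.
    raise NotImplementedError("all of value extrapolation is hidden in this function")

# The "algorithm for solving FAI", restated:
# friendlyAIDesign = askAgentToSolveThisProblem(makeCausalModel(researcherInstance), problemOfFAI)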
It’s fun to write singularity-level pseudocode, and perhaps it will be scary when there are proto-AIs advanced enough to accept an instruction like that and start looking through their heuristics in search of a way to implement it. But I don’t consider it a serious engagement with the problem of FAI. Solving the problem of FAI is more akin to solving one of the Millennium Problems: it’s a big intellectual task which may be the work of a whole generation in several disciplines. But trying to achieve it by getting uploads to do it is asking for a post-singularity technology to help you achieve a singularity. The way to make progress is to directly tackle the details of neuroscience and decision theory.
Regarding the other rationale, according to which we are not literally setting out to upload the researchers before they tackle the problem, but are just telling an imaginary AI to act as those uploaded researchers would ultimately advise it to act, after their 500 subjective years in cyberspace… something like Cyc might be able to parse that command, and Cyc is a present reality where uploads are not, so there’s a fraction more realism here. But presumably the AI’s diagnostics would eventually tell you something like “need for computing substrate 10^9 more powerful than current processor detected”, which you could have figured out for yourself.
I see it as a little out of character for me to go bashing blue-sky speculation and flights of theoretical ingenuity. But as things stand, the attempt to formulate the perfectly safeguarded wish about who to upload, what to ask of them, and how to curate them in silico seems to dominate how people are approaching the problem of FAI, and that’s ridiculous and unnecessary. There’s plenty to think about and to work on if one sets out to directly tackle the problem of value extrapolation. This “direct” approach ought to define the mainstream of FAI research, not the indirect approach of “get the uploads to solve the problem”.
something like Cyc might be able to parse that command
Actually, we might be able to explain that command to a math-only AI pretty soon. For example, in Paul’s “formal instructions” approach, you point the AI at a long bitstring generated by a human, and ask the AI to find the most likely program that generated that bitstring under the universal prior. As a result you get something functionally similar to an upload. There are many potential problems with this approach (e.g. how do you separate the human from the rest of the world?), but they seem like the sort of problems you can solve with clever tricks, not insurmountable obstacles. And there might be other ways to use math-only AIs to boost uploading; we’ve only been thinking about this for several months.
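To make that concrete, here is a toy sketch of the idea as I understand it (my own illustration, not Paul’s actual construction; the function names are made up). It restricts the candidate “programs” to a tiny enumerable class and weights each by 2^-length as a crude stand-in for the universal prior; a real version would enumerate programs for a universal machine and is uncomputable in general.

# Toy illustration of "find the most likely program that generated the bitstring
# under the universal prior". The program class is deliberately tiny (programs of
# the form "repeat this pattern"), each weighted by 2^-description_length.

def make_repeater(pattern):
    # Program: repeat `pattern` and truncate to the requested length.
    return lambda length: (pattern * length)[:length]

def candidate_programs(max_pattern_len):
    # Yield (description length in bits, pattern, program) for every bit pattern
    # up to max_pattern_len bits. Real Solomonoff induction would enumerate all
    # programs for a universal machine instead.
    for n in range(1, max_pattern_len + 1):
        for i in range(2 ** n):
            pattern = format(i, "0" + str(n) + "b")
            yield n + 2, pattern, make_repeater(pattern)  # n bits of pattern + overhead

def most_likely_program(bitstring, max_pattern_len=8):
    # Return (prior weight, pattern) for the highest-prior candidate that
    # reproduces the bitstring exactly, or None if nothing in this class fits.
    best = None
    for desc_len, pattern, program in candidate_programs(max_pattern_len):
        if program(len(bitstring)) == bitstring:
            prior = 2.0 ** (-desc_len)  # shorter description, higher prior
            if best is None or prior > best[0]:
                best = (prior, pattern)
    return best

# Example: a "human-generated" bitstring that happens to be periodic.
print(most_likely_program("011011011011"))  # -> (0.03125, '011')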
But presumably the AI’s diagnostics would eventually tell you something like “need for computing substrate 10^9 more powerful than current processor detected”, that you could have figured out for yourself.
A probabilistic answer will suffice, I think, and it doesn’t seem to require that much computing power. It’s suspicious that you can make educated guesses about what you’re going to think tomorrow, but a strong AI somehow cannot. I’d expect a strong AI to be better at answering math questions than a human.
ETA: if solving the object-level problem is a more promising approach than bootstrapping through the meta-problem, then I certainly want to believe that is the case. Right now I feel we only have a viable attack on the meta-problem. If you figure out a viable attack on the object-level problem, be sure to let us know!