Formal logic does seem very powerful, yet incomplete. Would you be willing to create an AI with such a limited understanding of math or morality (assuming we can formalize an understanding of morality on par with math), given that it could well obtain supervisory power over humanity? One might justify it by arguing that it’s better than the alternative of trying to achieve and capture a fuller understanding, which would involve further delay and risk. See, for example, Tim Freeman’s argument along these lines, or my own.
Another alternative is to build an upload-based FAI instead, like Stuart Armstrong’s recent proposal. That is, use uploads as components in a larger system, with lots of safety checks. In a way, Eliezer’s FAI ideas can also be seen as heavily upload-based, since CEV can be interpreted (as you did before) as uploads with safety checks. (So the question I’m asking can be phrased as: instead of just punting normative ethics to CEV, why not punt all of meta-math, decision theory, meta-ethics, etc., to a CEV-like construct?)
Of course you’re probably just as unsure of these issues as I am, but I’m curious what your current thoughts are.
Humans are also incomplete in this sense. We already have no way of capturing the whole problem statement. The goal is to capture it as well as possible using some reflective trick of looking at our own brains or behavior, which is probably way better than what an upload singleton that doesn’t build a FAI is capable of.
If there are uploads, they could be handed the task of solving the problem of FAI in the same sense in which we try to, but this doesn’t get us any closer to the solution. There should probably be a charity dedicated to designing upload-based singletons as a kind of high-impact applied normative ethics effort (and SIAI might want to spawn one, since rational thinking about morality is important for this task; we don’t want fatalistic acceptance of a possible Malthusian dystopia or unchecked moral drift), but this is not the same problem as FAI.
Humans are at least capable of making some philosophical progress, and until we solve meta-philosophy, no de novo AI is. Assuming that we don’t solve meta-philosophy first, any de novo AIs we build will be more incomplete than humans. Do you agree?
If there are uploads, they could be handed the task of solving the problem of FAI in the same sense in which we try to, but this doesn’t get us any closer to the solution.
It gets closer to the solution in the sense that there is no longer a time pressure, since it’s easier for an upload-singleton to ensure their own value stability, and they don’t have to worry about people building uFAIs and other existential risks while they work on FAI. They can afford to try harder to get to the right solution than we can.
It gets closer to the solution in the sense that there is no longer a time pressure, since it’s easier for an upload-singleton to ensure their own value stability, and they don’t have to worry about people building uFAIs and other existential risks while they work on FAI. They can afford to try harder to get to the right solution than we can.
There is time pressure from existential risk (and also astronomical waste). Just as in the FAI vs. AGI race, we would have a race between FAI-building and AGI-building uploads (in the sense of “who runs first”, but also a literal race constrained by speed and cost). And fast-running uploads pose other risks as well: for example, they could form an unfriendly singleton without even solving AGI, or build runaway nanotech.
(Planning to ensure that a prepared upload FAI team runs before a singleton of any other nature can prevent it is an important contingency; someone should get on that in the coming decades, and better metaethical theory and rationality education can help with that task.)
I should have made myself clearer. What I meant was that, assuming an organization interested in building FAI can first achieve an upload-singleton, it won’t be facing competition from other uploads (since that’s what “singleton” means). It will face significantly less time pressure than a similar organization trying to build FAI directly. (Delay will still cause astronomical waste as physical resources fall away behind event horizons and the like, but that seems negligible compared to the existential risks we face now.)
What I meant was that, assuming an organization interested in building FAI can first achieve an upload-singleton, it won’t be facing competition from other uploads.
But this assumption is rather unlikely/difficult to implement, so in a situation where we count on it, we’ve already lost a large portion of the future. Also, this course of action (unlikely to succeed as it is in any case) benefits significantly from massive funding to buy computational resources, which makes it a race. The other alternative, educating people in a way that increases the chances of a positive upload-driven outcome, is also a race: to develop a better understanding of metaethics and rationality, and to educate more people more effectively.
Humans are at least capable of making some philosophical progress, and until we solve meta-philosophy, no de novo AI is.
Philosophical progress is just a special kind of physical action that we can perform, valuable for abstract reasons that feed into what constitutes our values. I don’t see how this feature is fundamentally different from pointing to any other complicated aspect of human values and saying that AI must be able to make that distinction or destroy all value with its mining claws. Of course it must.