This is an example of what EY is talking about, I think. As far as I can tell, all the obvious things one would do to reduce s-risk via increasing x-risk are the sort of supervillain schemes that are more likely to increase s-risk than decrease it once secondary effects, unintended consequences, etc. are taken into account. This is partly why I put the “with dignity” qualifier in. (The other reason is that I’m not a utilitarian and don’t think our decision about whether to do supervillain schemes should come down to whether we think the astronomical long-term consequences are slightly more likely to be positive than negative.)
Suppose, for example, that you’re going to try to build an AGI anyway. You could just not try to train it to care about human values, hoping that it would destroy the world, rather than creating some kind of crazy mind-control dystopia.
I submit that, if your model of the universe is that AGI will, by default, be a huge x-risk and/or a huge s-risk, then the “supervillain” step in that process would be deciding to build it in the first place, and not necessarily the decision to skip trying to “align” it. You lost your dignity at the first step, and won’t lose any more at the second.
Also, I kind of hate to say it, but sometimes the stuff about “secondary effects and unintended consequences” sounds more like “I’m looking for reasons not to break widely-loved deontological rules, regardless of my professed ethical system, because I am uncomfortable with breaking those rules” than like actual caution. It’s very easy to stop looking for more effects in either direction when you reach the conclusion you want.
I mean, yes, those deontological rules are useful time-tested heuristics. Yes, a lot of the time the likely consequences of violating them will be bad in clearly foreseeable ways. Yes, you are imperfect and should also be very, very nervous about consequences you do not foresee. But all of that can also act as convenient cover for switching from being an actual utilitarian to being an actual deontologist, without ever saying as much.
Personally, I’m neither. And I also don’t believe that intelligence, in any actually achievable quantity, is a magic wand that automatically lets you either destroy the world or take over and torture everybody. And I very much doubt that ML-as-presently-practiced, without serious structural innovations and running on physically realizable computers, will get all that smart anyway. So I don’t really have an incentive to get all supervillainy to begin with. And I wouldn’t be good at it anyhow.
… but if faced with a choice between a certainty of destroying the world, and a certainty of every living being being tortured for eternity, even I would go with the “destroy” option.
I think we are on the same page here. I would recommend not creating AGI at all in that situation, but I agree that creating a completely unaligned one is better than creating an s-risky one. https://arbital.com/p/hyperexistential_separation/