You may be right about that. Still, I don’t see any better alternative. We’re apes with too much power already, and we’re getting more powerful by the minute. Even without AGI, there are plenty of ways to end humanity (e.g. bioweapons, nanobots, nuclear war, bio lab accidents …). Either we learn to overcome our ape-brain impulses and restrict ourselves, or we’ll kill ourselves. As long as we haven’t killed ourselves, I’ll push towards the first option.
I do! Aligned benevolent AI!
Well, yes, of course! Why didn’t I think of it myself? /s
Honestly, “aligned benevolent AI” is not a “better alternative” for the problem I’m writing about in this post, which is that we’ll be able to develop an AGI before we have solved alignment. I’m totally fine with someone building an aligned AGI (assuming that it is really aligned, not just seemingly aligned). The problem is, this is very hard to do, and timelines are likely very short.
There are at least two options for developing aligned AGI, in the context of this discussion:
1. Slow down capabilities and speed up alignment just enough that we solve alignment before developing AGI
   - e.g. the MTAIR project, in this paper, models the effect of a fire alarm for HLMI as “extra time” that speeds up safety research, leading to a higher chance that it succeeds before the timeline for HLMI
   - this seems intuitively more feasible, hence more likely
2. Stop capabilities altogether (this is what you’re recommending in the OP)
   - this seems intuitively far less feasible, hence ~unlikely (I interpret e.g. HarrisonDurland’s comment as elaborating on this intuition)
What I don’t yet understand is why you’re pushing for #2 over #1. You would probably be more persuasive if you addressed e.g. why my intuition that #1 is more feasible than #2 is wrong.
Edited to add: Matthijs Maas’ Strategic Perspectives on Transformative AI Governance: Introduction has this (oversimplified) mapping of strategic perspectives. I think you’d probably fall under (technical: pessimistic or very pessimistic; governance: very optimistic), while my sense is most LWers (me included) are either pessimistic or uncertain on both axes, so there’s that inferential gap to address in the OP.
I’m obviously all for “slowing down capabilities”. I’m not for “stopping capabilities altogether”, but for selecting which capabilities we want to develop, and which to avoid (e.g. strategic awareness). I’m totally for “solving alignment before AGI” if that’s possible.
I’m very pessimistic about technical alignment in the near term, but not “optimistic” about governance. “Death with dignity” is not really a strategy, though. If anything, my favorite strategy in the table is “improve competence, institutions, norms, trust, and tools, to set the stage for right decisions”: If we can create a common understanding that developing a misaligned AGI would be really stupid, maybe the people who have access to the necessary technology won’t do it, at least for a while.
The point of my post here is not to solve the whole problem. I just want to point out that the common “either AGI or bad future” framing is wrong.
Sure, I mostly agree. To repeat part of my earlier comment, you would probably be more persuasive if you addressed e.g. why my intuition that #1 is more feasible than #2 is wrong. In other words, I’m giving you feedback on how to make your post more persuasive to the LW audience. This sort of response (“Well, yes, of course! Why didn’t I think of it myself? /s”) doesn’t really persuade readers; bridging inferential gaps would.
Good point! Satirical reactions are not appropriate in comments; I apologize. However, I don’t think that arguing why alignment is difficult would fit into this post. I clearly stated this assumption in the introduction as a basis for my argument, assuming that LW readers were familiar with the problem. Here are some resources explaining why I don’t think we can solve alignment in the next 5-10 years: https://intelligence.org/2016/12/28/ai-alignment-why-its-hard-and-where-to-start/, https://aisafety.info?state=6172_, https://www.lesswrong.com/s/TLSzP4xP42PPBctgw/p/3gAccKDW6nRKFumpP