1) Oh, sorry, what I meant was: the generals in Country A want their AGI to help them “win the war”, even if it involves killing people in Country B + innocent bystanders. And vice-versa for Country B. And then, between the efforts of both AGIs, the humans are all dead. But nothing here was either an “AGI accident” (unintended-by-the-designers behavior) or “AGI misuse” by my definitions.
But anyway, yes I can imagine situations where it’s unclear whether “the AGI does things specifically intended by its designers”. That’s why I said “pretty solid” and not “rock solid” :) I think we probably disagree about whether these situations are the main thing we should be talking about, versus edge-cases we can put aside most of the time. From my perspective, they’re edge-cases. For example, the scenarios where a power-seeking AGI kills everyone are clearly on the “unintended” side of the (imperfect) dichotomy. But I guess it’s fine that other people are focused on different problems from me, and that “intent-alignment is poorly defined” may be a more central consideration for them. ¯\_(ツ)_/¯
3) I like your “note on terminology” post. But I also think of myself as subscribing to “the conventional framing of AI alignment”. I’m kinda confused that you see the former as running counter to the latter.
2) From our current position, I think “never ever create AGI” is a significantly easier thing to coordinate around than “don’t build AGI until/unless we can do it safely”. I’m not very worried that we will coordinate too successfully and never build AGI and thus squander the cosmic endowment…
If you’re working on that, then I wish you luck! It does seem maybe feasible to buy some time. It doesn’t seem feasible to put off AGI forever. (But I’m not an expert.) It seems you agree.
I think creating such a manual is an incredibly ambitious goal, and I think more people in this community should aim for more moderate goals.
Obviously the manual will not be written by one person, and obviously some parts of the manual will not be written until the endgame, where we know more about AGI than we do today. But we can still try to make as much progress on the manual as we can, right?
The post you linked says “alignment is not enough”, which I see as obviously true, but that post doesn’t say “alignment is not necessary”. So, we still need that manual, right?
Delaying AGI forever would obviate the need for a manual, but it sounds like you’re only hoping to buy time—in which case, at some point we’re still gonna need that manual, right?
It’s possible that some ways to build AGI are safer than others. Having that information would be very useful, because then we can try to coordinate around never building AGI in the dangerous ways. And that’s exactly the kind of information that one would find in the manual, right?