I think the most fundamental objection to becoming cyborgs is that we don’t know how to say whether a person retains control over the cyborg they become a part of.
I agree that this is important. Are you more concerned about cyborgs than other human-in-the-loop systems? To me the whole point is figuring out how to make systems where the human remains fully in control (unlike, e.g. delegating to agents), and so answering this “how to say whether a person retains control” question seems critical to doing that successfully.
Indeed. I think having a clean, well-understood interface for human/AI interaction seems useful here. I recognize this is a big ask given the current norms and rules around AI development and deployment.
I think that’s an important objection, but I see it applying almost entirely on a personal level. On the strategic level, I actually buy that this kind of augmentation (i.e. with AI that is in some sense passive) is not an alignment risk (any more than any technology is). My worry is the “dual use technology” section.
I don’t understand what you’re getting at RE “personal level”.
Like, I may not want to become a cyborg if I stop being me, but that’s a separate concern from whether it’s bad for alignment (if the resulting cyborg is still aligned).
Oh I see. I was getting at the “it’s not aligned” bit.
Basically, it seems like if I become a cyborg without understanding what I’m doing, the result is one of:
I’m in control
The machine part is in control
Something in the middle
Only the first one seems likely to be sufficiently aligned.
I think “sufficiently” is doing a lot of work here. For example, are we talking about >99% chance that it kills <1% of humanity, or >50% chance that it kills <50% of humanity?
I also don’t think “something in the middle” is the right characterization; I think “something else” is more accurate. I think that the failure you’re pointing at will look less like a power struggle or akrasia and more like an emergent goal structure that wasn’t really present in either part.
I also think that “cyborg alignment” is in many ways a much more tractable problem than “AI alignment” (and in some ways even less tractable, because of pesky human psychology):
It’s a much more gradual problem; a misaligned cyborg (with no agentic AI components) is not directly capable of FOOM (Amdahl’s law was mentioned elsewhere in the comments as a limit on the usefulness of cyborgism, but it’s also a limit on the damage; see the sketch after this list)
It has been studied longer and has existed longer; all technologies have influenced human thought
It also may be an important paradigm to study (even if we don’t actively create tools for it) because it’s already happening.
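To make the Amdahl’s-law point a bit more concrete, here’s a minimal sketch (the fraction `p` of cognitive work handed off to the AI is a hypothetical parameter for illustration, not something from the discussion above): however fast the AI side gets, the part the human still has to do caps the overall speedup, and by the same token caps how much faster than an unaugmented human the cyborg can do damage.

```python
def amdahl_speedup(p: float, s: float) -> float:
    """Overall speedup when a fraction p of the work is sped up by factor s.

    Amdahl's law: speedup = 1 / ((1 - p) + p / s).
    """
    return 1.0 / ((1.0 - p) + p / s)

# Even if the AI component were effectively infinitely fast (huge s),
# the human-limited fraction (1 - p) caps the overall speedup at 1 / (1 - p).
for p in (0.5, 0.9, 0.99):
    print(f"p={p}: speedup ~ {amdahl_speedup(p, s=1e9):.1f}x")
# p=0.5: ~2x, p=0.9: ~10x, p=0.99: ~100x
```

So as long as the human remains a non-trivial part of the loop, the cyborg is bounded in a way a fully autonomous agent isn’t.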
Historically I have felt most completely myself when I was intertwining my thoughts with those of an AI. And the most I’ve ever had access to is AI Dungeon, not GPT-3 itself. I feel more myself with it, not less—as if it’s opening up parts of my own mind I didn’t know were there before. But that’s me.