if we encountered an agent who wanted to maximise paperclips today, we wouldn’t think, “wow, how incomprehensibly alien”
Agreed, as far as it goes. Hell, humans are demonstrably capable of encountering Eliza programs without thinking “wow, how incomprehensibly alien”.
Mind you, we’re mistaken: Eliza programs are incomprehensibly alien; we haven’t the first clue what it feels like to be one, supposing it even feels like anything at all. But that doesn’t stop us from thinking otherwise.
but, “aha, autism spectrum disorder”.
Sure, that’s one thing we might think instead. Agreed.
we’re assuming a hypothetical axis of (un)clippiness whose (dis)valuable nature is supposedly orthogonal to the pleasure-pain axis. But what grounds have we for believing such a qualia-space could exist?
(shrug) I’m content to start off by saying that any “axis of (dis)value,” whatever that is, which is capable of motivating behavior is “non-orthogonal,” whatever that means in this context, to “the pleasure-pain axis,” whatever that is.
Before going much further, though, I’d want some confidence that we were able to identify an observed system as being (or at least being reliably related to) an axis of (dis)value, and that we were able to determine, upon encountering such a thing, whether it (or the axis to which it was related) was orthogonal to the pleasure-pain axis or not.
I don’t currently have any grounds for such confidence, and I doubt anyone else does either. If you think you do, I’d like to understand how you would go about making such determinations about an observed system.