This is just my opinion, not particularly evidence-based: I don’t think that there are two different kinds of mind, or if there are, it’s not this issue that separates them. The wireheading scenario is very alien to our ancestral environment, so we may not have an “instinctive” preference for or against it. Rather, we have to extrapolate that preference from other things.
Two heuristics which might be relevant:
where “wanting” and “liking” conflict, it feels like “wanting” is broken (i.e. we’re making ourselves do things we don’t enjoy). So given the opportunity we might want to update what we “want”. This is pro-wireheading.
where we feel we are being manipulated, we want to fight that manipulation in case it’s against our own interests. Thinking about brain probes is a sort of manipulation-superstimulus, so this heuristic would be anti-wireheading.
I can very well believe that attitudes toward wireheading correlate with personality type, which is a weak form of your “two different minds” hypothesis.
Sorry for the ultra-speculative nature of this post.
Makes sense in terms of explaining the different intuition, yes, and is essentially how I think about it.
The second heuristic about manipulation, then, seems useful in practice (more agents will try to exploit us than satisfy us), but isn’t it much weaker, considering the actual wireheading scenario? The first heuristic actually addresses the conflict (although maybe the wrong way), but the second just ignores it.
I agree; the second heuristic doesn’t apply particularly well to this scenario. Some terminal values seem to come from a part of the brain which isn’t open to introspection, so I’d expect them to arise as a result of evolutionary kludges and random cultural influences rather than necessarily making any logical sense.
The thing is, once we have a value system that’s reasonably stable (i.e. what we want is the same as what we want to want) then we don’t want to change our preferences even if we can’t explain where they arise from.