I agree that fictional/cultural evidence is important for how people generalise their innate responses to new stimuli. Specifically, I think something similar to Steven Byrnes’ proxy matching is going on.
The idea is that we have certain hardwired instincts, such as fear of death, that are triggered in specific scenarios, and we also independently learn a general world-model via unsupervised learning, which forms an independent and potentially un-emotive concept of death. We then associate our instinctive reactions with this concept, so that eventually those reactions generalise to other stimuli that evoke it, including stimuli that were not present in the ancestral environment and for which we have no hardwired responses.
Fiction, cultural knowledge, etc. are super important for shaping our unsupervised concept of death, since they are the training data! The limits of this generalisation can also be seen, interestingly, in cases where people disagree a lot. In the classic case of a teleporter that scrambles your atoms at location X only to recreate an exact copy of you at location Y, people have very different instinctive reactions to whether this counts as ‘death’. That judgement ultimately depends on their world-model concept and not on any hardwired reaction, since there are no teleporters in the ancestral environment, or now.
I suspect a process like this is also what generates ‘human values’ and will be writing something up on this shortly.
I do want to note that this process could also hijack instrumental convergence in order to achieve alignment.