I do not think that literally any mental model of a person is a person, though I do draw the line further than you.
What are your reasons for thinking that mental models are closer to Markov models than tulpas? My reason for leaning toward the latter side is my own experience writing, where I found it easy to create mental models of characters who behaved coherently and with whom I could have long conversations on a level above even GPT-4, let alone Markov models.
Another piece of evidence is this study. I haven’t done any actual digging to see if the methodology is any good, all I did was see the given statistic, but it is a much higher percentage than even I would have predicted before seeing it, and I already believed everything I wrote in this post!
Though I should be clear that whether or not a mental model is a person depends on the level of detail, and surely there are a lot that are not detailed enough to qualify. I just also think that there are a lot that do have enough detail, especially among writers.
That said, a lot of the reasons humans want to continue their thread of experience probably don’t apply to most tulpas. When a human dies, the substrate they were running on stops functioning, all their memories are lost, and they lose their ability to steer the world towards states they prefer; whereas if a tulpa “dies,” its memories are retained and its substrate remains intact, though I think it still loses its ability to steer the world towards its preferred states.
I find it interesting that multiple people have brought up “memories aren’t lost” as part of why it’s less bad for mental models or tulpas to die, since I personally don’t care if my memories live on after I die and would not consider that to be even close to true immortality.
What are your reasons for thinking that mental models are closer to Markov models than tulpas?
I think this may just be a case of the typical mind fallacy: I don’t model people in that level of detail in practice and I’m not even sure I’m capable of doing so. I can make predictions about “the kind of thing a person might say” based on what they’ve said before, but those predictions are more at the level of turns-of-phrase and favored topics of conversation—definitely nothing like “long conversations on a level above GPT-4”.
The “why people value remaining alive” bit might also be a typical mind fallacy thing. I mostly think about personal identity in terms of memories + preferences.
I do agree that my memories alone living on after my body dies would not be close to immortality to me. However, if someone were to train a multimodal ML model that can produce actions in the world indistinguishable from the actions I produce (or even “distinguishable but very very close”), I would consider that to be most of the way to effectively being immortal, assuming that model were actually run and had the ability to steer the world towards states which it prefers. Conversely, I’d consider it effectively-death to be locked in a box where I couldn’t affect the state of the outside world and would never be able to exit the box. The scenario “my knowledge persists and can be used by people who share my values” would be worse, to me, than remaining alive but better than death without preserving my knowledge for people who share my values (and by “share my values” I basically just mean “are not actively trying to do things that I disprefer specifically because I disprefer them”).
I wouldn’t quite say it’s a typical mind fallacy, because I am not assuming that everyone is like me. I’m just also not assuming that everyone is different from me, and using heuristics to support my inference that it’s probably not too uncommon, such as reports by authors of their characters surprising them. Another small factor in my inference is the fact that I don’t know how I’d write good fiction without making mental models that qualified as people, though admittedly I have very high standards with respect to characterization in fiction.
(I am aware that I am not consistent about which phrase I use to describe just how common it is for mental models to qualify as people. This is because I don’t actually know how common it is; I only have inferences based on the evidence I already gave to go on.)
The rest of your post is interesting and I think I agree with it, though we’ve digressed from the original subject on that part.
Thanks for your replies.