Refer to my disclaimer about the validity of the idea of humans having terminal values. In the context of human values, I think of “terminal values” as the ones directly formed by evolution and hardwired into our brains, and thus broadly shared. The apparent exceptions are rarish and highly associated with childhood neglect and brain damage.
“Broadly shared” is not a significant additional constraint on what I mean by “terminal value”; it’s a passing acknowledgement of the rare counterexamples.
If that’s your argument then we somewhat agree. I’m saying that the model you should use is the model that most efficiently pursues your goals, and (in response to your comment) that utility schemes which terminally value having specific models (and whose goals are therefore most efficiently pursued by using that terminally valued model rather than a more computationally efficient one) are not evidently common enough among humans for us to expect that caveat to apply to anyone who will read these comments.
Real-world examples of people who appear at first glance to value having specific models (e.g. religious people) are pretty sketchy: if we take those appearances at face value, then you can change someone’s terminal values with the argumentative equivalent of a single rusty musket ball and a rubber band. That defies the sort of behavior we’d want to see from whatever we’re defining as a “terminal value”, keeping in mind the differences between the way human value systems are structured and the way the value systems of hypothetical artificial intelligences are structured.
The argumentative strategy required to convince someone to ignore instrumentally unimportant details about the truth of reality looks more like “have a normal conversation with them” than “display a series of colorful flashes as a precursor to the biological equivalent of arbitrary code execution”, i.e. psychologically breaking them thoroughly enough to get them to do basically anything. Only something like the latter would be enough to seriously damage what I’m talking about when I say “terminal values” in the context of humans.
Refer to my disclaimer about the validity of the idea of humans having terminal values. In the context of human values, I think of “terminal values” as the ones directly formed by evolution and hardwired into our brains, and thus broadly shared. The apparent exceptions are rarish and highly associated with childhood neglect and brain damage.
The existence of places like LessWrong, philosophy departments, etc., indicates that people do have some sort of goal to understand things in general, aside from any nitpicking about what is a true terminal value.
If that’s your argument then we somewhat agree. I’m saying that the model you should use is the model that most efficiently pursues your goals,
Well, if my goal is the truth, I am going to want the model that corresponds the best, not the model that predicts most efficiently.
and (in response to your comment) that utility schemes which terminally value having specific models
I’ve already stated that I am not talking about confirming specific models.
The existence of places like LessWrong, philosophy departments, etc., indicates that people do have some sort of goal to understand things in general, aside from any nitpicking about what is a true terminal value.
I agree: lots of people (including me, of course) are learning because they want to, not as part of some instrumental plan to achieve their other goals. I think this is significant evidence that we do terminally value learning. However, the way that I personally have the most fun learning is not the way that is best for cultivating a perfect understanding of reality (nor for developing the model which is most instrumentally efficient, for that matter). This indicates that I don’t necessarily want to learn so that I can have the mental model that most accurately describes reality; I have fun learning for complicated reasons which I don’t expect to align with any short guiding principle.
Also, at least for now, basically all of the expected value I get from learning comes from my expectation of being able to leverage that knowledge. I have a lot more fun learning about e.g. history than about the things I actually spend my time on, but historical knowledge isn’t nearly as useful, so I’m not spending my time on it.
In retrospect, I should’ve said something more along the lines of “We value understanding in and of itself, but (at least for me, and at least for now) most of the value in our understanding is from its practical role in the advancement of our other goals.”
I’ve already stated that I am not talking about confirming specific models.
There’s been a mix-up here: what I mean by “specific” also includes “whichever model corresponds to reality the best”.