I don’t know what you’re getting at. Is there a problem with the statement, “It’s not easy to overcome millions of years of brain evolution”?
You don’t want to “overcome” a lot of what millions of years of brain evolution have formed, only some things.
But how do we choose what we want to “overcome” and what not? I suppose it depends on the answer to the question: what constitutes winning? If rationality is about winning, and winning means achieving the goals given to us by evolution, satisfying the desires evolution has implemented in us, then rationality needs to be able to discern goals and desires from biases and fallacies.
If winning means satisfying our evolutionary template, all of our complex values, desires and goals, what is rationality doing when it helps us win? How does it measure success? If I give in to akrasia, which evolutionary desire did I fail to satisfy? If I procrastinate, by what measure does rationality judge that I am acting irrationally? What is the unit in which success is measured?
Hyperbolic discounting is called a bias. Yet biases are largely a result of our evolution, just like our desire for complexity. You read LessWrong and are told to change your mind, overcome bias, and disregard discounting as a biased desire that leads to a suboptimal outcome. The question we have to answer is why we don’t go a step further and disregard human nature altogether in favor of something that can value the possibilities of reality maximally. Where do we draw the line?
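To make the discounting point concrete, here is a minimal sketch (the amounts, delays, and discount parameters are invented for illustration) of why hyperbolic discounting gets singled out as a bias: unlike exponential discounting, it reverses its own preference as the sooner reward draws near.

```python
# Toy comparison of exponential vs. hyperbolic discounting.
# All amounts, delays, and parameters are made up for illustration.

def exponential(value, delay, rate=0.05):
    return value / (1 + rate) ** delay

def hyperbolic(value, delay, k=0.5):
    return value / (1 + k * delay)

for discount in (exponential, hyperbolic):
    for days_to_small in (10, 1):
        small = discount(100, days_to_small)      # $100 sooner
        large = discount(110, days_to_small + 1)  # $110 one day later
        choice = "wait for $110" if large > small else "take $100 now"
        print(f"{discount.__name__:12s} {days_to_small:2d} days out: {choice}")
```

With these numbers the exponential discounter prefers the larger, later reward at both distances, while the hyperbolic discounter switches to the smaller, sooner reward once it is one day away; that dynamic inconsistency is what earns it the label “bias.”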
If rationality is prescriptive and can say No to procrastination and Yes to donating money to the Singularity Institute for Artificial Intelligence, then it is already telling us to disregard most of our desires to realize its own idea of what we ought to do. Why then is it irrational to say No to evolutionary values and Yes to whatever maximizes what rationality is measuring?
ETA: For more see the discussion here.
Which parts of our evolved preferences we embrace seems to be partly a matter of taste—though most agree that there are problems with consuming too much chocolate gateau.
SIAI’s argument for donations could be reduced to “or else everybody dies.” Survival of self, offspring, and/or kin is very much an evolutionary value.
Yes, of course. What I wanted to ask about is why we don’t apply rationality to choosing our values as well. We already call it rational to achieve our values as effectively as possible. Yet we also use rationality to discern the utility of different values, and we choose to maximize those with the highest expected utility while disregarding others. We also use rationality to detect inconsistencies in our actions, thinking, and beliefs. Why don’t we apply this to values as well? As I said in the previous comment, we already do so to some extent, but where do we draw the line?

If utility is strongly correlated with expected pleasure, how would it not be rational to change our desires completely if there were a set of desires that could be satisfied more efficiently to yield a larger amount of pleasure? That is what I was trying to say by comparing procrastination to donating to the SIAI: we pick the SIAI over our evolutionary desire to rest because we pick the more important of two mutually exclusive goals. Vladimir Nesov wrote that we don’t want to overcome most of what evolution came up with, and I asked why that is the case. Why not pick the highest-order goal, maybe experiencing pleasure, and try to maximize that? Is it really rational to have complex goals if having narrow goals yields more utility?

The comment is now unnecessarily downvoted to −5, so I suppose something is wrong with it. Yet the reason I posted it was to figure out what is wrong, where I am confused. Sadly nobody commented, and I can’t infer what is wrong from mere downvotes in this case.
My $0.02: if I believed that I had a single highest-order goal, that is, something I would always want to maximize no matter what the external situation was, I would in fact endorse trying to maximize that.
But I don’t believe that I have such a thing, and I’ve seen no convincing evidence that anyone else does either.
Certainly I don’t believe that pleasure qualifies.
Incidentally, I also don’t believe that equating “procrastination” with “our evolutionary desire to rest” is even remotely correct.
I got that from here. For everything else, see this comment; I would love to hear your opinion. I am not sure what exactly it is that I am confused about. Thank you.
I might agree that we consciously convince ourselves that procrastination is a form of rest, as one of many ways of rationalizing procrastination. But not that we’ve evolved procrastination as a way of satisfying our desire to rest, which is what I understood you to be implying.
Then again, I mostly didn’t buy that post at all, so take my reaction for what it’s worth.
As for the terminal/instrumental goal thing… here, also, I’m probably the wrong guy to ask, as I don’t really buy into the standard LW position.
As I’ve said a few times (most coherently here), I’m not actually sold on the idea that terminal goals exist in the first place.
So, yeah, I would agree that goals change over time, or at least can change, and utility functions (insofar as such things exist) change, and that this whole idea of human values being a fixed fulcrum against which we can measure the motion of the universe isn’t quite right. We are part of the world we change, not some kind of transcendent Unmoved Mover that stands outside of it.
Which is not to say I oppose the enterprise of building optimizing agents in a way that preserves our “terminal values”: that’s the right direction to go in. It’s just that I expect the result of that enterprise will in fact be that we preserve our most stable and mutually-reinforcing “instrumental” values, and they will be somewhat modified by the process.
Growing up is like that sometimes… we become something we couldn’t have conceived of and wouldn’t have approved of, had we been consulted.
To put this in LW terms: I expect that what a sufficiently powerful seed AI extracts from an analysis of humanity’s coherent extrapolated volition will not be, technically speaking, a set of terminal values… rather, I expect it will be a set of particularly stable and mutually reinforcing instrumental values, which are the closest approximation human minds contain to terminal values.
And I don’t expect that implementing that CEV will be a one-time operation where we do it and we Win and nothing more needs to be done ever… rather, I expect that it will be a radical improvement in our environment, which we will become accustomed to, which will alter the balance of our values, which will cause us to identify new goals that we optimize for.
All that said, I don’t object to people making the simplifying assumption that their current targets are actually universal terminal points. For some people (and I’m often one of them!), simplicity is an important motivational factor, and the alternative is to just sit in a paralyzed puddle of inconceivable alternatives.
Changing terminal values is almost always negative-utility according to the original values, which are the ones you use to decide whether to switch or not. If you delete a goal to focus on some other goal, then the deleted goal won’t be fulfilled. While you might not care anymore after it’s deleted, you care at the moment you’re deciding whether to delete it or not, so you won’t do it.
Where rationality helps is with instrumental values, i.e. goals that we have only because we think they’ll further our other goals. For example, if I want to get through a television series because I think it’ll make me happy, but all the time sitting around actually makes me depressed, then it’s rational to give up that goal. On the other hand, if I want to eliminate poverty in Obscureistan, but I find out that achieving this won’t make me happy, that doesn’t make me change my goal at all.
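A toy formalization of the first paragraph’s point, with invented goals and weights: a proposed deletion of a goal is scored with the values the agent holds at the moment of deciding, so it comes out as a loss.

```python
# Invented example: an agent scoring the option of deleting one of its goals.
# The judging weights are the agent's *current* values, per the argument above.

current_values = {"feed_tribe": 1.0, "write_novel": 1.0}

def score(judging_values, goals_still_pursued):
    # Crude model: a goal only gets fulfilled if the agent still pursues it.
    return sum(weight for goal, weight in judging_values.items()
               if goal in goals_still_pursued)

keep_all   = score(current_values, {"feed_tribe", "write_novel"})
after_edit = score(current_values, {"write_novel"})  # "feed_tribe" deleted

print(keep_all, after_edit)  # 2.0 1.0 -> the edit loses utility by current lights
```

This deliberately ignores any resource tradeoff that might motivate dropping a goal in the first place; it only illustrates that the evaluation is done with the pre-modification values.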
I am aware of what it means to be rational.
But how do you know that this line of reasoning is not culturally induced, the result of abstract higher-order contemplation about rational conduct? My problem is that I perceive rationality to change and introduce terminal goals. The toolkit that is called ‘rationality’, the rules and heuristics developed to help us achieve our terminal goals, also alters and deletes them. A Stone Age hunter-gatherer seems to possess very different values than I do. If he learns about rationality and moral ontology, his values will be altered considerably. Rationality was meant to help him achieve his goals, e.g. become a better hunter. Rationality was designed to tell him what he ought to do (instrumental goals) to achieve what he wants to do (terminal goals). Yet what actually happens is that he is told he will learn what he ought to want. If an agent becomes more knowledgeable and smarter, this does not leave its goal-reward system intact unless that system is specifically designed to be stable. An agent who originally wanted to become a better hunter and feed his tribe would end up wanting to eliminate poverty in Obscureistan. The question is, how much of this new “wanting” is the result of using rationality to achieve terminal goals, and how much is a side effect of using rationality? How much is left of the original values versus the values induced by a feedback loop between the toolkit and its user?

Here I think it would be important to ask how humans assign utility, whether there exists some intrinsic property that makes agents assign more utility to some experiences and outcomes than to others. We have to discern what we actually want from what we think we ought to want. This might sound contradictory, but I don’t think it is. An agent facing the Prisoner’s Dilemma might originally tend to cooperate and only after learning about game theory decide to defect and gain a greater payoff (see the payoff sketch below). Was it rational for the agent to learn about game theory, in the sense that it helped the agent achieve its goal, or in the sense that it deleted one of its goals in exchange for a more “valuable” goal?

It seems to me that becoming more knowledgeable and smarter gradually alters our utility functions. But what are we approaching if rationality becomes a purpose in and of itself? If we can be biased, if our map of the territory can be distorted, why can’t we be wrong about what we value as well? And if that is possible, how can we discover better values? What rationality does is extrapolate our volition to calculate the expected utility of different outcomes. But this might distort or alter what we really value by installing new cognitive toolkits designed to achieve an equilibrium between us and other agents with the same toolkit. This is why I think it might be important to figure out what all high-utility goals have in common. Happiness is just an example here; I am not claiming that happiness is strongly correlated with utility, or that happiness is the highest-order goal. One might argue that we would choose a world state in which all sentient agents are maximally happy over one where all sentient agents achieved arbitrary goals but are on average not happy about it. But is this true? I don’t know. I am just saying that we might want to reconsider what we mean by “utility” and objectify its definition.
Otherwise, the claim that we don’t want to “overcome” a lot of what millions of years of brain evolution have formed is not even wrong: if we cannot establish some sort of stability of human goals, then what we want is a fact about our cultural and intellectual evolution more than a fact about us, about human nature. Are we using our tools or are the tools using us? Are we creating models or being modeled? Are we extrapolating our volition or following our extrapolations?
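For the Prisoner’s Dilemma example mentioned above, a minimal sketch with the standard textbook payoffs (the specific numbers are not from this thread):

```python
# One-shot Prisoner's Dilemma. Entries are (my payoff, their payoff).
payoffs = {
    ("cooperate", "cooperate"): (3, 3),
    ("cooperate", "defect"):    (0, 5),
    ("defect",    "cooperate"): (5, 0),
    ("defect",    "defect"):    (1, 1),
}

for their_move in ("cooperate", "defect"):
    best_reply = max(("cooperate", "defect"),
                     key=lambda my_move: payoffs[(my_move, their_move)][0])
    print(f"if they {their_move}: my best reply is {best_reply}")
```

Defection dominates either way; that dominance argument is the piece of game theory the agent in the example is imagined to pick up, and the question above is whether adopting it counts as better achieving a goal or as swapping one goal for another.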
I was led to this comment by your request for assistance here. You seem to be asking about the relationship between our intuitive values and our attempts to systematize those values rationally. To what extent should we let our intuitions guide the construction of our theories? To what extent should we allow our theories to reform and override our intuitions?
This is the very important and difficult issue of reflective equilibrium as expounded upon by Goodman and Rawls, to say nothing of Yudkowsky. I hope the links are helpful.
My own take on this is that there can be levels (degrees of stability) of equilibria. For example, the foundational idea of utility and expected utility maximization (as axiomatized by Savage or Aumann) strikes me as pretty solid. But when you add on additional superstructure (such as interpersonal comparison of utilities, or universalist, consequentialist ethics) it becomes more and more difficult to bring the axiomatic structure into equilibrium with the raw intuitions of everyone.
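For readers who want the bare-bones version of what “expected utility maximization” means in this context, here is a minimal sketch (the actions, probabilities, and utilities are invented for illustration):

```python
# Each action is a lottery: a list of (probability, utility) pairs.
# All numbers below are made up.

actions = {
    "work_on_project": [(0.6, 10), (0.4, -2)],
    "watch_tv_series": [(1.0, 1)],
}

def expected_utility(lottery):
    return sum(p * u for p, u in lottery)

scores = {name: expected_utility(lottery) for name, lottery in actions.items()}
best = max(scores, key=scores.get)

print(scores)          # {'work_on_project': 5.2, 'watch_tv_series': 1.0}
print("choose:", best) # the action with the highest expected utility
```

The axiomatizations mentioned above (Savage, Aumann) concern when preferences can be represented this way at all; the contested superstructure, such as comparing utilities across people, is not captured by a calculation like this.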
I think it would be important to ask how humans assign utility, whether there exists some intrinsic property that makes agents assign more utility to some experiences and outcomes than to others.

As I understand it, the equation looks something like: warmth + orgasms x 100 - thirst x 5 - hunger x 2 - pain x 10.
An agent who originally wanted to become a better hunter and feed his tribe would end up wanting to eliminate poverty in Obscureistan.

Only if they absorbed a bunch of memes about utilitarianism in the process.
Is there a reason you think that maximizing our own pleasure is our highest-order goal? Do you have an explanation for the fact that if you ask a hedonist to name their heroes, those heroes’ accomplishments are often not of the form “man, that guy could party,” but instead lie elsewhere?