I’m not sure the proposed modification helps: you seem to have expanded your criticism so far, in order to have them lead to the judgment you want to reach, that they cover too much.
I mean, sure, unpredictability is scarier (for a given level of power) than predictability. Agreed. But so what?
For example, my judgments will always be more unpredictable to people much stupider than I am than to people about as smart or smarter than I am. So the smarter I am, the scarier I am (again, given fixed power)… or, rather, the more people I am scary to… as long as I’m not actively devoting effort to alleviating those fears by, for example, publicly conforming to current fashions of thought. Agreed.
But what follows from that? That I should be less smart? That I should conform more? That I actually represent a danger to more people? I can’t see why I should believe any of those things.
You started out talking about what makes one dangerous; you have ended up talking about what makes people scared of one whether one is dangerous or not. They aren’t equivalent.
you seem to have expanded your criticism so far, in order to have them lead to the judgment you want to reach, that they cover too much.
Well, I hope I haven’t done that.
You started out talking about what makes one dangerous; you have ended up talking about what makes people scared of one whether one is dangerous or not.
Well, I certainly did that. I was trying to address the question more objectively, but it seems I failed. Let me try again from a more subjective, personal position.
If you and I share the same consequentialist values, but I know that you are more intelligent, I may well consider you unpredictable, but I won’t consider you dangerous. I will be confident that your judgments, in pursuit of our shared values, will be at least as good as my own. Your actions may surprise me, but I will usually be pleasantly surprised.
If you and I are of the same intelligence, but we have different consequentialist values (both being egoists, with disjoint egos, for example), then we can expect to disagree on many actions. Expecting the disagreement, we can defend ourselves, or even bargain our way to a Nash bargaining solution in which (to the extent that we can enforce our bargain) we can predict each other’s behavior to be behavior that promotes the compromise consequences.
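The Nash bargaining solution mentioned above can be illustrated with a toy calculation (the outcomes and payoff numbers here are invented for illustration): each candidate joint outcome is scored by the product of the two agents’ utility gains over the disagreement point, and the bargain is whichever outcome maximizes that product.

```python
# Toy Nash bargaining: pick the joint outcome maximizing the product of
# each agent's utility gain over the disagreement (no-bargain) point.
# Payoffs are illustrative, not drawn from the discussion above.

outcomes = {
    "cooperate": (4.0, 3.0),    # (utility to A, utility to B)
    "compromise": (3.0, 3.5),
    "A_favored": (5.0, 1.5),
    "B_favored": (1.0, 5.0),
}
disagreement = (1.0, 1.0)       # what each agent gets if bargaining fails

def nash_product(payoff, d=disagreement):
    gain_a = payoff[0] - d[0]
    gain_b = payoff[1] - d[1]
    # Outcomes worse than disagreement for either agent are never agreed to.
    if gain_a < 0 or gain_b < 0:
        return float("-inf")
    return gain_a * gain_b

best = max(outcomes, key=lambda k: nash_product(outcomes[k]))
print(best)  # "cooperate": (4-1)*(3-1) = 6 beats the alternatives
```

Enforcement matters here just as the passage says: the product rule only predicts behavior to the extent that the bargain, once struck, is actually kept.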
If, in addition to different values, we also have different beliefs, then bargaining is still possible, though we cannot expect to reach a Pareto optimal bargain. But the more our beliefs diverge, regarding consequences that concern us, the less good our bargains can be. In the limit, when the things that matter to us are particularly difficult to predict, and when we each have no idea what the other agent is predicting, bargaining simply becomes ineffective.
Eliezer has expressed his acceptance of the moral significance of the utility functions of people in the far distant future. Since he believes that those people outnumber us folk in the present, that seems to suggest that he would be willing to sacrifice our current utility in favor of their future utility. (For example, the positive value of saving a starving child today does not outweigh the negative consequences, for the multitudes of the future, of delaying the Singularity by one day.)
I, on the other hand, systematically discount the future. That, by itself, does not make Eliezer dangerous to me. We could strike a Nash bargain, after all. However, we inevitably also have different beliefs about consequences, and the divergence between our beliefs becomes greater the farther into the future we look. And consequences in the distant future are essentially all that matters to people like Eliezer—the present fades into insignificance by contrast. But, to people like me, the present and near future are essentially all that matter—the distant future discounts into insignificance.
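The contrast being drawn can be made concrete with a toy calculation (the discount factor and utility stream are invented for illustration): under exponential discounting, a constant stream of future utility has a convergent present value carried almost entirely by the near term, which is the sense in which the distant future “discounts into insignificance.”

```python
# Toy illustration of exponential discounting: with discount factor d < 1,
# a constant utility stream u per period has present value ~u/(1-d), and
# most of that value comes from the first few periods.
d = 0.9          # per-period discount factor (illustrative)
u = 1.0          # constant utility per period
horizon = 1000

pv = sum(u * d**t for t in range(horizon))
pv_first_50 = sum(u * d**t for t in range(50))

print(round(pv, 2))                # ~10.0, i.e. u/(1-d)
print(round(pv_first_50 / pv, 3))  # ~0.995: the near term carries almost everything
```

With no discounting (d = 1), by contrast, the same sum grows without bound and is dominated by however far out the horizon extends, which is the non-discounter’s situation described above.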
So, Eliezer and I care about different things. Eliezer has some ability to predict my actions because he knows I care about short-term consequences and he knows something about how I predict short-term consequences. But I have little ability to predict Eliezer’s actions, because I know he cares primarily about long-term consequences, and they are inherently much more unpredictable. I really have very little justification for modeling Eliezer (and any other act utilitarian who refuses to discount the future) as a rational agent.
I really have very little justification for modeling Eliezer (and any other act utilitarian who refuses to discount the future) as a rational agent.
I wish you would just pretend that they care about things a million times further into the future than you do.
The reason is that there are instrumental reasons to discount—the future disappears into a fog of uncertainty—and you can’t make decisions based on the value of things you can’t foresee.
The instrumental reasons fairly quickly dominate as you look further out—even when you don’t discount in your values. Reading your post, it seems as though you don’t “get” this, or don’t agree with it—or something.
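The point about instrumental discounting can be sketched with a toy model (the decay parameter is made up): even if an agent’s values weight every period equally, the correlation between predicted and actual consequences decays with horizon, so the expected-utility difference between actions fades geometrically, much as if the agent discounted.

```python
# Toy model of instrumental discounting: values weight every period equally,
# but predictive accuracy decays with horizon, so the expected utility
# *difference between actions* still fades the further out you look.
accuracy_decay = 0.8   # per-period retention of predictive accuracy (assumed)
raw_edge = 1.0         # utility edge of the better action, if foreseen perfectly

expected_edge = [raw_edge * accuracy_decay**t for t in range(30)]
print(round(expected_edge[0], 2))   # 1.0 : near-term choices matter fully
print(round(expected_edge[20], 4))  # ~0.0115 : far-out choices wash toward grey
```

This is the “uniform grey” of the next remark: in expectation, the unpredictable far future contributes roughly the same value whatever you do, so it stops driving decisions.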
Yes, the far-future is unpredictable—but in decision theory, that tends to make it a uniform grey—not an unpredictable black and white strobing pattern.
I wish you would just pretend that they care about things a million times further into the future than you do.
I don’t need to pretend. Modulo some mathematical details, it is the simple truth. And I don’t think there is anything irrational about having such preferences. It is just that, since I cannot tell whether or not what I do will make such people happy, I have no motive to pay any attention to their preferences.
Yes, the far-future is unpredictable—but in decision theory, that tends to make it a uniform grey—not an unpredictable black and white strobing pattern.
Yet, it seems that the people who care about the future do not agree with you on that. Bostrom, Yudkowsky, Nesov, et al. frequently invoke assessments of far-future consequences (sometimes in distant galaxies) in justifying their recommendations.
I wish you would just pretend that they care about things a million times further into the future than you do.
I don’t need to pretend. Modulo some mathematical details, it is the simple truth.
We have crossed wires here. What I meant is that I wish you would stop protesting about infinite utilities—and how non-discounters are not really even rational agents—and just model them as ordinary agents who discount a lot less than you do.
Objections about infinity strike me as irrelevant and uninteresting.
It is just that, since I cannot tell whether or not what I do will make such people happy, I have no motive to pay any attention to their preferences.
Is that your true objection? I expect you can figure out what would make these people happy easily enough most of the time—e.g. by asking them.
Yes, the far-future is unpredictable—but in decision theory, that tends to make it a uniform grey—not an unpredictable black and white strobing pattern.
Yet, it seems that the people who care about the future do not agree with you on that. Bostrom, Yudkowsky, Nesov, et al. frequently invoke assessments of far-future consequences (sometimes in distant galaxies) in justifying their recommendations.
Indeed. That is partly poetry, though (big numbers make things seem important), and partly because they think that the far future will be highly contingent on near future events.
The thing they are actually interested in influencing is mostly only a decade or so out. It does seem quite important—significant enough to reach back to us here anyway.
If what you are trying to understand is far enough away to be difficult to predict, and very important, then that might cause some oscillations. That is hardly a common situation, though.
Most of the time, organisms act as though they want to become ancestors. To do that, the best thing they can do is focus on having some grandkids. Expanding their circle of care out a few generations usually makes precious little difference to their actions. The far future is unforeseen, and usually can’t be directly influenced. It is usually not too relevant. Usually, you leave it to your kids to deal with.
It is just that, since I cannot tell whether or not what I do will make such people happy, I have no motive to pay any attention to their preferences.
Is that your true objection? I expect you can figure out what would make these people happy easily enough most of the time—e.g. by asking them.
That is a valid point. So, I am justified in treating them as rational agents to the extent that I can engage in trade with them. I just can’t enter into a long-term Nash bargain with them in which we jointly pledge to maximize some linear combination of our two utility functions in an unsupervised fashion. They can’t trust me to do what they want, and I can’t trust them to judge their own utility as bounded.
I think this is back to the point about infinities. The one I wish you would stop bringing up—and instead treat these folk as though they are discounting only a teeny, tiny bit.
Frankly, I generally find it hard to take these utilitarian types seriously in the first place. A “signalling” theory (holier-than-thou) explains the unusually high prevalence of utilitarianism among moral philosophers—and an “exploitation” theory explains its prevalence among those running charitable causes (utilitarianism-says-give-us-your-money). Those explanations do a good job of modelling the facts about utilitarianism—and are normally a lot more credible than the supplied justifications—IMHO.
I think this is back to the point about infinities.
Which suggests that we are failing to communicate. I am not surprised.
The one I wish you would stop bringing up—and instead treat these folk as though they are discounting only a teeny, tiny bit.
I do that! And I still discover that their utility functions are dominated by huge positive and negative utilities in the distant future, while mine are dominated by modest positive and negative utilities in the near future. They are still wrong even if they fudge it so that their math works.
I think this is back to the point about infinities.
Which suggests that we are failing to communicate. I am not surprised.
I went from your “I can’t trust them to judge their own utility as bounded” to your earlier “infinity” point. Possibly I am not trying very hard here, though...
My main issue was you apparently thinking that you couldn’t predict their desires in order to find mutually beneficial trades. I’m not really sure if this business about not being able to agree to maximise some shared function is a big deal for you.
Mm. OK, so you are talking about scaring sufficiently intelligent rationalists, not scaring the general public. Fair enough.
What you say makes sense as far as it goes, assuming some mechanism for reliable judgments about people’s actual bases for their decisions. (For example, believing their self-reports.)
But it seems the question that should concern you is not whether Eliezer bases his decisions on predictable things, but rather whether Eliezer’s decisions are themselves predictable.
Put a different way: by your own account, the actual long-term consequences don’t correlate reliably with Eliezer’s expectations about them… that’s what it means for those consequences to be inherently unpredictable. And his decisions are based on his expectations, of course, not on the actual future consequences. So it seems to follow that once you know Eliezer’s beliefs about the future, whether those beliefs are right or wrong is irrelevant to you: that just affects what actually happens in the future, which you systematically discount anyway.
So if Eliezer is consistent in his beliefs about the future, and his decisions are consistently grounded in those beliefs, I’m not sure what makes him any less predictable to me than you are.
Of course, his expectations might not be consistent. Or they might be consistent but beyond your ability to predict. Or his decisions might be more arbitrary than you suggest here. For that matter, he might be lying outright. I’m not saying you should necessarily trust him, or anyone else.
But those same concerns apply to everybody, whatever their professed value structure. I would say the same things about myself.
So it seems to follow that once you know Eliezer’s beliefs about the future, whether those beliefs are right or wrong is irrelevant to you: that just affects what actually happens in the future, which you systematically discount anyway.
But Eliezer’s beliefs about the future continue to change—as he gains new information and completes new deductions. And there is no way that he can practically keep me informed of his beliefs—neither he nor I would be willing to invest the time required for that communication. But Eliezer’s beliefs about the future impact his actions in the present, and those actions have consequences both in the near and distant future. From my point of view, therefore, his actions have essentially random effects on the only thing that matters to me—the near future.
Absolutely. But who isn’t that true of? At least Eliezer has extensively documented his putative beliefs at various points in time, which gives you some data points to extrapolate from.
I have no complaints regarding the amount of information about Eliezer’s beliefs that I have access to. My complaint is that Eliezer, and his fellow non-discounting act utilitarians, are morally driven by the huge differences in utility which they see as arising from events in the distant future—events which I consider morally irrelevant because I discount the future. No realistic amount of information about beliefs can alleviate this problem. The only fix is for them to start discounting. (I would have added “or for me to stop discounting” except that I still don’t know how to handle the infinities.)
Given that they predominantly care about things I don’t care about, and that I predominantly care about things they don’t worry about, we can only consider each other to be moral monsters.
You and I seem to be talking past each other now. It may be time to shut this conversation down.
Given that they predominantly care about things I don’t care about, and that I predominantly care about things they don’t worry about, we can only consider each other to be moral monsters.
Ethical egoists are surely used to this situation, though. The world is full of people who care about extremely different things from one another.
Yes. And if they both mostly care about modest-sized predictable things, then they can do some rational bargaining. Trouble arises when one or more of them has exquisitely fragile values—when they believe that switching a donation from one charity to another destroys galaxies.
I expect your decision algorithm will find a way to deal with people who won’t negotiate on some topics—or who behave in a manner you have a hard time predicting. Some trouble for you, maybe—but probably not THE END OF THE WORLD.
From my point of view, therefore, his actions have essentially random effects on the only thing that matters to me—the near future.
Looking at the last 10 years, there seems to be some highly predictable fundraising activity, and a lot of philosophising about the importance of machine morality.
I see some significant patterns there. It is not remotely like a stream of random events. So: what gives?