It’s always better for everyone if the loser surrenders before the fight begins. And since surrendering saves the winner some resources, the surrendering loser can get a corresponding bonus. If there is a plan that gets better results than fighting, as a rule of thumb you should expect AGIs to do no worse than that plan allows (even if you have no idea how they could coordinate to follow it).
I would like to believe that you’re right.
But what if the two AGIs were a literal paperclip maximizer and a literal staple maximizer? Suppose that the paperclip maximizer controlled 70% of the resources and calculated that it had a 90% chance of winning a fight. Then the paperclip maximizer would maximize the expected number of paperclips by initiating a fight: a 90% chance of everything beats a guaranteed 70%, at least if the fight itself doesn’t destroy too many resources.
Now, obviously I don’t believe that we’ll see a literal paperclip maximizer or a literal staple maximizer, but do we have any reason to believe that the AGIs that arose in practice would act differently? Or that trading would systematically produce higher expected value than fighting?
“Fighting” is a narrow class of strategies, while in “trading” I include a strictly larger class of strategies; hence the expectation that there is a strategy within “trading” at least as good as the best one within “fighting”.
But they’ll be even better off without a fight, with the staple maximizer surrendering most of its control outright, or, depending on each one’s disposition (preference) towards risk, deciding the outcome with a random number and then following in an orderly way what the random number decided.
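To make the arithmetic behind this exchange concrete, here is a minimal sketch, not from the original discussion, comparing the paperclip maximizer’s expected share of resources under fighting, a weighted lottery, and the status quo. The 90% win probability and 70% current share come from the scenario above; the fraction of resources destroyed by fighting (fight_cost) is an assumed, illustrative parameter.

```python
# Minimal sketch (illustrative numbers): the paperclip maximizer's expected
# share of total resources under three options. win_prob and current_share
# come from the scenario above; fight_cost is an assumption.

def expected_shares(win_prob=0.9, current_share=0.7, fight_cost=0.1):
    return {
        "status_quo": current_share,            # keep the current 70% without conflict
        "fight": win_prob * (1 - fight_cost),   # winner takes whatever survives the fight
        "lottery": win_prob,                    # weighted coin flip; nothing gets destroyed
    }

print(expected_shares())
# -> {'status_quo': 0.7, 'fight': 0.81..., 'lottery': 0.9}
# Whenever fighting destroys any resources, the lottery (or an equivalent
# negotiated split) gives both maximizers more in expectation than fighting.
```

On these assumed numbers, fighting does beat the status quo in expectation, which is the worry raised above; but the fight-free lottery beats fighting for both sides, which is the Pareto improvement the reply points to.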
Okay, I think I finally understand where you’re coming from. Thanks for the interesting conversation! I will spend some time digesting your remarks so as to figure out whether I agree with you and then update my top level post accordingly. You may have convinced me that the negative effects associated with sending signals into space are trivial.
I think (but am not sure) that the one remaining issue in my mind is the question of whether an AGI could somehow destroy human civilization from far away upon learning of our existence.
I think that Vladimir’s points were valid, but that they definitely shouldn’t have convinced you that the negative effects associated with sending signals into space are trivial (except in the trivial sense that no-one is likely to receive them).
Actually, your comment and Vladimir’s comment highlight a potential opportunity for me to improve my rationality.
• I’ve noticed that when I believe A and when somebody presents me with credible evidence against A, I have a tendency to alter my belief to “not A” even when the evidence against A is too small to warrant such a transition.
I think that my thought process is something like “I said that I believe A, and in response person X presented credible evidence against A which I wasn’t aware of. The fact that person X has evidence against A which I wasn’t aware of is evidence that person X is thinking more clearly about the topic than I am. The fact that person X took the time to convey evidence against A is an indication that person X does not believe A. Therefore, I should not believe A either.”
This line of thought is not totally without merit, but I take it too far.
(1) Just because somebody makes a point that didn’t occur to me doesn’t mean that they’re thinking more clearly about the topic than I am.
(2) Just because somebody makes a point that pushes against my current view doesn’t mean that the person disagrees with my current view.
On (2), if Vladimir had prefaced his remarks with the disclaimer “I still think that it’s worthwhile to think about attracting the attention of aliens as an existential risk, but here are some reasons why it might not be as worthwhile as it presently looks to you” then I would not have had such a volatile reaction to his remark—the strength of my reaction was somehow predicated on the idea that he believed that I was wrong to draw attention to “attracting the attention of aliens as an existential risk.”
If possible, I would like to overcome the issue labeled with a • above. I don’t know whether I can, but I would welcome any suggestions. Do you know of any specific Less Wrong posts that might be relevant?
Changing your mind too often is better than changing your mind too rarely, if on net you manage to be confluent: if you change your mind by mistake, you can change it back later.
(I do believe that it’s not worthwhile to worry about attracting the attention of aliens—if that isn’t clear—though it’s a priori worthwhile to think about whether it’s a risk. I’d guess Eliezer will be more conservative on such an issue and won’t rely on an apparently simple conclusion that it’s safe, declaring it dangerous until FAI makes a competent decision either way. I agree that it’s a negative-utility action though, just barely negative due to unknown unknowns.)
Just because somebody makes a point that pushes against my current view doesn’t mean that the person disagrees with my current view.

Actually, that is a good heuristic for understanding most people. Only horribly pedantic people like myself tend to volunteer evidence against their own beliefs.
Yes, I think you’re right; the people on Less Wrong are unusual. Even so, when speaking to members of the general population, one will sometimes misinterpret the things that they say as evidence of certain beliefs. (They may be offering evidence to support their beliefs, but I may misinterpret which of their beliefs they’re offering evidence in support of.)
And in any case, my point (1) above still stands.
Thanks for your remark. I agree that what I said in my last comment is too strong.
I’m not convinced that the negative effects associated with sending signals into space are trivial, but Vladimir’s remarks did meaningfully lower my level of confidence in the notion that a really powerful optimization process would go out of its way to attack Earth in response to receiving a signal from us.
That conclusion didn’t sound right to me either, but we did begin the discussion from that assertion, and there are arguments for it at the beginning of the discussion (not particularly related to where this thread went). Maybe something we cleared up helped with those arguments indirectly.
Isn’t this a Hawk-Dove situation, where pre-committing to fight even if you’ll probably lose could be in some AGIs’ interests, by deterring others from fighting them?
Threats are not made to be carried out. The possibility of actual fighting sets the rules of the game, a worst-case scenario which the actual play will improve on, to an extent that depends, for each player, on the outcome of the bargaining aspect of the game.
For a threat to be significant, it has to be believed. In the case of AGI, this probably means the AGI itself being unable to renege on the threat. If two such met, wouldn’t fighting be inevitable? If so, how do we know it wouldn’t be worthwhile for at least some AGIs to make such a threat, sometimes?
Then again, ‘Maintain control of my current level of resources’ could be a Schelling point that prevents descent into conflict.
But it’s not obvious why an AGI would choose to draw its line in the sand there, though, when ‘current resources plus epsilon% of the commons’ is available. The main use of Schelling points in human games is to create a more plausible threat, whereas an AGI could just show its source code.
An AGI won’t turn itself into a defecting rock when there is a possibility of a Pareto improvement over that.
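For readers who want the Hawk-Dove structure in this exchange spelled out, here is a minimal sketch with assumed payoffs (V and C are illustrative, not from the discussion). It shows why an irrevocable commitment to ‘hawk’ pays only if it deters, and why two such commitments produce exactly the mutually destructive outcome that a Pareto-improving bargain avoids.

```python
# Minimal Hawk-Dove sketch with assumed payoffs. V is the value of the
# contested resources, C is the cost of a fight (C > V makes fighting
# mutually destructive).

V, C = 2.0, 6.0

def payoff(me, other):
    """Row player's payoff when each side plays 'hawk' or 'dove'."""
    if me == "hawk" and other == "hawk":
        return (V - C) / 2          # fight: split the value, pay the cost
    if me == "hawk" and other == "dove":
        return V                    # the other side backs down
    if me == "dove" and other == "hawk":
        return 0.0                  # back down, concede the contested value
    return V / 2                    # both share peacefully

# A visible, irrevocable commitment to 'hawk' pays off only if it deters:
print(payoff("hawk", "dove"))   # 2.0  -- commitment worked, the other yielded
print(payoff("hawk", "hawk"))   # -2.0 -- two committed 'defecting rocks' fight
print(payoff("dove", "hawk"))   # 0.0  -- yielding beats fighting when C > V
```

With these assumed numbers, a pair of committed ‘defecting rocks’ each do worse than an agent that yields, which is the point of the reply above.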