I think there are two things going on here:
more importantly, your utility probably doesn’t scale linearly with DALYs, if for no other reason than that you don’t care very much about things that happen at very low probabilities
less importantly, abstract arguments are much less likely to be correct than they seem at face value. Likelihood of correctness decreases exponentially in both argument length and amount of abstraction, and it is hard for us to appreciate that intuitively.
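The second point can be made concrete with a toy model (the step-reliability numbers below are purely hypothetical): if an argument is a chain of n inferential steps, each independently correct with probability q, then the whole argument is correct with probability q^n, which decays exponentially in argument length.

```python
# Toy model: a conjunctive argument with n steps, each correct
# with independent probability q, is correct with probability q**n.
def chain_correctness(q: float, n: int) -> float:
    return q ** n

# Even quite reliable individual steps compound badly:
print(chain_correctness(0.9, 1))   # 0.9
print(chain_correctness(0.9, 10))  # ~0.35
print(chain_correctness(0.9, 30))  # ~0.04
```

This is why a long abstract argument can feel compelling step by step while still being unlikely to be correct as a whole.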
Thanks.
My life satisfaction certainly does not scale linearly with DALYs (e.g. averting the destruction of 1000 DALYs does not make me ten times as happy as averting the destruction of 100 DALYs), but it does seem to be very much influenced by whether I have a sense that I’m “doing the right thing” (whatever that means).
But maybe you mean utility in some other sense than life satisfaction.
If I had the choice of pushing one of 10 buttons, each of which had a different distribution of probabilities attached to magnitudes of impact, I think I would push the one that maximizes aggregate utility, regardless of how small the probabilities were. Would this run against my values? Maybe; I’m not sure.
I agree; I’ve been trying to formulate this intuition in quasi-rigorous terms and have not yet succeeded in doing so.
Well, I am talking about the utility defined in the VNM utility theorem, which I assumed was what the term was generally taken to mean on LW, but perhaps I am mistaken. If you mean something else by utility, then I’m unsure why you would “push the aggregate utility maximizing one”; as a hard and fast rule that choice seems a bit arbitrary to me (except for VNM utility, since VNM utility is by definition the thing whose expected value you maximize).
Would you care to share your intuitions as to why you would push the utility-maximizing button, and what you mean by utility in this case? (A partial definition or example is fine if you don’t have a precise definition.)
Does that apply to AI going FOOM?
To me the claim that human-level AI → superhuman AI in at most a matter of years seems quite likely. It might not happen, but I think the arguments for FOOMing are pretty straightforward, even if not airtight. The specific timeline depends on where we are on the Moore’s-law curve (so if I thought that AI were a large source of existential risk, I would be trying to develop AGI as quickly as possible, so that the first AGI ran slowly enough to stop if something bad happened; i.e. waiting longer → faster computers → FOOM happens on a shorter timescale).
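The hardware-overhang point in the parenthetical can be sketched as a toy model (every constant here is invented for illustration): suppose hardware speed doubles every 18 months and takeoff time is inversely proportional to the speed of the hardware the first AGI runs on.

```python
# Toy model of the hardware-overhang argument: waiting w years before
# the first AGI means it runs on 2**(w / doubling_time)-times faster
# hardware, compressing a takeoff that would have taken `base_years`.
def takeoff_years(base_years: float, years_waited: float,
                  doubling_time: float = 1.5) -> float:
    speedup = 2 ** (years_waited / doubling_time)
    return base_years / speedup

print(takeoff_years(10, 0))   # 10.0 — build AGI now, slow takeoff
print(takeoff_years(10, 15))  # ~0.01 — wait 15 years, takeoff in days
```

Under these (hypothetical) assumptions, waiting 15 years turns a decade-long takeoff into one too fast to react to, which is the shape of the argument for earlier development being safer.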
The argument I am far more skeptical of concerns the likelihood of a UFAI appearing without any warning. While I place some non-negligible probability on UFAI occurring, right now we know so little about AI that it is hard to judge whether an AI would actually be in significant danger of being unfriendly. By the time we are in any position to build an AGI, it should be much more obvious whether that is a problem or not.
Might you clarify your question?
Depending on what you meant this might not be relevant, but: many arguments about AGI and FOOM are antipredictions. “Argument length” as jsteinhardt used it assumes that the argument is a conjunctive one; if an argument is disjunctive, then its length implies an increased likelihood of correctness. Eliezer’s “Hard Takeoff” article on OB was pretty long, but the words were used to make an antiprediction.
It is not clear to me that there are well-defined boundaries between what you call a conjunctive and a disjunctive argument. I am also not sure how two opposing predictions are not both antipredictions.
I see that some predictions are more disjunctive than others, i.e. only some of their premises need to be true. But most of the time this seems to be a result of vagueness. Being strongly disjunctive doesn’t necessarily speak in favor of a prediction: if you pinned it down, it would turn out to be conjunctive, requiring all its details to be true.
All predictions are conjunctive:
If you predict that Mary is going to buy one of a thousand products in the supermarket 1) if she is hungry, 2) if she is thirsty, or 3) if she needs a new coffee machine, then you are seemingly making a disjunctive prediction. But someone else might be less vague and make a conjunctive antiprediction: Mary is not going to buy one of a thousand products in the supermarket unless 1) she has money, 2) she has some need, and 3) the supermarket is open. Sure, if the latter prediction had been made first, then the former would become the antiprediction, which happens to be disjunctive. But being disjunctive does not speak in favor of a prediction in and of itself.
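The Mary example can be made numeric (all probabilities below are invented for illustration): the “disjunctive” prediction that she buys something is just the complement of a conjunction of conditions, so its probability is pinned down by theirs.

```python
# Hypothetical probabilities for the conjunctive conditions required
# for Mary to buy anything at all: she has money, she has some need,
# and the supermarket is open.
p_money, p_need, p_open = 0.95, 0.99, 0.9

# Conjunctive prediction: all conditions must hold simultaneously.
p_buys_possible = p_money * p_need * p_open   # ~0.846

# The "disjunctive" antiprediction is simply its negation.
p_antiprediction = 1 - p_buys_possible        # ~0.154

print(round(p_buys_possible, 3), round(p_antiprediction, 3))
```

The two predictions are complements, which is the sense in which calling one of them “disjunctive” carries no evidential weight by itself.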
All predictions are antipredictions:
Now you might argue that the first prediction could not be an antiprediction, as it does predict something to happen. But opposing predictions always predict the negation of each other: if you predict that Mary is going shopping, then you predict that she is not not going shopping.
I’d reverse the importance of those two considerations. Even though my utility doesn’t scale linearly with DALYs, I wish it did.
Why do you wish it did?
My actual utility, I think, does scale linearly with DALYs, but my hedons don’t. I’d like my hedons to match my utilons so that I can maximize both at the same time (by definition I prefer to maximize utilons if I have to pick, but this requires willpower).
Er, I understand that utility != pleasure, but again, why does your utility scale linearly with DALYs? The sentiments you’ve expressed so far imply that your (ideal) utility function should not favor your own DALYs over someone else’s, but I don’t see why that implies that utility scales linearly with DALYs overall.
If I think all DALYs are equally valuable, I should value twice as many twice as much. That’s why I’d prefer it to be linear.
If by value you mean “place utility on”, then that doesn’t follow. As I said, utility has to do (among many other things) with risk aversion. You could be willing to pay twice as many dollars for twice as many DALYs and yet not place twice as much utility on twice as many DALYs. Assuming that 1 DALY = 1 utilon, the utility of x DALYs is by definition 1/p, where p is the probability at which you would pay exactly 1 DALY to get x DALYs with probability p.
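The elicitation just described can be written out directly (the indifference probability below is made up): if 1 DALY is pegged at 1 utilon and you are indifferent between paying 1 DALY for sure and receiving x DALYs with probability p, then p · u(x) = 1, so u(x) = 1/p.

```python
# VNM-style elicitation: if you would pay exactly 1 DALY (1 utilon)
# for a p chance of x DALYs, indifference means p * u(x) = 1,
# so the elicited utility of x DALYs is 1/p.
def elicited_utility(indifference_p: float) -> float:
    return 1 / indifference_p

# Hypothetical answer: indifferent at p = 0.02 for x = 100 DALYs,
# so u(100 DALYs) = 50 utilons — sublinear in DALYs.
print(elicited_utility(0.02))  # 50.0
```

Note that nothing forces the elicited u(x) to equal x; any p > 1/x at indifference reveals a utility function that grows more slowly than linearly.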
Again, having all DALYs be equally valuable doesn’t mean that your utility function scales linearly with DALYs: you could have a utility function that is, say, sqrt(# DALYs), and it would still value all DALYs equally. (Though also see Will_Newsome’s comments elsewhere about why talking about things in terms of utility is probably not the best idea anyway.)
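A sketch of the sqrt example (the gamble’s stakes are arbitrary): with u = sqrt(# DALYs), every DALY counts the same as an outcome, yet the concave utility function is risk-averse over DALYs.

```python
import math

def u(dalys: float) -> float:
    # Concave utility: treats all DALYs symmetrically as outcomes,
    # but does not scale linearly in their number.
    return math.sqrt(dalys)

sure_thing = u(100)                 # 10.0
gamble = 0.5 * u(0) + 0.5 * u(200)  # ~7.07

# Expected DALYs are equal (100 in both cases),
# but expected utility is not:
print(sure_thing > gamble)  # True — risk-averse over DALYs
```

Both options have an expected value of 100 DALYs, so a linear utility function would be indifferent; the sqrt function prefers the sure thing while still not favoring any particular DALY over another.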
If by utility you meant something other than VNM utility, then I apologize for the confusion (although as I pointed out elsewhere, I would then take objection to claims that you should maximize its expected value).
I’m afraid my past few comments have been confused. I don’t know as much about my utility function as I wish I did. I think I am allowed to assign positive utility to a change in my utility function, and if so then I want my utility function to be linear in DALYs. It probably is not so already.
I think we may be talking past each other (or else I’m confused). My question for you is whether you would (or wish you would) sacrifice 1 DALY in order to have a 1 in 10^50 chance of creating 1+10^50 DALYs. And if so, then why?
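The arithmetic in the question can be checked exactly with rational numbers: under a utility function literally linear in DALYs, the gamble’s expected value exceeds the sure cost, but only by 10^-50 DALYs.

```python
from fractions import Fraction

p = Fraction(1, 10**50)   # probability of the payoff
payoff = 1 + 10**50       # DALYs created if the gamble pays off
cost = 1                  # DALYs sacrificed for sure

expected_gain = p * payoff   # = 1 + 10**-50, exactly
print(expected_gain > cost)  # True — by exactly 10**-50 DALYs
```

So a strictly linear-in-DALYs utility maximizer takes the gamble, which is precisely why the question probes whether one’s utility really is linear at such extreme probabilities.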
(If my questions are becoming tedious then feel free to ignore them.)
I don’t trust questions involving numbers that large and/or probabilities that small, but I think so, yes.
Probably good not to trust such numbers =). But can you share any reasoning or intuition for why the answer is yes?