Another plausible normalization (that seems more likely to yield sane behavior in practice) is to set the value of $1 to be the same for every theory. This has its own problems, but it seems much better than min-max to me, since it avoids having our behavior be determined by extremely counterfactual counterfactuals. What do you think is the strongest argument for min-max over constant values for $1?
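For concreteness, here is one way to write down the two normalizations being compared; the notation is mine, not from the thread. Each theory $i$ has utility function $u_i$, $\pi$ ranges over the policies available to us, and "status quo" means our current resources. Min-max rescales each theory so that its expected utility runs from 0 under the worst available policy to 1 under the best:

$$\tilde u_i \;=\; \frac{u_i - \min_\pi \mathbb{E}[u_i \mid \pi]}{\max_\pi \mathbb{E}[u_i \mid \pi] \,-\, \min_\pi \mathbb{E}[u_i \mid \pi]},$$

while constant-value-of-$1 rescales each theory so that the expected gain from one marginal dollar, evaluated at the status quo, is the same for everyone:

$$\tilde u_i \;=\; \frac{u_i}{\mathbb{E}[u_i \mid \text{status quo} + \$1] \,-\, \mathbb{E}[u_i \mid \text{status quo}]}.$$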
I think the best argument against constant-value-of-$1 is that it has its own risk of giving pathological answers for theories that really don’t care about dollars. You’d want to ameliorate that by using a very broad basket of resources, e.g. a small sliver of everything you own. Giving high weight to a theory which has “nothing to gain” doesn’t seem as scary though, since hopefully it’s not going to ask for any of your resources. (Unlike min-max’s “foible” of giving overwhelming weight to theories that happen to not care about anything bad happening...)
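In the notation above, the failure mode is that for a theory which assigns essentially no value to dollars, the denominator $\mathbb{E}[u_i \mid \text{status quo} + \$1] - \mathbb{E}[u_i \mid \text{status quo}]$ is nearly zero, so its rescaled utility blows up. The broad-basket fix, as I read it, replaces the marginal dollar with an $\varepsilon$ fraction of everything you own, call it $\varepsilon R$:

$$\tilde u_i \;=\; \frac{u_i}{\mathbb{E}[u_i \mid \text{status quo} + \varepsilon R] \,-\, \mathbb{E}[u_i \mid \text{status quo}]},$$

which keeps the denominator away from zero for any theory that cares about anything you could do with your resources.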
It’s easier for me to see how we could argue that (max-actual) is better than “constant $1.”
(ETA: these two proposals both make the same sign error; I was acting as if making a theory better off reduces its weight, but obviously it increases its weight.)
Another option is to punt very explicitly on the aggregation: allow partial negotiation today (say, giving each theory 50% of the resources), but delegate the bulk of the negotiation between value systems to the future. Basically doing the same thing we’d do if we actually had an organization consisting of 5 people with different values.
In general, it seems like we should try to preserve the option value of doing aggregation in the future using a better understanding of how to aggregate. So we should be evaluating our theories by how well they work in the interim rather than by, e.g., aesthetic considerations.
What do you think is the strongest argument for min-max over constant values for $1?
The constant-$1 normalization uses the current marginal utility of the function, which reflects only its local properties (utility functions that are very close can get very different weightings), while min-max looks at its global properties.
The min-max is over expected utility given a policy, not over the maximal utility that could happen, so it's a bit less stupid than it would be in the latter case.
Well:
1. In general there are diminishing returns to dollars, so global properties constrain local properties. (This is especially true if you can gamble; see the sketch after this list.)
2. Your actual decisions mostly concern local changes, so it seems like a not-crazy thing to base your policy on.
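One way to make point 1 precise (my own formalization, not something from the thread): if $u_i$ is concave in dollars, then for current wealth $w$ and any $N \ge 1$,

$$u_i(w+1) - u_i(w) \;\ge\; \frac{1}{N}\big(u_i(w+N) - u_i(w)\big),$$

so a theory that stands to gain a lot from large amounts of resources must also place nontrivial value on a marginal dollar. If you can gamble at roughly fair odds you get a similar bound even without assuming concavity, since a marginal dollar can be turned into a $1/N$ chance of $\$N$.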
That said, this proposal suffers from the same sign error I made with the (max-actual) proposal. Consider a theory with log utility in the number of dollars spent on it: as you spend less on it, its marginal utility per dollar goes up and its weight goes down, so you decrease its dollars further; in the limit it gets $0 and has infinite marginal utility per dollar.
(It still seems like a sane approach for value learning, but not for moral uncertainty.)
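To make the feedback loop in the log-utility example concrete, here is a toy numerical sketch. The setup is my own assumption rather than anything pinned down in the thread: two theories split a $1 budget, one with log utility in its dollars and one linear in its dollars, and at each step the budget is re-split in proportion to each theory's weight under the constant-$1 normalization, $w_i = 1/u_i'(x_i)$. The log theory's share collapses toward zero while its marginal utility per dollar blows up.

```python
# Toy model (my assumptions): theory A has u_A(x) = log(x), theory B has u_B(x) = x.
# Under the constant-$1 normalization, a theory's weight is 1 / u'(current dollars):
#   w_A = x_A (since u_A'(x) = 1/x), and w_B = 1 (since u_B'(x) = 1).
# Assume the aggregator re-splits a fixed $1 budget in proportion to these weights.

budget = 1.0
x_a = 0.5  # dollars currently spent on the log-utility theory A

for step in range(20):
    w_a = x_a                          # weight of A: 1 / u_A'(x_a) = x_a
    w_b = 1.0                          # weight of B: 1 / u_B'(x_b) = 1
    x_a = budget * w_a / (w_a + w_b)   # A's share of the proportional re-split
    print(f"step {step:2d}: A gets ${x_a:.6f}, marginal utility {1 / x_a:.1f} per $")
```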