Tuning one’s preference function is a constrained optimization problem. What I want is a preference function simple enough for my very finite brain to be able to compute it in real time, and that does a good job (whatever exactly that means) on-some-kind-of-average over some plausible probability distribution of scenarios it’s actually going to have to deal with.
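To make that framing concrete, here is a minimal sketch (the scenario distribution, loss function, and candidate functions are all invented for illustration, not anything from the original argument): among preference functions cheap enough to evaluate in real time, pick the one with the lowest average loss over scenarios sampled from the distribution you actually expect to face.

```python
import random

# Hypothetical illustration of "tuning a preference function" as constrained
# optimization: among candidate functions that are all cheap enough to compute
# in real time, pick the one with the lowest average loss over the scenarios
# we plausibly expect to encounter.

def sample_everyday_scenario():
    # Stand-in for the distribution of decisions that actually come up.
    return {"stakes": random.uniform(0, 10), "people": random.randint(1, 100)}

def true_loss(choice, scenario):
    # Stand-in for "how badly did acting on this valuation turn out".
    return abs(choice - scenario["stakes"] * scenario["people"])

# Candidate preference functions, all simple enough to run in real time.
candidates = {
    "ignore_scale": lambda s: s["stakes"],
    "linear":       lambda s: s["stakes"] * s["people"],
    "capped":       lambda s: min(s["stakes"] * s["people"], 500),
}

def expected_loss(pref, n=10_000):
    scenarios = [sample_everyday_scenario() for _ in range(n)]
    return sum(true_loss(pref(s), s) for s in scenarios) / n

best = min(candidates, key=lambda name: expected_loss(candidates[name]))
print("best on the everyday distribution:", best)
```

Note that nothing in this selection procedure rewards good behaviour on scenarios the distribution essentially never produces.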
Choosing between torturing one person for 50 years and giving 3^^^3 people minimally-disturbing dust specks is a long, long way outside the range of scenarios that have non-negligible probability of actually coming up. It’s a long, long way outside the range of scenarios that my decision-theoretic intuition has been tuned on by a few million years of evolution and a few decades of experience.
My preference function returns values with (something a bit like) error bars on them. In this case, the error bars are much larger than the values: there’s much more noise than signal. That’s a defect, no doubt about it: a perfect preference function would never do that. A perfect preference function is probably also unattainable, given the limitations of my brain.
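As a hedged illustration of what "error bars much larger than the values" does to a comparison (the numbers are invented): if my estimate of the relative value of two options is +0.3 on a scale where the uncertainty is ±50, the sign of that estimate is barely better than a coin flip.

```python
from statistics import NormalDist

# Invented numbers: an estimated preference of +0.3 "utils" with an
# uncertainty (error bar) of 50. How much does the sign actually tell us,
# modelling the error as roughly normal?
estimate, error_bar = 0.3, 50.0

p_sign_correct = 1 - NormalDist(mu=estimate, sigma=error_bar).cdf(0)
print(f"P(sign of estimate is right) ≈ {p_sign_correct:.3f}")  # ≈ 0.502
```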
What possible reason is there for supposing that my preference function would be improved, for the actual problems it actually gets used for, by nailing down its behaviour far outside the useful range?
If there were good reason to think that decision theory is like (a Platonist’s view of) logic, with a Right Answer to every question and no limits to its validity, then there would be reason to expect that nailing down my preference function’s values out in la-la-land would be useful. But is there? Not that I know of. Decision theory is an abstraction of actual human preferences. Applying it to problems like Eliezer’s might be like extrapolating quantum mechanics down to a scale of 10^-(10^100) m.