I suspect much of the problem is that humans aren’t very good at consistency or calculation. Scope insensitivity (and other errors) causes us to accept steps that lead to incorrect results once aggregated. If you can actually define your units and measurements, I strongly expect that the sum of the steps will equal the conclusion, and you will be able to identify the steps which are unacceptable (or accept the conclusion).
I’d advise against the motivated reasoning of “if you don’t like the conclusion, you have to find a step to reject” in favor of “I notice I’m confused that I have different evaluations of the steps and the aggregate, so I’ve probably miscalculated.”
And if this is the case (the mismatch is caused by compounded rounding errors rather than a fundamental disconnect), then it seems unlikely to be a useful solution to AI problems unless we fix the problems in our calculation, rather than just reusing a method we’ve proven doesn’t work.
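To make the “compounded rounding errors” point concrete, here’s a minimal sketch with entirely made-up numbers: each step carries a small true cost, but a scope-insensitive evaluator rounds anything below its perception threshold down to zero, so every step looks fine while the aggregate plainly isn’t.

```python
# Hypothetical numbers: each step has a small true disutility, but a
# scope-insensitive evaluator rounds anything below a threshold to "negligible".
true_step_costs = [0.4] * 10           # invented per-step costs
PERCEPTION_THRESHOLD = 0.5             # invented rounding threshold

judged_step_costs = [c if c >= PERCEPTION_THRESHOLD else 0.0
                     for c in true_step_costs]

print(sum(judged_step_costs))   # 0.0 -> every step individually looks acceptable
print(sum(true_step_costs))     # 4.0 -> the aggregate clearly is not
```

Once the units are written down, the disagreement between “the steps” and “the conclusion” is located in the per-step rounding, not in some deep clash of principles.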
“I notice I’m confused that I have different evaluations of the steps and the aggregate, so I’ve probably miscalculated.”
But then you have to choose either to correct the steps to get in line with the aggregate (integral reasoning), or the aggregate to get in line with the steps (differential reasoning).
Inconsistency shows that there is at least one error; it does not imply (in fact, it gives some evidence against) that either calculation is correct. You can’t choose which one to adjust to fit the other; you have to correct all errors. Remember, consistency isn’t a goal in itself, it’s just a bit of evidence for correctness.
For the specific case in point, the error is likely around not being numerical in the individual steps—how much better is the universe with one additional low-but-positive life added? How much (if any) is the universe improved by a specific redistribution of life-quality? Without that, you can’t know if any of the steps are valid and you can’t know if the conclusion is valid.
These are values we’re talking about—the proof is a proof of inconsistency between two value sets, and you have to choose which parts of your values to give up, and how. Your choice of how to be numerical in each step determines which values you’re keeping.
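As a sketch of what “being numerical in each step” could look like (all figures invented, with total and average utility used only as two example aggregation rules, not as the right ones):

```python
# Invented worlds: each is a list of per-person welfare levels.
A      = [10.0] * 5              # a few people with very good lives
A_plus = A + [0.5] * 20          # add low-but-positive lives
B      = [2.5] * 25              # redistribute into equal, modest lives

def total(world):                # one candidate aggregation rule
    return sum(world)

def average(world):              # a different rule, encoding different values
    return sum(world) / len(world)

for name, world in [("A", A), ("A+", A_plus), ("B", B)]:
    print(name, total(world), average(world))
# total():   A=50.0, A+=60.0, B=62.5 -> every step counts as an improvement
# average(): A=10.0, A+=2.4,  B=2.5  -> the very first step is already a loss
```

Whichever rule (and whichever numbers) you commit to is exactly the choice of which values you’re keeping, and once it is committed to, the step evaluations and the aggregate can no longer silently disagree.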
I think we agree on the basics—the specificity of calculation allows you to identify exactly what you’re considering, and find out what the mismatch is (missing a step, making an incorrect step, and/or mis-stating the summation). This is true for values as well as factual beliefs.
It is only after this that you understand your proposed values well enough to know whether they are different value-sets, or just a calculation mistake in one or both. Once you know that, then you can decide which, if either, apply to you.
I guess you should also separately decide whether it’s good and important for you to think of yourself as a unitary individual vs. a series of semi-connected experiences. Do you (singular you) want to have a single consistent set of values, or are all the future you-components content to behave somewhat randomly over time and context? This is mostly assumed in this kind of discussion, but it’s probably worth stating if you’re questioning what (if anything) you learn from an inconsistency.
Hey, a thought occurred. I was random-browsing The Intuitions Behind Utilitarianism and saw the following:

“You can say anything, but Graham’s number is very large; if the disutility of an air molecule slamming into your eye were 1 over Graham’s number, enough air pressure to kill you would have negligible disutility.”
It occurs to me that this sounds a lot like the problem with the linear scaling used by “utilitarianism”: “paradox” of the heap or not, things can have very different effects when they come in large numbers or very small numbers. You really should not have a utility function that rates the “disutility of an air molecule slamming into your eye” and then scales up linearly with the number of molecules, precisely because one molecule has no measurable effect on you, while an immense number (e.g. a tornado) can and will kill you.
When you assume linear scaling of utility as an axiom (that “utilons” are an in-model real scalar), you are actually throwing out the causal interactions involving the chosen variable (e.g. the real-world object embodying a “utilon”) that scale in nonlinear ways. The axiom is actually telling you to ignore part of the way reality works just to get a simpler “normative” model.
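Here’s a toy contrast between the two shapes, with every constant invented for illustration (the lethal molecule count and the curve are stand-ins, not physics):

```python
import math

PER_MOLECULE_DISUTILITY = 1e-60   # stand-in for "1 over a very large number"
LETHAL_COUNT = 1e25               # invented order of magnitude for a lethal blast

def linear_disutility(n):
    # The "utilons are a real scalar that adds up" axiom.
    return PER_MOLECULE_DISUTILITY * n

def threshold_disutility(n):
    # A logistic curve on a log scale: negligible far below the lethal
    # scale, close to 1 ("you die") at or above it.
    x = 3.0 * (math.log10(n + 1) - math.log10(LETHAL_COUNT))
    return 1.0 / (1.0 + math.exp(-x))

for n in [1, 1e12, 1e25, 1e28]:
    print(f"{n:.0e}  linear={linear_disutility(n):.3g}  "
          f"threshold={threshold_disutility(n):.3g}")
# linear:    even 1e28 molecules score ~1e-32, i.e. "negligible", as in the quote
# threshold: the disutility jumps where the causal effect (death) actually lives
```

The linear version reproduces the quoted “negligible disutility” conclusion precisely because it has thrown away the threshold where the nonlinear physical interaction happens.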
So in a typical, intuitive case, we assume that “maximizing happiness” means some actually-existing agent actually experiences the additional happiness. But when you instead have a Utilitarian AI that adds happiness by adding not-quite-clinically-depressed people, the map of “utility maximizing” as “making the individual experiences more enjoyable” has ceased to match the territory of “increase the number of individuals until the carrying capacity of the environment is reached”. A nonlinear scaling effect happened—you created so many people that they can’t be individually very happy—but the “normativity” of the linear-utilons axiom told your agent to ignore it.
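A toy version of that mismatch, assuming (purely for illustration) that per-person happiness falls off linearly as population approaches some carrying capacity:

```python
CARRYING_CAPACITY = 1000  # invented

def happiness_per_person(population):
    # Crowding effect: the fuller the world, the less happy each person is.
    return max(0.0, 1.0 - population / CARRYING_CAPACITY)

def total_utility(population):
    # The linear-utilons summation the axiom prescribes.
    return population * happiness_per_person(population)

best = max(range(1, CARRYING_CAPACITY + 1), key=total_utility)
print(best, happiness_per_person(best))   # 500 0.5
# The sum-maximizer trades per-person happiness for headcount whenever the
# arithmetic favors it; with other invented crowding curves the optimum can
# land much closer to "barely worth living".
```

The crowding function here is exactly the part of reality the linear-utilons axiom tells the agent to ignore.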
I think a strong criterion for a True Ethical System should be precisely that it doesn’t “force” you to ignore the causal joints of reality.