It is definitely a problem with infinite buck-passing. It is probably possible to prove optimality if we have a continuous utility function (e.g. we’re using discounting). I think we might actually want a continuous utility function, but maybe not. Is there any time t such that you would consider a wonderful human civilization existing for t steps and then dying almost as good as it existing indefinitely?
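To make the discounting question concrete, here is a minimal sketch. It assumes geometric discounting with a hypothetical discount factor `gamma` and one unit of utility per step of civilization; neither assumption comes from the discussion itself. Under those assumptions, existing for t steps captures a fraction 1 - gamma^t of the value of existing forever, so for any target fraction there is some finite t that reaches it.

```python
def captured_fraction(t, gamma):
    """Fraction of the infinite-horizon discounted value obtained in t steps.

    Assumes 1 unit of utility per step and geometric discounting, so the
    t-step value is (1 - gamma**t) / (1 - gamma) and the infinite-horizon
    value is 1 / (1 - gamma). Their ratio is 1 - gamma**t.
    """
    return 1 - gamma ** t


def horizon_for_fraction(gamma, fraction):
    """Smallest t whose t-step value is at least `fraction` of the infinite one.

    Requires 0 < gamma < 1 and 0 <= fraction < 1, otherwise the loop
    would not terminate.
    """
    if not (0 < gamma < 1 and 0 <= fraction < 1):
        raise ValueError("need 0 < gamma < 1 and 0 <= fraction < 1")
    t = 0
    while captured_fraction(t, gamma) < fraction:
        t += 1
    return t


if __name__ == "__main__":
    # Illustrative values only: how long until 99% of the value is captured?
    print(horizon_for_fraction(0.99, 0.99))
```

With an undiscounted (non-continuous) utility function there is no such t, which is one way to read the question above.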
The way I would express the procrastination paradox is something like:
There’s the tiling agents problem: we want AIs to construct successors that they trust to make correct decisions.
It would be desirable to have a system where an infinite sequence of AIs each trust the next one. If it worked, this would solve the tiling agents problem.
But if we have something like this, then it will be unsound: because every AI defers to its successor, it will prove that the button will eventually get pressed, even though it never actually gets pressed.
We can construct agents that do press the button, but they lack the kind of successor trust that is desirable in some ways. Because of how they handle recursion, Paul’s logic and reflective oracles are both candidates for solving the tiling agents problem; however, they both fail the procrastination paradox (when it’s set up this way).
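The behavioral pattern can be sketched with a toy simulation. This is only an illustration of buck-passing, not a model of provability logic, Paul’s logic, or reflective oracles; the names `fully_trusting`, `bounded_trust`, and `budget` are hypothetical, and a finite run can only suggest, not establish, that the button is never pressed.

```python
def run(policy, max_steps):
    """Return the first step at which the button is pressed, else None.

    `policy(step)` is True if the agent acting at `step` presses the button
    and False if it defers to its successor.
    """
    for step in range(max_steps):
        if policy(step):
            return step
    return None


def fully_trusting(step):
    # Every agent reasons: "my successor is trustworthy and will make sure
    # the button gets pressed, so I can defer." All agents reason alike,
    # so no one ever presses.
    return False


def bounded_trust(budget):
    """Agents that defer only while a finite trust budget remains.

    The agent at step `budget` has no one left to defer to and presses.
    This gives up unbounded trust in successors.
    """
    def policy(step):
        return step >= budget
    return policy


if __name__ == "__main__":
    print(run(fully_trusting, 1000))    # None: never pressed within the run
    print(run(bounded_trust(3), 1000))  # 3: pressed once the budget runs out
```

The second policy presses the button, but only because its trust in successors shrinks at each step, which is the trade-off described above.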
Cool, thanks; sounds like I have about the same picture. One ingredient I was missing, which your answer and another look at the papers supplied, was the distinction between consistency and soundness (on the natural numbers), which is not a distinction I think about often.
In case it’s useful, I’ll note that the procrastination paradox is hard for me to take seriously on an intuitive level, because some part of me thinks that requiring correct answers in infinite decision problems is unreasonable; so many reasoning systems fail on these problems, and infinite situations seem so unlikely, that it is hard for me to get worked up about them. This isn’t so much a comment on how important the problem actually is as on how much argumentation may be required to convince people like me that it’s actually worth working on.