What is the Procrastination Paradox? I read the recent “Vingean Reflection” paper and other materials I found, but still don’t get it.
I don’t understand the details, so this is all just guessing… but it seems to me it’s something about induction and infinity: if you are not careful enough, you can use induction to “prove” something that is not true. Something like: “There is a natural number greater than 1. If there is a natural number greater than N, there is also a natural number greater than N+1. Therefore, there is a natural number greater than any natural number.” But more complicated.
The procrastination paradox seems to have the form: “To believe that I will clean my room some day, it is not necessary to clean it today. I just have to know that starting tomorrow, I will clean it in a finite time.” That seems reasonable, but then tomorrow you will use the very same reasoning to postpone the cleaning yet another day. Et cetera, forever.
In the context of self-modifying agents, imagine that you are trying to build a room-cleaning AI. Is it okay to accept as a “provably room-cleaning AI” one that does not clean the room today, but tomorrow self-modifies into a provably room-cleaning AI? If you say yes, you may build a machine that will never actually clean the room. But it is difficult to explain why “no” is the correct answer, because each individual step seems completely harmless.
tl;dr: Don’t use infinite induction to prove that something will happen in a “finite but unspecified time”, because the limit of “finite but unspecified time” could easily be “never”.
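A toy sketch of the room-cleaning chain described above may make the failure easier to see. Everything in it is an assumption made for illustration: the functions `act` and `first_cleaning_day`, the `horizon` parameter, and the "defer" branch, which merely stands in for "I have proved that tomorrow's agent is a provably room-cleaning agent". The `deadline` variant is an added contrast case, not something taken from the paper.

```python
from typing import Optional


def act(day: int, deadline: Optional[int]) -> str:
    """Decide today's action for the agent that exists on `day`."""
    # Hypothetical contrast case: a concrete deadline forces action.
    if deadline is not None and day >= deadline:
        return "clean"
    # Stand-in for: "tomorrow's agent is provably room-cleaning,
    # so the room will be cleaned in finite time; no need to act today."
    return "defer"


def first_cleaning_day(horizon: int, deadline: Optional[int] = None) -> Optional[int]:
    """Return the first day on which the room is cleaned, or None if never within `horizon`."""
    for day in range(horizon):
        if act(day, deadline) == "clean":
            return day
        # Otherwise the agent "self-modifies" into tomorrow's agent,
        # which runs exactly the same reasoning.
    return None


print(first_cleaning_day(horizon=10_000))              # None
print(first_cleaning_day(horizon=10_000, deadline=3))  # 3
```

Without a deadline, every single day's decision looks locally justified, yet no day in any horizon is the cleaning day. With a specified bound, the same chain of agents does clean the room.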
Thank you. That matches up with what I was thinking; it’s good to get a confirmation. At first glance, it looks like a discount factor would settle the agent’s problem, but that’s only if we’re working with probabilistic beliefs and expected value rather than deterministic proofs.
Could you help me level up my understanding?
It looks like the discussion of the Procrastination Paradox in the Vingean Reflection article depends on a reflectivity property in the agent. Does that somehow bypass the Löbstacle? Or if not, how is it related to Löb’s theorem?
Is there something more to the Procrastination Paradox than just “I can prove that I’ll do it tomorrow, so I won’t do it today?” By itself, that doesn’t look like an earth-shaking result.
“it looks like a discount factor would settle the agent’s problem”
Maybe not, if you believe that tomorrow you can self-modify into an agent that can clean the room better than you. (Better enough to offset the discount factor.)
I do not understand Löb’s theorem, so I cannot help you here. I agree that my explanation doesn’t seem very impressive, but I cannot tell if that is because the original article is unimpressive, or because I am only able to understand the unimpressive aspects of it. :(
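The discount-factor exchange above can be made concrete with a toy calculation. This is only a sketch under assumed numbers: `gamma` is a per-day discount factor, `growth` is a hypothetical factor by which tomorrow's self-modified agent cleans better than today's, and the value of cleaning on a given day is modelled as `(gamma * growth) ** day`. None of these names or the model itself come from the paper, and it only applies to the expected-value setting, not to deterministic proofs.

```python
def best_day_to_clean(gamma: float, growth: float, horizon: int) -> int:
    """Return the day in [0, horizon) with the highest discounted value of cleaning."""
    # Assumed toy model: each day of delay multiplies the payoff by `growth`
    # (better successor) and by `gamma` (discounting).
    values = [(gamma * growth) ** day for day in range(horizon)]
    return max(range(horizon), key=values.__getitem__)


# No self-improvement: discounting makes "today" the best day.
print(best_day_to_clean(gamma=0.9, growth=1.0, horizon=50))   # 0

# Improvement outpaces the discount (0.9 * 1.2 > 1): the best day is always the last one.
print(best_day_to_clean(gamma=0.9, growth=1.2, horizon=50))   # 49
print(best_day_to_clean(gamma=0.9, growth=1.2, horizon=500))  # 499
```

In this model the discount factor only settles things when `gamma * growth < 1`. Once the expected improvement is "better enough to offset the discount factor", the preferred cleaning day recedes as far as the horizon allows, which is the procrastination pattern again.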