The main relevant aspect of the real problem not captured by quining is the continuity of self-replacement with environmental action.
Right, so we want an AI design that does the right thing “naturally”, without a special case for “modify self”. It seems to me that UDT may already be such a design, if we had a solution to logical uncertainty (what I called “math intuition module”) to plug into it. When I wrote about logical uncertainty being important for AGI, I had in mind the problem of not being able to prove P!=NP and hence not being able to prove whether EU(search for polynomial solution to 3-SAT) < EU(not search for polynomial solution to 3-SAT). But if we had an AI that can handle that kind of decision problems, why wouldn’t it also be able to design and build another AI that uses a proof system that it can’t prove to be sound? Is there a reason to think that “AI reflection” isn’t just a special case of “logical uncertainty”?
(Of course UDT would probably build successors/copies of itself that also use “math intuition modules” rather than “proof systems” but we can imagine forcing a UDT-AI to choose between a selection of other AIs that are based on different proof systems, in which it would choose based on an analysis of the trade-off between power vs risk of inconsistency/unsoundness, like how a human would make the choice.)
I can’t visualize how saying that a result is “uncertain” could make the Lobian issue go away—do you have a concrete visualization for this, and if so, can you sketch it even if very briefly?
Possibly relevant: No Turing machine can assign finite probability to the halting sequence.
(Regardless of whether UDT1.1 is a correct quining solution to AI Reflection, we ultimately still need something that is not so “brute force”, so let’s continue this thread.)
I can’t visualize how saying that a result is “uncertain” could make the Lobian issue go away—do you have a concrete visualization for this, and if so, can you sketch it even if very briefly?
I guess I’m saying that there’s not necessarily a Lob’s theorem or equivalent for a system that (does something like) assigns probabilities to mathematical statements instead of proving/disproving them from a fixed set of axioms. Do you think there is such an equivalent to Lob’s theorem, and if so what might it say?
Possibly relevant: No Turing machine can assign finite probability to the halting sequence.
I’m not seeing why this is relevant. Can you explain?
Do you think there is such an equivalent to Lob’s theorem, and if so what might it say?
In the Q&A to the open problems talk, he said that if logical uncertainty is the solution to the problem, then he and Marcello didn’t figure out how, and they did look in that specific place.
I think that it would be useful to try to make some progress about how to reason probabilistically about logical statements before trying to use it as the solution to other problems, because my picture of how it could work is so unclear that I feel unable to draw any conclusions about what it can or cannot do with any kind of certainty. For example, I do not know a condition on a probability distribution over mathematical statements that would prevent the spurious proof problem—when I read about playing chicken with the universe, it sounded very familiar, but I can’t now see a way to apply it in the case of a probability distribution. (Then again, if we do expect to find a solution to the Löbian problems in that area, perhaps we should be working on logical uncertainty now, not on Löbian problems...)
(Then again, if we do expect to find a solution to the Löbian problems in that area, perhaps we should be working on logical uncertainty now, not on Löbian problems...)
If it weren’t for concerns about AI risk, I would be advocating working on logical uncertainty instead of Löbian problems, because it seems to me that an AI using “math intuition” may not face a Löbian problem to begin with, or if it does, the problem and/or solution may be different enough from that of an AI using a proof system that any work we do now will not be of much help.
(From the perspective of reducing AI risk, maybe it’s good to focus people’s attention on Löbian problems, unless it leads them to start thinking that they ought to work on logical uncertainty due to reasoning like the above...)
Hm; are you saying you think FAI can probably be implemented without solving the logical uncertainty problem? My current visualization is that both the logical uncertainty problem and the safe rewrite problem will need to be solved—among others—and the reason I’ve been thinking about the rewrite problem is that using proof-based techniques, I at least had a grasp of how to think about it. (And my intuition has so far been that logical uncertainty will probably have diagonalization problems in a different guise when we actually have a concrete enough proposal to test this, so it seemed useful to think about the rewrite problem in the better-understood context, when trying to find in-principle solutions to the problems.)
Hm; are you saying you think FAI can probably be implemented without solving the logical uncertainty problem?
No, I was saying the opposite, and also hinting that working on and publishing results about logical uncertainty may be bad for AI risk because it helps AGI, not just FAI (whereas the AI reflection problem seems to be more specific to FAI). There’s also a discussion about this issue in the decision theory mailing list archives.
He’s talking about a probability distribution over the set of all bitstrings. You can’t assign p=0.5 (or any other constant real number) to each of them: that doesn’t sum to 1.0.
Right, so we want an AI design that does the right thing “naturally”, without a special case for “modify self”. It seems to me that UDT may already be such a design, if we had a solution to logical uncertainty (what I called “math intuition module”) to plug into it. When I wrote about logical uncertainty being important for AGI, I had in mind the problem of not being able to prove P!=NP and hence not being able to prove whether EU(search for polynomial solution to 3-SAT) < EU(not search for polynomial solution to 3-SAT). But if we had an AI that can handle that kind of decision problems, why wouldn’t it also be able to design and build another AI that uses a proof system that it can’t prove to be sound? Is there a reason to think that “AI reflection” isn’t just a special case of “logical uncertainty”?
(Of course UDT would probably build successors/copies of itself that also use “math intuition modules” rather than “proof systems” but we can imagine forcing a UDT-AI to choose between a selection of other AIs that are based on different proof systems, in which it would choose based on an analysis of the trade-off between power vs risk of inconsistency/unsoundness, like how a human would make the choice.)
I can’t visualize how saying that a result is “uncertain” could make the Lobian issue go away—do you have a concrete visualization for this, and if so, can you sketch it even if very briefly?
Possibly relevant: No Turing machine can assign finite probability to the halting sequence.
(Regardless of whether UDT1.1 is a correct quining solution to AI Reflection, we ultimately still need something that is not so “brute force”, so let’s continue this thread.)
I guess I’m saying that there’s not necessarily a Lob’s theorem or equivalent for a system that (does something like) assigns probabilities to mathematical statements instead of proving/disproving them from a fixed set of axioms. Do you think there is such an equivalent to Lob’s theorem, and if so what might it say?
I’m not seeing why this is relevant. Can you explain?
In the Q&A to the open problems talk, he said that if logical uncertainty is the solution to the problem, then he and Marcello didn’t figure out how, and they did look in that specific place.
I think that it would be useful to try to make some progress about how to reason probabilistically about logical statements before trying to use it as the solution to other problems, because my picture of how it could work is so unclear that I feel unable to draw any conclusions about what it can or cannot do with any kind of certainty. For example, I do not know a condition on a probability distribution over mathematical statements that would prevent the spurious proof problem—when I read about playing chicken with the universe, it sounded very familiar, but I can’t now see a way to apply it in the case of a probability distribution. (Then again, if we do expect to find a solution to the Löbian problems in that area, perhaps we should be working on logical uncertainty now, not on Löbian problems...)
If it weren’t for concerns about AI risk, I would be advocating working on logical uncertainty instead of Löbian problems, because it seems to me that an AI using “math intuition” may not face a Löbian problem to begin with, or if it does, the problem and/or solution may be different enough from that of an AI using a proof system that any work we do now will not be of much help.
(From the perspective of reducing AI risk, maybe it’s good to focus people’s attention on Löbian problems, unless it leads them to start thinking that they ought to work on logical uncertainty due to reasoning like the above...)
Hm; are you saying you think FAI can probably be implemented without solving the logical uncertainty problem? My current visualization is that both the logical uncertainty problem and the safe rewrite problem will need to be solved—among others—and the reason I’ve been thinking about the rewrite problem is that using proof-based techniques, I at least had a grasp of how to think about it. (And my intuition has so far been that logical uncertainty will probably have diagonalization problems in a different guise when we actually have a concrete enough proposal to test this, so it seemed useful to think about the rewrite problem in the better-understood context, when trying to find in-principle solutions to the problems.)
No, I was saying the opposite, and also hinting that working on and publishing results about logical uncertainty may be bad for AI risk because it helps AGI, not just FAI (whereas the AI reflection problem seems to be more specific to FAI). There’s also a discussion about this issue in the decision theory mailing list archives.
Can someone explain how an “always return 0.5” bot is not a counterexample?
He’s talking about a probability distribution over the set of all bitstrings. You can’t assign p=0.5 (or any other constant real number) to each of them: that doesn’t sum to 1.0.