Here’s a possibly more relevant variant of the question: we human beings don’t have access to our own source code via quining, so how are we supposed to make decisions?
My thoughts on this so far are that we need to develop a method of mapping an external description of a mathematical object to what it feels like from the inside. Then we can say that the consequences of “me choosing option A” are the logical consequences of all objects with the same subjective experiences/memories as me choosing option A.
I think the quining trick may just be a stopgap solution, and the full solution, even for AIs, will need to involve something like the above. That’s one possibility that I’m thinking about.
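For concreteness, here is a minimal sketch of that proposal, purely as my own illustration: the Agent/World classes, the matching-by-experience test, and the “default” action assigned to everyone else are stand-in assumptions, not anything from the comment above. Choosing option A is scored as if every agent whose subjective experience matches mine also outputs A.

```python
from dataclasses import dataclass
from typing import Callable, List

@dataclass
class Agent:
    experience: str  # the agent's memories/observations "from the inside"

@dataclass
class World:
    agents: List[Agent]
    outcome: Callable[[List[str]], float]  # my payoff, given every agent's action

def decide(my_experience: str, options: List[str], worlds: List[World]) -> str:
    """Evaluate option A as 'every agent subjectively identical to me outputs A'."""
    def value(option: str) -> float:
        total = 0.0
        for w in worlds:
            # Agents with exactly my experience are assumed to make the same choice;
            # everyone else is held fixed at some default policy.
            actions = [option if a.experience == my_experience else "default"
                       for a in w.agents]
            total += w.outcome(actions)
        return total
    return max(options, key=value)

# Toy twin prisoner's dilemma: both players share my experience.
pd = World(
    agents=[Agent("my memories"), Agent("my memories")],
    outcome=lambda acts: {("C", "C"): 2.0, ("C", "D"): 0.0,
                          ("D", "C"): 3.0, ("D", "D"): 1.0}[tuple(acts)],
)
print(decide("my memories", ["C", "D"], [pd]))  # -> "C"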
I guess the ability to find oneself in the environment depends on nothing strange happening in the environments you care about (the “probable” ones), so that you can simply pattern-match the things that qualify as you, starting from an extremely simple idea of what “you” is, in some sense built into our minds by evolution. But in general, if there are tons of almost-you running around, you need the exact specification to figure out which of them you actually control.
This is basically the same idea I want to use to automatically extract human preference, even though, strictly speaking, one already needs the full preference in order to recognize an instance of it.
Then we can say that the consequences of “me choosing option A” are the logical consequences of all objects with the same subjective experiences/memories as me choosing option A.
Not exactly… Bob and I may have identical experiences/memories, but I have a hidden rootkit installed that makes me defect, and Bob doesn’t.
Maybe it would make more sense to inspect the line of reasoning that leads to my defection and ask: (a) would this line of reasoning be likely to occur to Bob? (b) would he find it overwhelmingly convincing? This is kind of like quining, because the line of reasoning must refer to a copy of itself in Bob’s mind.
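Here is an equally hedged sketch of that check; everything in it (the AgentModel fields, the string-based stand-ins) is hypothetical, and the hard part, actually answering (a) and (b), is hidden inside two predicates.

```python
from dataclasses import dataclass
from typing import Callable

@dataclass
class AgentModel:
    would_generate: Callable[[str], bool]    # (a) would this reasoning occur to the agent?
    finds_convincing: Callable[[str], bool]  # (b) would the agent find it overwhelmingly convincing?

def reasoning_transfers(argument_text: str, other: AgentModel) -> bool:
    """If the same argument would both occur to the other agent and convince them,
    then my acting on it plausibly implies they act on it too; if not (say it only
    runs because of my rootkit), I can't treat their action as correlated with mine."""
    return other.would_generate(argument_text) and other.finds_convincing(argument_text)

# The quine-like part: the argument is handed its own text, so "Bob would accept
# this argument" is a claim the argument makes about itself.
ARGUMENT = "Anyone who runs this exact argument will conclude that defecting is safe."
bob = AgentModel(
    would_generate=lambda text: "this exact argument" in text,  # toy stand-in check
    finds_convincing=lambda text: False,                        # Bob isn't convinced
)
print(reasoning_transfers(ARGUMENT, bob))  # -> False: I shouldn't expect Bob to defect with me
```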
This just considers the “line of reasoning” as a whole agent (part of the original agent, but not controlled by the original agent), and again assumes perfect self-knowledge by that “line of reasoning” sub-agent (supplied by the bigger agent, perhaps, but taken on faith by the sub-agent).