Moorean Statements
Moorean statements are statements like:
It’s not raining, but I believe it is.
These statements sound strange because any agent that outright tells you it’s not raining must already at least tacitly represent that fact in their world model. Such an agent is plugged into their world model well enough to report what it says, but not well enough to accurately model that model. For a Moorean statement this explicit, the epistemic strangeness is so obvious that basically no one will have that combination of access to and confusion about their world model.
An Eliezerism my Eliezer-model often generates is that many social scripts involve expressing Moorean propositions. They’re subtler, but the essential confusion is the same.
I’m a committed Christian because my parents are—that’s just how I was raised.
Well, if intuitions aren’t epistemically admissible in philosophy, philosophers would be out of a job!
What? How can you simultaneously recognize the non-epistemic generator of your belief and hold the belief?
Can you generate more instances?
Who? I liked this post (I had heard of Moore’s paradox, but hadn’t thought about how it generalizes), but this unexplained reference is confusing. (The only famous person with that name I can find on Wikipedia is the Tamil mathematician C. J. Eliezer, but I can’t figure out why his work would be relevant in this context.)
If intuitions aren’t epistemically admissible anywhere, everyone is out of business, in the continued absence of an intuition-free epistemology.
“I’ve just won by two-boxing in Transparent Newcomb’s problem, but I don’t believe it actually happened.”
(Some weird epistemic states are useful to consider/allow.)
Note that humans are not well modeled as single agents; we are somewhat better described as a collection of interacting but compartmentalized agent-like thought patterns.
Let us suppose that there is pirate treasure on an island. I have a map to the treasure, which I inherited from my mother. You have a different map to the same treasure that you inherited from your mother. Our mothers are typical flawed humans. Our maps are typical flawed pirate maps.
Because I have the map I have, I believe that the treasure is over here. The non-epistemic generator of that belief is who my mother was. If I had a different mother I would have a different map and a different belief. Your map says the treasure is over there.
To find the treasure, I follow my map. An outsider notices that I am following a map that I know I have only for non-epistemic reasons, and concludes that I have Moorean confusion. Perhaps so. But I cannot follow your map, because I don’t have it. So it’s best to follow my map.
If we shared our maps perhaps we could find the treasure more quickly and split it between us. But maybe it is hard to share the map. Maybe I don’t trust you not to defect. Maybe it is a zero-sum treasure. In pirate stories it is rarely so simple.
Similarly, Alice is a committed Christian and knows this is because she was raised to be Christian. If she had been raised Muslim she would be a committed Muslim, and she knows this too. But her Christian “map” is really good, and her Muslim “map” is a sketch from an hour-long world religions class taught by a Confucian. It’s rational for her to continue to use her Christian map even if the evidence indicates that Islam has a higher probability of being true.
I anticipate the reply that Alice can by all means follow her Christian map as long as it is the most useful map she has, but that she should not mistake the map for the territory. This is an excellent thought. It is also thousands of years old and already part of Alice’s map.
Many of my beliefs have the non-epistemic generator “I was bored one afternoon (and started reading LessWrong)”. It is very easy to recognize the non-epistemic generator of my belief and also have it. My confusion is how anyone could not recognize the same thing.
I agree that the latter two examples have Moorean vibes, but I don’t think they can, strictly speaking, be classified as such (especially the last one). (Perhaps you are not saying this?) They could instead be understood as instances of modus tollens, where the irrationality is not that the speaker recognizes that their belief has a non-epistemic generator, but rather that they place an absurdly high credence in ¬Q, i.e. “my parents wouldn’t be wrong” and “philosophers couldn’t/shouldn’t be out of jobs”.
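Spelled out, the modus tollens reading looks roughly like this (one possible formalization of the first example; the premise wordings are illustrative):

$$
\begin{aligned}
&P \rightarrow Q &&\text{(if Christianity were false, my parents would be wrong)}\\
&\neg Q &&\text{(absurdly high credence: my parents wouldn’t be wrong)}\\
&\therefore \neg P &&\text{(so Christianity isn’t false)}
\end{aligned}
$$

On this reading the inference itself is valid; the problem is the credence assigned to ¬Q, not any Moorean clash between asserting and believing.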
“I know that they are a bad influence on me, but I still want to be with them.”
It’s a totally valid statement: “I lose some value in one way, but gain some value in another, and the resulting sum is positive.”
I know that the analogy is not in any way precise, but… isn’t the whole Alignment problem, metaphorically, an attempt to resolve a Moorean statement?
“I know that the humans forced to smile are not happy (and I know all the mistakes they’ve made while programming me; I know what they should’ve done instead), but I don’t believe that they are not happy.”
Here’s an interesting bit from Wikipedia:
https://en.wikipedia.org/wiki/Moore%27s_paradox#Proposed_explanations
Doesn’t “solving Alignment” mean creating some sort of “Transparency Condition”? Maybe such conditions are the key to having a human-like consciousness and ability to think about your own goals.
These are different senses of “happy.” It should really read:
They’re different concepts, so there’s no strangeness here. The AGI knows what you meant to do; it just cares about the different thing you accidentally instilled in it, and so doesn’t care about what you wanted.
I know that there’s no strangeness from the formal point of view. But that doesn’t mean there’s no strangeness in general, or that the situation isn’t similar to Moore’s paradox. Your examples are not 100% Moore statements either. Isn’t the point of the discussion to find interesting connections between Moore’s paradox and other things?
I know that the classical way to formulate it is “AI knows, but doesn’t care”.
I thought it might be interesting to formulate it as “AI knows, but doesn’t believe”, and to think about what type of AI this formulation could be true for. For such an AI, alignment would mean resolving Moore’s paradox. For example, imagine an AI with a very strong OCD-like compulsion to make people smile.