If you try to add to that category people who know that, but think that they are smart enough, then it gets tricky. How do I know whether I actually am smart enough, or whether I just think I’m smart enough?
Hm, not sure. Obviously on the object level you can just prove what the UDT agent will do. But not being able to do that is presumably why you’re uncertain in the first place.
Still, I think people should usually just trust themselves. “I don’t think I’m a rock, and a rock doesn’t think it’s a rock, but that doesn’t mean I might be a rock.”
I tried to solve it on my own, but haven’t been able to so far. I haven’t been able to figure out what sort of function someone who knows that I’m using UDT will use to predict my actions, and how my own decisions affect that. If someone knows that I’m using UDT, and I think that they think that I will cooperate with anyone who knows I’m using UDT, then I should break my word. But if they know that...
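The regress here ("if they think I'll cooperate, I should break my word; but if they know that...") can be made concrete with a toy model. This is my own illustrative sketch, not anything from the discussion: the payoffs and the base-case prediction are hypothetical assumptions, and the point is just that best-responding to a predictor who models you one level deeper never settles at any finite depth.

```python
# Toy model of the prediction regress (hypothetical payoff structure).
# At depth k, the predictor models the agent's reasoning at depth k-1,
# and the agent best-responds to that prediction.

def agent_action(depth):
    """Return 'cooperate' or 'defect' given how deeply the predictor models us."""
    if depth == 0:
        # Base case (assumption): a naive predictor expects a UDT user
        # to keep their word.
        predicted = "cooperate"
    else:
        # Deeper predictor: simulate the agent's reasoning one level down.
        predicted = agent_action(depth - 1)
    # If we're predicted to cooperate, breaking our word pays more,
    # and vice versa -- so the best response flips at every level.
    return "defect" if predicted == "cooperate" else "cooperate"

# The best response oscillates with modeling depth; no finite level settles it:
print([agent_action(k) for k in range(6)])
# -> ['defect', 'cooperate', 'defect', 'cooperate', 'defect', 'cooperate']
```

The oscillation is the "But if they know that..." step repeating forever, which is why a fixed-point (or proof-based) treatment is needed rather than any finite tower of "I think that they think".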
In general, I’m rather suspicious of the “trust yourself” argument. The Lake Wobegon effect would seem to indicate that humans don’t do it well.
If you’re so smart, why ain’t you a rock? :P
And yeah, at some level you have to be checking for whether or not you are proving what the UDT agent will do—if you prove it you’re safe, and if you don’t you’re not. The trouble is that checking for the proof can contain all the steps of the proof, in which case you might get things wrong because your search wasn’t checking itself! So one way is to check for the proof in a way that doesn’t correlate with the specific proof. “Did I check any proofs? No? Better not trust anyone.”
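The "check for the proof in a way that doesn’t correlate with the specific proof" idea can be sketched as a decision rule that only looks at the *outcome* of a bounded proof search, never at its internal steps. This is my own framing with hypothetical names; the "proofs" are stand-in certificate strings validated by a separate checker, not real formal proofs.

```python
# Sketch (hypothetical): trust only if an explicit certificate was found
# and accepted by an independent checker. The decision rule inspects just
# the search *result*, so it can't inherit mistakes from the search steps.

def find_certificate(candidates, checker):
    """Bounded search: return the first candidate the checker accepts, else None."""
    for c in candidates:
        if checker(c):
            return c
    return None

def decide(candidates, checker):
    # "Did I check any proofs? No? Better not trust anyone."
    cert = find_certificate(candidates, checker)
    return "cooperate" if cert is not None else "distrust"

# Hypothetical checker: accepts only certificates ending in "QED".
checker = lambda s: s.endswith("QED")
print(decide(["lemma...", "theorem... QED"], checker))  # -> cooperate
print(decide(["hunch", "vibes"], checker))              # -> distrust
```

The design choice mirrors the quote: the default is distrust, and cooperation requires a concrete found-and-checked proof object, so a failed or skipped search can never be mistaken for a successful one.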