It’s reasonable to consider two agents playing against each other, and “playing against your copy” is a reasonable problem. ($9 rocks get $0 in this problem; LDT agents probably get $5.)
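To make those payoff claims concrete, here's a minimal sketch, assuming the game in question is a $10 demand game where each side names a demand and both get paid iff the demands are compatible; that reading of the setup, and the function names, are my assumptions, not something fixed by the discussion:

```python
# Hedged sketch: an assumed $10 demand game. Each side names a demand;
# both are paid iff the demands sum to at most the pie.
def payoffs(demand_a, demand_b, pie=10):
    """Both players get their demand iff the demands are compatible."""
    if demand_a + demand_b <= pie:
        return demand_a, demand_b
    return 0, 0

# Two $9 rocks against each other: 9 + 9 > 10, so both get $0.
print(payoffs(9, 9))   # (0, 0)

# Two LDT agents recognize the symmetry and coordinate on the fair
# split, so each gets $5.
print(payoffs(5, 5))   # (5, 5)
```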
Newcomb’s problem, Parfit’s hitchhiker, the smoking lesion, etc. are all very reasonable problems whose outcomes essentially depend on the buttons you press when you play the game. It is important to get these problems right.
But playing against LDT is not necessarily in the “fair problem class”, because the game might behave differently depending on your algorithm (on how you arrive at your actions) and not just on the actions themselves.
Your version of it (playing against an LDT agent) is indeed different from a game that checks whether we’re an alphabetizing agent, i.e., one that picks X over Y because X < Y alphabetically and not because we looked at the expected utility: we would want LDT to perform optimally in your game. But the reason an LDT-created rock loses to a natural rock here isn’t fundamentally different from the reason LDT loses to the alphabetizing agent in that other game, and it’s known that you can construct games like that in which LDT loses to something else. You can make the game description sound more natural, but I think there’s a sharp divide between “fair problem class” problems and the others.
(I also think that in real life, where this game might actually play out, there isn’t really a choice available to us of making our AI a $9 rock instead of an LDT agent: if we pick the rock because of its better performance in this game, the LDT opponent responds to the process that picked it, and our rock ends up with slightly less than $5 in EV instead of $9. So LDT doesn’t perform worse than any other agent we could’ve chosen in this game.)
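A hedged sketch of that last EV claim: the policy below, accepting a greedy demand x with probability just under 5/x so the demander’s expected payoff lands just below the fair $5, is one construction that comes up in LDT ultimatum-game discussions; the exact policy shape and the epsilon are illustrative assumptions on my part, not anything the argument above pins down.

```python
# Assumed LDT response policy: treat the choice of agent as part of
# the policy being responded to, and accept a greedy demand x with
# probability just under fair/x, so the greedy side's EV is just
# under the fair share. EPSILON is an arbitrary illustrative margin.
EPSILON = 1e-6

def ldt_accept_prob(greedy_demand, fair=5):
    """Probability of accepting a demand above the fair share."""
    if greedy_demand <= fair:
        return 1.0
    return fair / greedy_demand - EPSILON

p = ldt_accept_prob(9)
print(p * 9)  # expected payoff of the $9 rock: ~4.999991, just under $5
```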