I have only skimmed a few of the comments, so this might be very wrong:
As far as I can tell, if an agent assumes that its methods are consistent, then it may update on its past decisions as evidence of how to act correctly on future occasions. That line of reasoning ignores the obvious fact that computationally bounded agents are fallible, and yet, given what we know, an agent has to work under the assumption that its decision procedures are consistent. This leads to contradictions.
An agent built according to our current understanding of decision theory can't be expected to self-modify to a correct decision theory (don't ask me why humans can do this).
Anyway, my point in mentioning those problems (they were listed as problems here or elsewhere) was to show that if we were to turn ourselves into the agents we desire to build, according to our current laws of thought, then in some cases we would be worse off than we are now, and in other cases, like Pascal's mugging, we couldn't be sure that we were making correct decisions (we are not sure whether an agent with an unbounded but finite utility function over outcomes is consistent).
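To make the Pascal's mugging worry concrete, here is a minimal sketch of a naive expected-utility comparison with an unbounded utility function. The numbers and the complexity-based probability penalty are my own illustrative assumptions, not anything claimed above; the point is only that a mugger can always name a payoff that outgrows whatever improbability we assign to the claim, so the naive calculation says to pay every time.

```python
# Illustrative sketch only. Assumptions (mine, for illustration): the
# probability we assign to the mugger's claim shrinks exponentially with the
# length of the claim in bits, while the claimed payoff can grow much faster,
# because the utility function is unbounded.

def eu_pay(claim_bits: int, cost: float = 5.0) -> float:
    """Naive expected utility of paying: a tiny probability times a huge
    claimed payoff, minus the certain cost handed over."""
    prob = 2.0 ** -claim_bits      # improbability grows only exponentially in claim length
    payoff = 10.0 ** claim_bits    # but a short claim can name a far larger payoff
    return prob * payoff - cost

def eu_refuse() -> float:
    """Expected utility of refusing: keep the cost, forgo the claimed payoff."""
    return 0.0

if __name__ == "__main__":
    for bits in (10, 50, 100):
        # True for every claim size: paying always "wins" on naive expected utility,
        # which is exactly the intuitively wrong result the mugging exploits.
        print(bits, eu_pay(bits) > eu_refuse())
```

Whether a consistent agent with an unbounded utility function can avoid this conclusion is the open question; the sketch only shows why the naive calculation gives no reassurance.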
The model uncertainty involved is still profound, and we shouldn't factor out human intuition at this point. Also, perceived absurdity does constitute evidence.