I’m nervous about reposting stuff from the workshop list as top-level posts on LW. I’m a pretty minor figure there and it might be seen as grabbing credit for a communal achievement. Yeah, this specific formalization is my idea, which builds on Nesov’s idea (ambient control), which builds on Wei Dai’s idea (UDT), which builds on Eliezer’s idea (TDT). If the others aren’t reposting for whatever reason, I don’t want to go against the implied norm.
(The recent post about Löbian cooperation wasn’t intended for the workshop, but for some reason the discussion there was way more intelligent than here on LW. So I kinda moved there with my math exercises.)
If the others aren’t reposting for whatever reason, I don’t want to go against the implied norm.
It is much more likely that people aren’t posting because they haven’t thought of it or can’t be bothered. I too would like to see top-level posts on this topic. And I wouldn’t worry about grabbing credit; as long as you put attributions or links in the expected places, you’re fine.
For a bit of background regarding priority, from my point of view: the whole idea of ADT was “controlling the logical consequences by deciding which premise to make true”, which I later saw to have also been the idea behind UDT (perhaps only implicitly; Wei never commented on that). Later in the summer I shifted towards thinking about general logical theories, instead of specifically the equivalence of programs, as in UDT.
However, as of July, there were two outstanding problems. First, it was unclear what kinds of things it is possible to prove from the premise that the agent does X, and so how feasible brute-force theories of consequences were as a model of this sort of decision algorithm. Your post showed that in a certain situation it is indeed possible to prove enough to make decisions using only this “let’s try to prove what follows” principle.
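To make the “let’s try to prove what follows” principle concrete, here is a minimal sketch in Python. It is only illustrative: the proof search over a formal theory is stubbed out with a hand-coded table of implications for a toy Newcomb-like world, and the names (provable(), the utility values) are my own placeholders, not anything from the workshop posts. The only point is the shape of the rule: enumerate provable statements “A() = a → U() = u” and take the action with the highest provable utility.

```python
# Toy sketch of the "brute force theory of consequences" decision rule:
# enumerate statements of the form "A() = a  ->  U() = u", keep the ones the
# proof search (stubbed out here) certifies, and return the action whose
# provable utility is highest.

ACTIONS = ["one_box", "two_box"]
UTILITIES = [0, 1000, 1000000, 1001000]

# Stand-in for a bounded proof search over a fixed formal theory (e.g. PA).
# A real implementation would enumerate proofs up to a length bound; here we
# hand-code the implications that would be provable in a toy Newcomb-like
# world where the predictor simulates the agent exactly.
PROVABLE_IMPLICATIONS = {
    ("one_box", 1000000): True,
    ("two_box", 1000): True,
}

def provable(action, utility):
    """Does the theory prove 'A() = action  ->  U() = utility'?  (stub)"""
    return PROVABLE_IMPLICATIONS.get((action, utility), False)

def decide():
    best_action, best_utility = None, None
    for action in ACTIONS:
        for utility in UTILITIES:
            if provable(action, utility):
                if best_utility is None or utility > best_utility:
                    best_action, best_utility = action, utility
    return best_action

if __name__ == "__main__":
    print(decide())  # -> "one_box" in this toy world
```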
Second, and maybe more importantly, it was very much unclear how one should state (the axioms of) a possible decision. There were three candidates to my mind: (1) try to state a possible decision in a weaker way, so that the possible decisions that aren’t actual don’t produce inconsistent theories; (2) try to ground the concept (theory) of a possible decision in the concept of reality, where the agent was built in the first place, which would serve as a specific guideline for fulfilling (1); and (3) try to live with inconsistency. The last option seemed less and less doable, the first depended on rather arbitrary choices, and the second was frustratingly hairy.
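To spell out the difficulty these three options are responding to, here is a rough worked sketch; the symbols A(), a_0, a_1 and the base theory T are just illustrative notation of mine.

```latex
% The agent A is a concrete program, so the base theory T (say PA) proves its
% actual output a_0 simply by simulating the computation:
T \vdash A() = a_0 .
% Naively adding the axiom for any other "possible" decision a_1 \neq a_0
% then gives an inconsistent theory:
T + \{\, A() = a_1 \,\} \;\vdash\; A() = a_0 \wedge A() = a_1 \;\vdash\; \bot .
% Option (1) weakens how the possible decision is stated to avoid this,
% option (2) grounds it in the reality the agent was built in, and option (3)
% accepts the inconsistency and asks what can still be done with it.
```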
However, in a thread on decision-theory-workshop, your comments prompted me to observe that the consequences always appear consistent, in that one can’t prove absurdity from any possible action, even though the consequences of non-actual actions really are inconsistent (this is the observation you’ve reposted in the comment above). This raises the chances for option (3), living with inconsistency, although it’s still unclear what’s going on.
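For readers without access to the workshop thread, here is a rough sketch of why the proof search can’t turn up such an absurdity, under illustrative assumptions of mine (the agent acts on the highest provable utility, ties break toward the action in question, and the base theory T is sound); it is a paraphrase, not the exact argument from that thread.

```latex
% Suppose the agent's bounded search found a proof of absurdity from action a:
T \vdash A() = a \rightarrow \bot .
% By ex falso it then (with only a few extra proof steps) also has
T \vdash A() = a \rightarrow U() = u_{\max} ,
% so, breaking ties toward a, the agent outputs a, making A() = a true.
% Soundness of T then makes A() = a \rightarrow \bot true as well, i.e. \bot:
% contradiction.  So no such proof fits inside the agent's search, even though
% for non-actual actions a proof of \neg(A() = a) does exist further out.
```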
Thus, your input substantially helped with both problems. I’m not overly enthused with the results only because they are still very much incomplete.
Sorry for deleting my comment. I still have some unarticulated doubts, will think more.