My understanding is that for the most part SI prefers not to publish the results of their AI research, for reasons akin to those discussed here. However, they have published on decision theory, presumably because it seems safer than publishing on other stuff and they’re interested in attracting people with technical chops to work on FAI:
http://singinst.org/blog/2010/11/12/timeless-decision-theory-paper-released/
I would guess EY sees himself as more of a researcher than a forecaster, so you shouldn’t be surprised if he doesn’t make as many predictions as Paul Krugman.
Also, here’s a quote from his paper on cognitive biases affecting judgment of global risks:

Once upon a time I made up overly detailed scenarios, without realizing that every additional detail was an extra burden. Once upon a time I really did think that I could say there was a ninety percent chance of Artificial Intelligence being developed between 2005 and 2025, with the peak in 2018. This statement now seems to me like complete gibberish. Why did I ever think I could generate a tight probability distribution over a problem like that? Where did I even get those numbers in the first place?
So he wasn’t born a rationalist. (I’ve been critical of him in the past, but I give him a lot of credit for realizing the importance of cognitive biases for what he was doing and popularizing them for such a wide audience.) My understanding was that one of the primary purposes of the sequences was to get people to realize the importance of cognitive biases at a younger age than he did.
Obviously I don’t speak for SI or Eliezer, so take this with a grain of salt.
OK. If he sees himself primarily as a researcher rather than a forecaster, then I think a fair question to ask is: what have his major research achievements been?
But secondly, a lot of the discussion on LW and most of EY’s research presuppose certain things happening in the future. If AI is actually impossible, then trying to design a friendly AI is a waste of time (or, alternatively, if AI won’t be developed for 10,000 years, then developing a friendly AI is not an urgent matter). What evidence can EY offer that he’s not wasting his time, to put it bluntly?
No, if our current evidence suggests that AI is impossible, and does so strongly enough to outweigh the large downside of a negative singularity, then trying to design a friendly AI is a waste of time.
Even if it turns out that your house doesn’t burn down, buying insurance wasn’t necessarily a bad idea. What is important is how likely it looked beforehand, and the relative costs of the outcomes.
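To put made-up numbers on that: if there is a 1-in-200 chance per year of losing a $300,000 house, the expected loss is $1,500 a year, so a $1,000 premium is worth paying in expectation even though in any given year the house almost certainly does not burn down.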
Claiming that AI is impossible in a world that runs on physics is equivalent to claiming that intelligence is impossible in a world that runs on physics, which would require human minds to run on something non-physical, i.e. dualism.
Of course, this is entirely separate from feasibility.
I would think that anyone claiming that AI is impossible bears a heavy burden of proof. If one were instead claiming that a fast take-off is impossible or extremely unlikely, there would be a more substantive issue to discuss.
If his decision theory had a solid theoretical background, but turned out to be terrible when actually implemented, how would we know? Has there been any empirical testing of his theory?
What does a decision theory that “has a solid theoretical background but turns out to be terrible” look like when implemented?
Take a game that a player could either win or lose. One person follows, so far as he or she is able, the tenets of timeless decision theory. Another person makes each decision by flipping a coin. The coin-flipper outperforms the TDTer.
I’m pretty confident that if we played an iterated prisoner’s dilemma (IPD), with you flipping a coin each round and me running TDT, I would win. This is, however, quite a low bar.
I took it as a statement of what would prove the theory false, rather than a statement of something believed likely.
I think that would only be true if there were more than 2 players. Won’t a random coin-flip and TDT be tied in the two-player case?
No. TDT, once it figures out it’s facing a coin flipper, defects 100% of the time and runs away with it.
Nope, TDT will defect every time.
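To see that concretely, here is a minimal simulation sketch (my own illustration, not anything from the thread; the T=5, R=3, P=1, S=0 payoffs are the usual textbook values and an assumption on my part, as are the function names) of a long match between an always-defecting player, which is what TDT plays once it has concluded that the opponent’s moves don’t depend on its own, and a coin-flipper:

```python
import random

# Payoffs as (my points, their points); C = cooperate, D = defect.
# Standard textbook values T=5, R=3, P=1, S=0 -- an assumption, not from the thread.
PAYOFFS = {
    ("C", "C"): (3, 3),
    ("C", "D"): (0, 5),
    ("D", "C"): (5, 0),
    ("D", "D"): (1, 1),
}

def defector(rng):
    """What TDT plays against an opponent whose moves it can neither influence nor predict."""
    return "D"

def coin_flipper(rng):
    """Cooperate or defect with equal probability, ignoring all history."""
    return "C" if rng.random() < 0.5 else "D"

def match(rounds=10_000, seed=0):
    """Run the iterated game and return (defector's score, coin-flipper's score)."""
    rng = random.Random(seed)
    score_tdt = score_coin = 0
    for _ in range(rounds):
        a, b = defector(rng), coin_flipper(rng)
        pay_a, pay_b = PAYOFFS[(a, b)]
        score_tdt += pay_a
        score_coin += pay_b
    return score_tdt, score_coin

if __name__ == "__main__":
    # With these payoffs the defector averages about 3 points per round,
    # while the coin-flipper averages about 0.5.
    print(match())
```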
I’m not sure such a poor theory would survive having a solid theoretical background.
Actually, I think the proper case is: “Two people each play a one-player game that they can either win or lose. One follows, so far as able, the tenets of TDT. The other decides by flipping a coin. The coin-flipper outperforms the TDTer.”
I mention this because lots of decision theories struggle against a coin-flipping opponent: tit-for-tat is a strong IPD strategy, yet it does poorly against a coin-flipper.
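A small self-contained sketch of that comparison, under the same assumed T=5, R=3, P=1, S=0 payoffs (again my own illustration, with made-up function names): against a coin-flipper, tit-for-tat ends up averaging noticeably less per round than always-defect does.

```python
import random

# Standard textbook payoffs (an assumption): (my points, their points).
PAYOFFS = {("C", "C"): (3, 3), ("C", "D"): (0, 5),
           ("D", "C"): (5, 0), ("D", "D"): (1, 1)}

def average_score_vs_coin_flipper(strategy, rounds=10_000, seed=1):
    """Average points per round that `strategy` earns against a 50/50 coin-flipper."""
    rng = random.Random(seed)
    their_last = None
    total = 0
    for _ in range(rounds):
        my_move = strategy(their_last)
        their_move = "C" if rng.random() < 0.5 else "D"
        total += PAYOFFS[(my_move, their_move)][0]
        their_last = their_move
    return total / rounds

def tit_for_tat(their_last):
    # Cooperate on the first round, then copy the opponent's previous move.
    return their_last or "C"

def always_defect(their_last):
    return "D"

if __name__ == "__main__":
    # Roughly 2.25 per round for tit-for-tat vs. roughly 3.0 for always-defect.
    print("tit-for-tat:  ", average_score_vs_coin_flipper(tit_for_tat))
    print("always-defect:", average_score_vs_coin_flipper(always_defect))
```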
Is there any decision strategy that can do well (let’s define “well” as “better than always-defect”) against a coin-flipper in IPD? Any decision strategy more complex than always-defect requires the assumption that your opponent’s decisions can be at least predicted, if not influenced.
No, of course not. Against any opponent whose output has nothing to do with your previous plays (or expected plays, if they get a peek at your logic), one should clearly always defect.
Not if their probability of cooperation is so high that the expected value of cooperation remains higher than that of defecting. Or if their plays can be predicted, which satisfies your criterion (nothing to do with my previous plays) but not mine.
If someone defects every third time with no deviation, then I should defect whenever they defect. If they defect randomly one time in sixteen, I should always cooperate. (Of course, always-cooperate is not more complex than always-defect.)
...I swear, this made sense when I did the numbers earlier today.
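For reference, with the standard T=5, R=3, P=1, S=0 payoffs (my assumption) the numbers cannot work out that way against an opponent who cooperates with some fixed probability p independent of your play:

E[defect] = 5p + 1·(1 − p) = 1 + 4p
E[cooperate] = 3p + 0·(1 − p) = 3p

Defecting comes out ahead by 1 + p for every p, so only predictability or influence (like the strict every-third-round pattern) changes the answer.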
Permit me to substitute your question: TDT seems pretty neat philosophically, but can it actually be made to work as computer code?
Answer: Yes. (Sorry for the self-promotion, but I’m proud of myself for writing this up.) The only limiting factor right now is that nobody can program an efficient theorem-prover (or other equivalently powerful general reasoner), but that’s not an issue with decision theory per se. (In other words, if we could implement Causal Decision Theory well, then we could implement Timeless Decision Theory well.) But in any case, we can prove theorems about how TDT would do if equipped with a good theorem-prover.
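For readers who want a feel for the shape of such an implementation, here is a toy sketch of my own, not the algorithm from that write-up: the agent enumerates its possible outputs, asks a reasoner what utility would follow from each, and returns the best one. In the real proposal the reasoner is a theorem-prover working over a formal description of the world that includes the agent’s own source code; here it is stubbed out as an ordinary Python function, and `decide` and `derive_utility` are hypothetical names of mine.

```python
def decide(actions, derive_utility):
    """Toy 'decision theory + general reasoner' loop (a sketch, not SI's algorithm).

    `derive_utility(action)` stands in for proving a theorem of the form
    "if my output is `action`, the resulting utility is u". The substance of the
    decision theory lives in how that reasoning handles the agent's own place in
    the world; the loop around it is trivial.
    """
    best_action, best_utility = None, float("-inf")
    for action in actions:
        utility = derive_utility(action)  # the hard part: needs a powerful reasoner
        if utility > best_utility:
            best_action, best_utility = action, utility
    return best_action


if __name__ == "__main__":
    # Hypothetical usage: a one-shot dilemma against an opponent known to mirror
    # this agent's choice, so cooperating is derived to yield the higher payoff.
    derived = {"cooperate": 3, "defect": 1}
    print(decide(["cooperate", "defect"], derived.__getitem__))
```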