All these mistakes look more like failures of rationality to me, because smart people make them too.
It took Eliezer to realize that this assumption is not always valid
At least some of the credit goes to Hofstadter. In any case, I think people listened to Eliezer more because he said things like “I have worked out a mathematical analysis of these confusing problems”, not just “my intuition says the basic assumptions of game theory don’t sound right”. If you do explore the implications of your alternative set of assumptions and they turn out to be interesting, you’re exempt from mistake #1.
You observe a correlation between less arguing over assumptions and more interesting discussions/results, but isn’t it possible that both are caused by higher intelligence?
Intelligence is certainly a common factor, but I also observe that correlation by looking at myself at different times. If I get more interesting results when I view problems on their own terms, that strategy might work for other people too.
The difference between Hofstadter and Eliezer is that Hofstadter couldn’t make a convincing enough case for his assumptions, because he was talking about humans instead of AIs, and it’s just not clear that human decision procedures are similar enough to each other for his assumptions to hold. Eliezer also thought his ideas applied to humans, but he had a backup argument to the effect of “even if you don’t think this applies to humans, at least it applies to AIs who know each other’s source code, so it’s still important to work on”, and that’s what convinced me.
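For concreteness, here is a minimal toy sketch of the “AIs who know each other’s source code” setting; the agent names and setup are purely illustrative, not anything from the original exchange. An agent that cooperates exactly when its opponent’s source matches its own cooperates with a copy of itself while still defecting against an unconditional defector:

```python
# Toy sketch of source-code-conditional cooperation (illustrative only).
import inspect

def clique_bot(opponent_source: str, own_source: str) -> str:
    """Cooperate iff the opponent is running the same program."""
    return "C" if opponent_source == own_source else "D"

def defect_bot(opponent_source: str, own_source: str) -> str:
    """Ignore the opponent's source and always defect."""
    return "D"

def play(agent_a, agent_b):
    # Each agent sees the other's source code before choosing a move.
    src_a, src_b = inspect.getsource(agent_a), inspect.getsource(agent_b)
    return agent_a(src_b, src_a), agent_b(src_a, src_b)

print(play(clique_bot, clique_bot))  # ('C', 'C')
print(play(clique_bot, defect_bot))  # ('D', 'D')
```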
BTW, for historical interest, I found a 2002 post by Hal Finney that came pretty close to some of the ideas behind TDT:
I have a problem with this application of game theory to a situation where A and B both know that they are going to choose the same thing, which I believe is the case here. [...]

[...] They are two instances of the same deterministic calculation, with exactly the same steps being executed for both.

[...] And the best of the two possible outcomes is when both parties cooperate rather than defect.
I responded to Hal and stated my agreement, but neither of us followed it up at the time. I had even forgotten about the post until I found it again yesterday, but I guess it must have influenced my thinking once Eliezer started talking about similar ideas.
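Hal’s argument compresses into a few lines: if A and B are literally the same deterministic calculation, they cannot choose differently, so only the symmetric outcomes are reachable, and mutual cooperation pays more than mutual defection. A toy sketch, using standard textbook Prisoner’s Dilemma payoffs that are not from Hal’s post:

```python
# Identical deterministic procedures must output the same move, so the only
# reachable outcomes are (C, C) and (D, D). Payoff numbers are illustrative.
PAYOFF = {  # (my move, their move) -> my payoff
    ("C", "C"): 3, ("C", "D"): 0,
    ("D", "C"): 5, ("D", "D"): 1,
}

def best_symmetric_move():
    """Compare only the symmetric outcomes available to two copies of the
    same deterministic decision procedure."""
    return max(["C", "D"], key=lambda move: PAYOFF[(move, move)])

print(best_symmetric_move())  # 'C': mutual cooperation beats mutual defection
```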
In any case, I think people listened to Eliezer more because he said things like “I have worked out a mathematical analysis of these confusing problems”, not just “my intuition says the basic assumptions of game theory don’t sound right”.
Personally, I thought he made a good case that the basic assumptions of game theory aren’t right, or rather won’t be right in a future where superintelligent AIs know each other’s source code. I don’t think I would have been particularly interested if he had just said “these non-standard assumptions lead to some cool math”, since I don’t have that much interest in math qua math.
Similarly, I explore other seemingly strange assumptions like the ones in Newcomb’s Problem or Counterfactual Mugging because I think they are abstracted/simplified versions of real problems in FAI design and ethics, designed to isolate and clarify some particular difficulties, not because they are “interesting when taken on its own terms”.
I guess it appears to you that you are working on these problems because they seem like interesting math, or “interesting when taken on its own terms”, but I wonder why you find these particular math problems or assumptions interesting, and not the countless others you could choose instead. Maybe the part of your brain that outputs “interesting” is subconsciously evaluating importance and relevance?
An even more likely explanation is that my mind evaluates reputation gained per unit of effort. Academic math is really crowded; chances are that no one would read my papers anyway. Being in a frustratingly informal field with a lot of pent-up demand for formality allows me to get many people interested in my posts, while my mathematician friends get zero feedback on their publications. Of course it didn’t feel so cynical from the inside; it felt more like a growing interest fueled by constant encouragement from the community. If “Re-formalizing PD” had met with a cold reception, I don’t think I’d be doing this now.
In that case you’re essentially outsourcing your “interestingness” evaluation to the SIAI/LW community, and I think we are basing it mostly on relevance to FAI.
Yeah. Though that doesn’t make me adopt FAI as my own primary motivation, just like enjoying sex doesn’t make me adopt genetic fitness as my primary motivation.
My point is that your advice isn’t appropriate for everyone. People who do care about FAI or other goals besides community approval should think/argue about assumptions. Of course one could overdo that and waste too much time, but they clearly can’t just work on whatever problems seem likely to offer the largest social reward per unit of effort.
Though that doesn’t make me adopt FAI as my own primary motivation
What if we rewarded you for adopting FAI as your primary motivation? :)
That sounds sideways. Wouldn’t that make the reward my primary motivation? =)
No, I mean what if we offered you rewards for changing your terminal goals, so that you’d continue to be motivated by FAI even after the rewards end? You should take that deal if we can offer big enough rewards and your discount rate is high enough, right? (Previous related thread.)
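A toy illustration of the trade being proposed, with entirely made-up numbers: an up-front reward is weighed against the discounted stream of future value forgone by changing one’s terminal goals, and the steeper the discounting (i.e. the smaller the per-period discount factor), the better the deal looks.

```python
# Made-up numbers, purely to illustrate the "big enough reward, high enough
# discount rate" condition. A higher discount rate corresponds to a smaller
# per-period discount factor here.
def discounted_sum(per_period_value: float, factor: float, periods: int) -> float:
    """Present value of a constant stream under exponential discounting."""
    return sum(per_period_value * factor**t for t in range(periods))

reward_now = 100.0             # hypothetical up-front reward
future_loss_per_period = 20.0  # hypothetical value forgone each period later

for factor in (0.99, 0.9, 0.5):
    loss = discounted_sum(future_loss_per_period, factor, periods=30)
    print(factor, reward_now > loss)  # accept iff the reward outweighs the loss
```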
You’re trying to affect the motivation of a decision theory researcher by offering a transaction whose acceptance is itself a tricky decision theory problem?
Upvoted for hilarious metaness.
Now, all we need to do is figure out how humans can modify their own source code and verify those modifications in others...
That could work, but how would that affect my behavior? We don’t seem to have any viable mathematical attacks on FAI-related matters except this one.
I suggest editing the post to include this point.