Now that we have a decent grounding of what Yudkowsky thinks deep knowledge is for, the biggest question is how to find it, and how to know you have found good deep knowledge.
> This is basically the thing that bothered me about the debates. Your solution seems to be to analogize Einstein:relativity::Yudkowsky:alignment, which is basically hopeless. But in the debates, M. Yudkowsky over and over says, “You can’t understand until you’ve done the homework, and I have, and you haven’t, and I can’t tell you what the homework is.” It’s a wall of text that can be reduced to, “Trust me.”
>
> He might be right about alignment, but under the epistemic standards he popularized, if I update in the direction of his view, the strength of the update must be limited to “M. Yudkowsky was right about some of these things in the past and seems pretty smart and to have thought a lot about this stuff, but even Einstein was mistaken about spooky action at a distance, or maybe he was right and we haven’t figured it out yet, but, hey, quantum entanglement seems pretty real.” In many ways, science just is publishing the homework so people can poke holes in it.
>
> If Einstein came to you in 1906 (after general relativity) and stated the conclusion of the special relativity paper, and when you asked him how he knew, he said, “You can’t understand until you’ve done the homework, and I have, and you haven’t,” which is all true from my experience studying the equations, “and I can’t tell you what the homework is,” the strength of your update would be similarly limited.
>
> You might respond that M. Yudkowsky isn’t really trying to convince anyone, but in that case, why debate? He’s at least trying to get people to publish their AI findings less, in order to burn less timeline.
I definitely feel you: that reaction was my big reason for taking so much time rereading his writing and penning this novel-length post.
The first thing I want to add is that when I looked for discussions of this in the Sequences, they were there. So the uncharitable explanation of “he’s hiding the homework/explanation because he knows he’s wrong or doesn’t have enough evidence” doesn’t really work. (I don’t think you’re defending this, but it definitely crossed my mind, and that of others I talked to.) I honestly believe Yudkowsky is saying in good faith that he has found deep knowledge and that he doesn’t know how to share it in a way he hasn’t already tried across his 13 years of writing about it.
The second thing is that I feel my post brings together enough bits of Yudkowsky’s explanations of deep knowledge that we have at least a partial handle on how to check it? Quoting back my conclusion:
> Yudkowsky sees deep knowledge as highly compressed causal explanations of “what sort of hypothesis ends up being right”. The compression means that we can rederive the successful hypotheses and theories from the causal explanation. Finally, such deep knowledge translates into partial constraints on hypothesis space, which focus the search by pointing out what cannot work.
So the check requires us to understand what sort of successful hypotheses he is compressing, whether it is really a compression, that is, a causal underlying process that can be used to rederive these hypotheses, and whether the resulting constraint actually cuts a decent chunk of hypothesis space when applied to other problems.
That’s definitely a lot of work, and I can understand if people don’t want to invest the time. But it seems different to me to have a potential check and say “I don’t think this is a good time investment” than to say that there’s no way to check the deep knowledge at all.
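To make the last part of that check concrete, here is a toy sketch (my own illustration, not anything from Yudkowsky's writing; the "sign-preservation" principle and the candidate rules are invented for the example) of how a partial constraint prunes a hypothesis space without picking a winner:

```python
# Toy illustration: a "deep principle" acting as a partial constraint on a
# hypothesis space. It does not pick the right hypothesis; it only rules
# out families of hypotheses that cannot work.

# A small hypothesis space: candidate rules mapping inputs to outputs.
hypotheses = [
    ("identity", lambda x: x),
    ("doubling", lambda x: 2 * x),
    ("halving",  lambda x: x / 2),
    ("negation", lambda x: -x),
]

def preserves_sign(rule):
    """Stand-in for a deep principle, e.g. a symmetry the answer must respect."""
    return all((rule(x) > 0) == (x > 0) for x in (1, 2, 3, -1, -2))

# Applying the constraint cuts a chunk of the space but leaves several
# candidates standing: it focuses the search rather than ending it.
surviving = [name for name, rule in hypotheses if preserves_sign(rule)]
print(surviving)  # ['identity', 'doubling', 'halving']
```

The point of the sketch is only that a constraint of this shape is checkable: you can ask whether it really rules out the hypotheses that historically failed, and whether it still cuts the space on new problems.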
Lastly,
> If Einstein came to you in 1906 (after general relativity) and stated the conclusion of the special relativity paper, and when you asked him how he knew, he said, “You can’t understand until you’ve done the homework, and I have, and you haven’t,” which is all true from my experience studying the equations, “and I can’t tell you what the homework is,” the strength of your update would be similarly limited.
I recommend reading Einstein’s Speed and Einstein’s Superpowers, the two posts where Yudkowsky tries to show that, if you look for it, it is possible to find where Einstein was coming from and the sort of deep knowledge he used. I agree it would be easier if the person leveraging the deep knowledge could state it succinctly enough for us to get it, but I also acknowledge that these sorts of fundamental principles, from which other things derive, are just plain hard to express. And even then, you need to do the homework.
(My disagreement with Yudkowsky here is that he seems to believe mostly in providing a lot of training data and examples so that people can see the deep knowledge for themselves, whereas I expect that most smart people would find it far easier to have a sort of pointer to the deep knowledge and what it is good for, and then go through a lot of examples).