I have not engaged much with your and Quintin’s recent arguments about how deep learning may change the basic arguments, so I want to acknowledge that I would probably shift my opinion a bunch in some direction if I did. Nonetheless, a few related points:
I do want to say that, on priors, the level of anger and antagonism that appears in most internet comment sections is substantially higher than what happens when the same people meet in person, and I do not suspect a corresponding amount of active antagonism would happen if Nate or Eliezer or John Wentworth went to an ML conference. Perhaps stated more strongly: I think 99% of internet ‘hate’ is performative only.
You write “But it really really doesn’t help if you then look into their technical arguments and reach the conclusion that they don’t know what they are talking about.” I would respect any ML researchers making this claim more if they wrote a thoughtful rebuttal to AGI: A List of Lethalities (or really literally any substantive piece of Eliezer’s on the subject that they cared to: There’s No Fire Alarm, Security Mindset, Rocket Alignment, etc.). I think Eliezer not knowing what he’s talking about would make rebutting him easier. As far as I’m aware, literally zero significant ML researchers have written such a thing: not Dario, not Demis, not Sutskever, not LeCun, nor basically anyone senior in their orgs. Eliezer has thought quite a lot and put forth some quite serious arguments that seemed shockingly prescient to me, and I dunno, it seems maximally inconvenient for all the people earning multi-million-dollar annual salaries in this new field of ML to seriously engage with a good-faith and prescient outsider with thoughtful arguments that their work risks extinction. If they’re dismissing him as “not getting it” yet don’t seriously engage with the arguments or make a positive case for how alignment can be solved, I think I ought to default to thinking of them as not morally serious in their statements. Relatedly, I am pretty deeply disappointed by the speed at which intellectuals like Pinker and Cowen come up with reasons to dismiss and avoid engaging with the arguments when the alternative is to seriously grapple with an extinction-level threat.
I am not compelled by the idea that if you haven’t restated your arguments to fit the new paradigm that’s shown up, then you must be out of the loop and wrong. Rather than “your arguments don’t seem perfectly suited to our new paradigm, look at all of these little holes I’ve found,” I would be far more compelled by “here is a positive proposal for how to align a system that we build, with active reason to suspect it is aligned” or similar. Paul is the only person I know of to propose specific algorithms for how to align systems, and Eliezer has engaged seriously on Paul’s terms and found many holes in the proposal, holes that Paul agreed were there. I expect Eliezer would do the same if anyone working in the major labs put forward such a proposal.
I understand that you and Quintin have criticisms (looking through bits of Quintin’s post it seems interesting, as do your claims here), as do Paul and others who all agree on the basics that this is an extinction-level threat. Still, I think it is more productive for Eliezer to critique positive proposals than to update his arguments identifying the problem, especially when updating would mean defending those arguments from criticism by people who still think the extinction risk from misalignment is at least 5% and thus a top priority for civilization right now. If there were a leading ML practitioner arguing that ML was not an extinction-level threat and who was engaged with Eliezer’s arguments, I would consider it more worthwhile for Eliezer to respond. Meanwhile, I think people working in alignment research should prefer to get on with the work at hand, and that LessWrong is clearly the best forum to get engagement from people who understand what problem is actually being solved (and to find collaborators/funders/etc.).
As far as I’m aware, literally zero significant ML researchers have written such a thing: not Dario, not Demis, not Sutskever, not LeCun, nor basically anyone senior in their orgs.
I just want to point out that this seems like a ridiculous standard. Quintin’s recent critique is not that dissimilar to the one I would write (and I have already spent some time trying to point out the various flaws in the EY/MIRI world model), and I expect that you would get many of the same objections if you elicited a number of thoughtful DL researchers. But few if any have been motivated to write one; what’s the point?
Here’s my critique in simplified form: the mainstream AI futurists (Moravec, Kurzweil, etc.) predicted that AGI would be brain-like and thus close to a virtual brain emulation. Thus they were not so concerned about doom, because brain-like AGI seems like a more natural extension of humanity (Moravec’s book is named ‘Mind Children’ for a reason), and an easier transition to manage.
In most ways that matter, Moravec/Kurzweil were correct, and EY was wrong. That really shouldn’t even be up for debate at this point. The approach that worked, DL, is essentially reverse engineering the brain. This is in part because the successful techniques all either ended up being directly inspired by neuroscience and the now proven universal learning & scaling hypotheses[1] (deep and/or recurrent ANNs in general, sparse coding, normalization, ReLUs, etc.) or indirectly recapitulated neural circuitry (transformer ‘attention’ equivalence to fast weight memory, etc.).
But in even simpler form: if you take a first, already trained NN A and run it on a bunch of data and capture all its outputs, then train a second NN B on that input/output dataset, the result is that B becomes a distilled copy of A, a distillation.
This is in fact how we train large-scale AI systems. They are trained on human thoughts.
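As a concrete illustration, here is a minimal sketch of that distillation setup, assuming toy teacher/student networks and PyTorch; the architectures, data, and hyperparameters are invented for illustration and are not anything used in practice.

```python
# Minimal distillation sketch: train B to imitate the input/output behavior of A.
# The networks, data, and hyperparameters here are illustrative assumptions.
import torch
import torch.nn as nn
import torch.nn.functional as F

teacher_A = nn.Sequential(nn.Linear(32, 64), nn.ReLU(), nn.Linear(64, 10))  # stands in for an already-trained net
student_B = nn.Sequential(nn.Linear(32, 64), nn.ReLU(), nn.Linear(64, 10))
optimizer = torch.optim.Adam(student_B.parameters(), lr=1e-3)

for step in range(1000):
    x = torch.randn(128, 32)                        # "run it on a bunch of data"
    with torch.no_grad():
        targets = F.softmax(teacher_A(x), dim=-1)   # "capture all its outputs"
    # Train B to reproduce A's output distribution via KL divergence.
    loss = F.kl_div(F.log_softmax(student_B(x), dim=-1), targets, reduction="batchmean")
    optimizer.zero_grad()
    loss.backward()
    optimizer.step()
# After enough steps, B approximates A's input/output mapping: a distilled copy.
```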
The universal learning hypothesis is that the brain (and thus DL) uses simple universal learning algorithms and that all circuit content is learned automatically, which leads to the scaling hypothesis: intelligence comes from scaling up simple architectures and learning algorithms with massive compute, not from continually and explicitly “rewriting your source code” a la EY’s model.
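To show what the scaling-hypothesis claim cashes out to quantitatively, here is a small sketch of loss falling as a smooth power law of scale under a fixed, simple training algorithm; the constants and compute range are assumptions chosen only to illustrate the functional form, not measured values.

```python
# Illustrative (not measured) power-law scaling curve: L(C) = a * C^(-b).
# All numbers here are assumptions chosen only to show the functional form.
import numpy as np

compute = np.logspace(18, 24, 7)   # hypothetical training compute in FLOPs
a, b = 25.0, 0.05                  # assumed prefactor and exponent
loss = a * compute ** (-b)

# The scaling-hypothesis picture: performance improves predictably with scale,
# recoverable as a straight-line fit in log-log space, with no change to the
# underlying learning algorithm.
slope, intercept = np.polyfit(np.log(compute), np.log(loss), 1)
print(f"fitted exponent b = {-slope:.3f}, prefactor a = {np.exp(intercept):.1f}")
```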