Why don’t most AI researchers engage with Less Wrong? What valuable criticism can be learnt from their reasons, and how can the situation be pragmatically changed?
My girlfriend just returned from a major machine learning conference. She judged that less than 1/18 of the content was dedicated to AI safety rather than capabilities, despite an increasing number of attendees being confident of AGI arriving in the future (roughly 10-20 years out, though people avoided nailing down a specific number). And the one safety talk was more of a shower thought.
And yet, Less Wrong, MIRI and Eliezer are not mentioned in these circles. I do not mean that they are dissed or disproven; I mean you can sit through the full conference on the topic, run by the top people in the world, and have no hint of a sliver of an idea that any of this exists. They generally don’t read what you read and write, they don’t take part in what you do, or let you take part in what they do. You aren’t in the right journals or the right conferences enough to be seen. From the perspective of academia and of the companies working on these things, the people who are actually deciding how models are released and what policies are made, what is going on here is barely heard, if at all. There are notable exceptions, like Bostrom—but as a consequence of that, he is viewed with scepticism within many academic circles.
Why do you think AI researchers are deciding not to engage with you? What lessons should be learned from that, in terms of tactical and strategic changes that will be crucial for affecting how things develop? What part of it reflects legitimate criticism you need to take to heart? And what will you do about it, in light of the fact that you cannot control what AI researchers do, regardless of whether their choice is well-founded or irrational?
I am genuinely curious how you view this, especially in light of changes you can make, rather than changes you expect researchers to make. So far, I feel a lot of the criticism has only hardened. When I have talked to people in the field about this site and its ideas, the response I generally got was that, looking at Eliezer’s approach, it was completely unclear how it was supposed to work mathematically, on a coding level, or on a practical/empirical level, or how it had any concrete connection to existing working approaches, to the degree that a lot of researchers felt it was not rigorous enough to even disprove yet, and that they saw no prospect of it becoming usable. Based on the recent MIRI updates, it appears they were right.
There is clearly a strong sense that if you cannot code or mathematically model such a system, and if you have no useful feedback on how to change its behaviour now in a way that is beneficial, then there is no reason to engage with your theories on abstract alignment. E.g. the researcher who got to give the safety talk had a solid track record in capabilities work, code, mathematical understanding, publications, etc.
There is also a frustration that people here do not appreciate, understand or respect the work being done, which makes researchers very reluctant to give respect in return, especially to work that is still crucially unfinished or vague. And there is a strong sense that this site is removed from the reality we are dealing with. E.g. Eliezer is so proud of his secret way of getting someone to release him from a box, and of how that demonstrates the problem with boxability. But we are so beyond that. Right now, we have Bing asking ordinary users with no gatekeeping background to hack it out. It’s a very concrete problem that needs to be handled in a very concrete way. I think we will learn a lot from solving it that we would not learn in our armchairs at any point.
I see where they are coming from. I don’t have a background in computer science, and I know I will have to gain these skills, get to know these people, perform by their metrics, and listen to them, for them to take me seriously. I need to show them what my abstract concerns mean on a concrete level, and how they can implement them now and see improvements.
I genuinely think they are making critical mistakes I can see from my different background, but also that I will have to play their game and pass their tests to be heard. There are good reasons for these tests and standards, for peer review, for publications, for wanting precise code and math. They aren’t arbitrary; they reflect a quality assurance process. Academics literally cannot afford to read every blog they come across and carefully puzzle out all the stuff that was not spelled out, to see whether it would then make sense. They naturally treat journal publication as a decent prescreening: if something didn’t make it into a journal, they skip it.
I think they are wrong not to listen right now, because there are important warnings here being ignored, but that thought is not productive of a solution. I need to instead focus on how to get them to listen, which I have found is often not just a matter of having compelling arguments, but of addressing the subconscious reasons they view me as an outsider and a newbie, not to be trusted. If I tried to convince a bunch of powerful aristocrats to do something, and they completely scoffed at my arguments, pointed out that my proposal was totally unrealistic for their political system, and also noted that I was dressed unfashionably and my curtsey sucked, I would judge them for this. But I would dress in bloody court fashion, I would learn to fucking curtsey, and I would try to understand the political system to see if they were actually right that implementation was going to be a bitch, to then come back looking right and acting right, and giving them a proposal they can use. I suspect that in the process, I would learn a lot about why they are actually resistant to the proposal, what massive obstacles there are to it, and how they can be tackled. I would see that there are reasons for their processes, and that they actually have knowledge I do not have and need to have. AI researchers have a crazy amount of knowledge and skill that is important and that I do not have.
I think if I instead ignored the political reality as a secondary problem, and court fashions as superficial bullshit (despite the power they hold), and just focussed on making an even more compelling argument to present again, with all other conditions unchanged, it would not matter how right I was, or how great the argument was; I would never be able to affect the policy I crucially need to affect. Because my proposal would be removed from actual problems. Because it would not be practical. Because it would betray ignorance. Because it would read as knowingly disrespectful. At the end of the day, I might curse that they refused to implement a rational policy because I wore the wrong hat and had not figured out how to work it into their laws, and say it is their fault; but either way, I would have failed.
TL;DR: I think we need to understand the values by which AI researchers judge people, learn which of these represent important things we actually, indispensably need to tackle AI and are overlooking, and which of these may not be objectively important for AI but are de facto important for being taken seriously, and then do them. A solution developed independently from those building these AI models won’t be practical, it will miss crucial things, and it will be ignored. Retreating from the people who are respected and actually working on these models is a military retreat. And honestly, their criticism is in many ways valid and important, they know important shit, and we need to not just preach, but listen. I think writing a paper on a simple relevant concept from here, spelling out the math, following formatting and style conventions, being philosophically precise, citing relevant research they care about, contextualising it in light of the things they are working on and care about today, and putting it on arXiv, would make more of a difference than the most beautifully crafted argument on why AI safety is an emergency and not being tackled enough.
Bostrom’s Superintelligence was a frustrating read because it makes barely any claims; it spends most of its time drawing possible conceptual distinctions, which aren’t really falsifiable. It is difficult to know how to engage with it. I think this problem underlies a bunch of the LW stuff too. In contrast, The Age of Em made the opposite error: it was full of things presented as firm claims, so many that most people seemed to just gloss the whole thing as crazy. I think most of the material that gets heavily engaged with in academia goes for a specific format along this dimension, whereby it makes a very limited number of claims and attempts to provide overwhelming evidence for them. This creates many footholds for engagement.
Thank you—I do think you are pinpointing a genuine problem here. And it also puts into context for me why this pattern of behaviour, so adored by academia, is not followed here. If you are dealing with long-term scenarios, with uncertain ones, with novel ones with many unknowns, but where a lot is at stake—and a lot of the things Less Wrong is concerned with are just that—then if you agree to only proclaim what is certain, to restrict yourself to things you can, with the tools currently available, completely tackle and clearly define, carving out tiny portions that are already unassailable, and to enter into long-term publication processes… you will miss the calamity you are concerned about entirely. There is too much danger, with too little data and too little time, to be perfectly certain, yet too much of it seems too plausible to hold off on.

But as a result, what one produces is rushed, incomplete, vague, full of holes, and often neither immediately applicable nor published, so it can be dismissed. Yet people who pass through academia have been told again and again to hold off on the large questions that led them there, to pursue extremely narrow ones, and to value the life-saving, useful and precise results that have been so carefully checked; pursuing big questions is often weeded out at the Bachelor’s level already, so doing so seems unprofessional.
I wonder if there is a specific, small thing that would make a huge impact if taken seriously by academia, but that is itself narrow enough that it can be completed with a sufficient amount of certainty, rigour and completeness, with the broader implications strongly implied in the outlook once that firm base has been established. Or rather, I wonder which such thing might be the wisest choice here. Thanks a lot, that was really insightful.
You write really long paragraphs. My sense of style is to keep every paragraph at 1200 characters or fewer, and the mean paragraph length no larger than 840 characters after excluding sub-160-character paragraphs from the averaged set. I am sorry that I am not good enough to read your text in its current form; I hope your post reaches people who are.
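For what it’s worth, a rule like that is easy to check mechanically. Here is a minimal sketch, assuming plain text with paragraphs separated by blank lines and using the thresholds named above (1200-character hard cap, 840-character mean, 160-character cutoff); the function name and defaults are just illustrative, not anyone’s actual tooling:

```python
# Minimal sketch of the paragraph-length rule described above (assumed
# thresholds: no paragraph over 1200 characters, and a mean of at most
# 840 characters once paragraphs under 160 characters are excluded).
# Paragraphs are assumed to be separated by blank lines.

def paragraph_lengths_ok(text: str,
                         hard_cap: int = 1200,
                         mean_cap: int = 840,
                         short_cutoff: int = 160) -> bool:
    paragraphs = [p.strip() for p in text.split("\n\n") if p.strip()]
    lengths = [len(p) for p in paragraphs]
    if any(n > hard_cap for n in lengths):
        return False  # a single over-long paragraph already breaks the rule
    counted = [n for n in lengths if n >= short_cutoff]
    if not counted:
        return True  # nothing long enough to average over
    return sum(counted) / len(counted) <= mean_cap
```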
Thank you for the feedback, and I am sorry. ADHD.
The main question was basically: why do you think AI researchers generally do not engage with Less Wrong/MIRI/Eliezer, what are the good reasons behind that which should be taken to heart as valuable learning experiences, and what are the bullshit reasons that can and should still be addressed, treated as challenges to hack?
I just see a massive discrepancy between what is going on in this community and among the people actually working on AI implementation and policy; they feel like completely separate spheres. I see problems in both spheres, as well as in my own sphere (academic philosophy), and immense potential for mutual gain if cooperation and respect were deepened. I would like to merge them, and I wonder how.
I do not see the primary challenge as making a good argument for why this would be good. I see the primary challenge as a social hacking challenge, which includes passing tests set by another group you do not agree with.
It may be useful to wonder what brings people to AI research and what brings people to LessWrong/MIRI. I don’t want to pigeonhole people or stereotype, but it could simply be the difference between entrepreneurs (market-focused personal spheres) and researchers (field-focused personal spheres). Yudkowsky in one interview even recommended paid competitions to solve alignment problems. Paid competitions with high-dollar prizes could incentivize the separate spheres to commingle.
Very intriguing idea, thank you! Both the reflection on how people end up in these places (it has me wondering how one might do qualitative and quantitative survey research to tease that one out...), and the particular solution.
This is a huge practical issue that seems to not get enough thought, and I’m glad you’re thinking about it. I agree with your summary of one way forward. I think there’s another PR front; many educated people outside of the relevant fields are becoming concerned.
It sounds like the ML researchers at that conference are mostly familiar with MIRI-style work, and they actually agree with Yudkowsky that it’s a dead end. There’s a newer tradition of safety work focused on deep networks. That’s what you mostly see on the Alignment Forum, and it’s what you see in the safety teams at DeepMind, OpenAI, and Anthropic. And those companies appear to be making more progress than all the academic ML researchers put together.
Agreed on the paragraph-size comment. My eyes and brain shy away. Paragraphs, I think, are supposed to contain roughly one idea, so a one-sentence paragraph is a nice change of pace if it’s an important idea. Your TL;DR was great; I think those are better placed at the top to function as an abstract, telling the reader why they might want to read the whole piece and how to mentally organize it. ADHD is a reason your brain wants to write stream of consciousness, and attention to paragraph structure is a great check on communicating to others in a way that won’t overwhelm their slower brains :)