I don’t think the work done by such researchers is the main problem: the main problem is that once a very large proportion of the field is doing fake!alignment, new people coming to work on AIS may disproportionately be introduced to fake!alignment.
We might reasonably hope that the wisest and smartest new people will see fake!alignment for what it is, and work on the real problems regardless. However, I expect there are many people with the potential to do positive work who would do so if exposed to [MIRI], but not if exposed to [MIRI + 10,000 fake!alignmenters]. [EDIT: here I don’t mean to imply that all non-MIRI researchers are doing fake!alignment; this is just a hypothetical comparison for illustration where you’re free to pick your own criteria for fake!alignment]
This isn’t obviously inevitable, but it does seem the default outcome.
Valid point, though I’m not sure the original post mentioned that.
Counterpoint: would that actually change the absolute number of real!alignment researchers? If the probability that a given inductee does real!alignment goes down, but the number of inductees goes way up and timelines get longer, the absolute number could still rise, making it a net-positive intervention.
That’s true given a fixed proportion of high-potential researchers amongst inductees—but I wouldn’t expect that. The more we go out and recruit people who’re disproportionately unlikely to understand the true nature of the problem (i.e. likely candidates for “worse than doing nothing”), the more the proportion of high-potential inductees drops. [also I don’t think there’s much “timelines get longer” here]
Obviously it’s far from clear how it’d work out in practice; this may only be an issue with taking the most naïve approach. I do think it’s worth worrying about—particularly given that there aren’t clean takebacks.
I don’t mean to argue against expanding the field—but I do think it’s important to put a lot of thought into how best to do it.