I’ll answer this descriptively.
When I trace the dependencies of common alignment beliefs and claims, many of them trace back to e.g. RFLO (Risks from Learned Optimization) and other ideas put forward by the MIRI cluster. Since I often find myself arguing against common alignment claims, I end up arguing against the historical sources of those ideas, which means arguing against MIRI takes.
I’m personally satisfied that their concerns are (generally) not worth worrying about. However, people in my social circles often are not, and such beliefs will probably have real-world consequences for governance.
Neargroup—I have a few friends who work at MIRI, and debate them on alignment ideas pretty often. I also sometimes work near MIRI people.
Because I disagree with them so sharply, their claims bother me more and feel more salient.
It bothers me that MIRI still (AFAICT) gets so much funding/attention (even if relatively less than it used to), because it seems to me that since ~2016 they have released ~zero technical research that helps us align AI, now or in the future. It’s been five years since they stopped disclosing their research, and it seems like no one else really cares anymore.
As to why I haven’t responded to e.g. your concerns in detail:
I currently don’t put much value on marginal theoretical research (even in shard theory, which I think is quite a bit better than other kinds of theory).
I feel less hopeful about LessWrong debate doing much, as I have described elsewhere. It feels like a better use of my time to put my head down, read a bunch of papers, and do good empirical work at GDM.
I am generally worn out from arguing about theory on the website, and have been since last December. (I will note that I have enjoyed our interactions and appreciated your contributions.)
It sounds like, to the extent that you do have time/energy for theory, you might want to strategically reallocate your attention a bit? I get that you think a bunch of people are wrong and you’re worried about the consequences of that, but diminishing returns is a thing, and you could be too certain yourself (that MIRI concerns are definitely wrong).
On empirical versus theory: how much do you worry about architectural changes obsoleting your empirical work? I noticed, for example, that in image generation GANs were recently largely replaced by latent diffusion models, which probably made a lot of the effort to “control” GAN-based image generation useless.
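To make the worry concrete, here is a minimal toy sketch of the kind of GAN "control" work I have in mind: steering outputs by nudging the latent code along a learned direction. The generator and attribute direction below are placeholders I made up for illustration, not any specific published method. The point is that the technique lives entirely in the z-to-image interface, which a diffusion sampler does not expose in the same way.

```python
import torch
import torch.nn as nn

class ToyGenerator(nn.Module):
    """Stand-in for a GAN generator: maps a latent vector z to a flat 'image'."""
    def __init__(self, z_dim: int = 128, img_pixels: int = 64 * 64 * 3):
        super().__init__()
        self.net = nn.Sequential(
            nn.Linear(z_dim, 512),
            nn.ReLU(),
            nn.Linear(512, img_pixels),
            nn.Tanh(),
        )

    def forward(self, z: torch.Tensor) -> torch.Tensor:
        return self.net(z)

g = ToyGenerator()
z = torch.randn(1, 128)               # one latent code
direction = torch.randn(1, 128)       # a hypothetical learned "attribute" direction
direction = direction / direction.norm()

baseline = g(z)                       # original sample
steered = g(z + 2.0 * direction)      # "controlled" sample: edit z, regenerate

# A latent diffusion model generates by iteratively denoising, so there is no
# single z to nudge like this; control methods built on the z -> image
# interface do not transfer directly to the new architecture.
```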
That aside, “heads-down empirical work” only makes sense if you picked a good general direction before putting your head down. Should it not worry people that shard theory researchers do not seem to have engaged with (or, better yet, preemptively addressed) basic concerns/objections about their approach?