In terms of proposing and discussing AI Alignment strategies, I feel like a few individuals have been dominating the LessWrong conversation recently.
I’ve seen a whole lot from John Wentworth and the Redwood team.
After that, it seems to get messier.
There are several individuals or small groups with their own distinctive takes: Matthew Barnett, Davidad, Jesse Hoogland, etc. I think these groups often work on very singular visions that few others have much buy-in on.
Groups like the DeepMind and Anthropic safety teams seem hesitant to write much about or discuss big-picture strategy. My impression is that specific researchers are typically working on fairly narrow agendas, and that the leaders of these orgs don’t have the most coherent strategies. One big problem is that it’s very difficult to be honest and interesting about big-picture AI strategy without saying things that would be bad for a major organization to say.
Most policy people seem focused on policy details. The funders (OP?) seem pretty quiet.
I think there are occasionally neat papers or posts that come from AI policy researchers or groups like Convergence research. But these also don’t seem to be a big part of the conversation I see—the authors are pretty segmented, and other LessWrong readers and AI safety people don’t pay much attention to their work.
There are a lot of possible plans which I can imagine some group feasibly having which would meet one of the following criteria:
- contains critical elements which are illegal
- contains critical elements which depend on an element of surprise / misdirection
- benefits from the actor being the first mover on the plan (others can copy the strategy, but can’t lead)
If one of these criteria (or something similar) applies to the plan, then you can’t discuss it openly without sabotaging it. Making strategic plans with all your cards laid out on the table (while others hide theirs) makes things substantially harder.
I partially agree, but I think this must only be a small part of the issue.
- I think there’s a whole lot of key insights people could raise that aren’t info-hazards.
- If secrecy were the main factor, I’d hope that there would be some access-controlled message boards or similar; I’d want the discussion to be intentionally happening somewhere. Right now I don’t really think that’s happening. I think a lot of tiny groups have their own personal ideas, but there’s surprisingly little systematic and private thinking among the power players.
- I think that secrecy is often an excuse not to open ideas to feedback, and thus not be open to critique. Often, from what I see, this goes hand-in-hand with “our work just really isn’t that great, but we don’t want to admit it.”
In the last 8 years or so, I’ve kept hoping there would be some secret and brilliant “master plan” around EA that would explain the lack of public strategy. I have yet to find one. The closest I know of is some long-running discussion and Slack threads with people at Constellation and similar—I think these are interesting for understanding the perspectives of these (powerful) people, but I don’t get the impression that there’s much comprehensive or brilliant strategy being hidden.
That said,
- I think that policy orgs need to be very secretive, so I agree with you about why those orgs don’t write more big-picture things.
I don’t think you intended this implication, but I initially read “have been dominating” as negative-valenced!
Just want to say I’ve been really impressed by, and appreciative of, the amount of public posts/discussion from those folks. It’s encouraged me to do more of my own engagement, because I’ve realized how helpful their comments/posts are to me (and so maybe mine are likewise for some folks).
Correct, that wasn’t my intended point. Thanks for clarifying, I’ll try to be more careful in the future.