and Ought either builds AGI or strongly influences the organization that builds AGI.
“strongly influences the organization that builds AGI” applies to all alignment research initiatives right? Alignment researchers at e.g. DeepMind have less of an uphill battle but they still have to convince the rest of DeepMind to adopt their work.
Yes, I didn’t mean to imply this was necessarily an Ought-specific problem, and it may have been a bit unfair of me to only do a BOTEC on Ought. I included it because I had the most fleshed-out thoughts on it, but that could give the wrong impression about relative promise when others don’t have BOTECs. Also, people (not implying you!) often take my BOTECs too seriously; they’re done in this spirit.
That being said, I agree that strong within-organization influence feels more likely than across; not sure to what extent.
I would expect that the way Ought (or any other alignment team) influences the AGI-building org is by influencing the alignment team within that org, which would in turn try to influence the leadership of the org. I think the latter step in this chain is the bottleneck—across-organization influence between alignment teams is easier than within-organization influence. So if we estimate that Ought can influence other alignment teams with 50% probability, and the DM / OpenAI / etc alignment team can influence the corresponding org with 20% probability, then the overall probability of Ought influencing the org that builds AGI is 10%. Your estimate of 1% seems too low to me unless you are a lot more pessimistic about alignment researchers influencing their organization from the inside.
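To make the arithmetic in that chain explicit (a minimal sketch, assuming the two steps are roughly independent and that influence on the AGI-building org runs only through its internal alignment team):

$$
P(\text{Ought influences the AGI org}) \approx \underbrace{P(\text{Ought influences an alignment team})}_{\approx 0.5} \times \underbrace{P(\text{that team influences its org})}_{\approx 0.2} = 0.1
$$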
Good point, and you definitely have more expertise on the subject than I do. I think my updated view is ~5% on this step (Ought strongly influencing the organization that builds AGI).
I might be underconfident about my pessimism on the first step (competitiveness of process-based systems) though. Overall I’ve updated to be slightly more optimistic about this route to impact.
All good, thanks for clarifying.