“strongly influences the organization that builds AGI” applies to all alignment research initiatives, right? Alignment researchers at e.g. DeepMind have less of an uphill battle, but they still have to convince the rest of DeepMind to adopt their work.
Yes, I didn’t mean to imply this is necessarily an Ought-specific problem, and I guess it may have been a bit unfair of me to only do a BOTEC on Ought. I included it because I had the most fleshed-out thoughts on it, but it could give the wrong impression about relative promise when others don’t have BOTECs. Also, people (not implying you!) often take my BOTECs too seriously; they’re done in this spirit.
That being said, I agree that strong within-organization influence feels more likely than across; not sure to what extent.
I would expect that the way Ought (or any other alignment team) influences the AGI-building org is by influencing the alignment team within that org, which would in turn try to influence the leadership of the org. I think the latter step in this chain is the bottleneck—across-organization influence between alignment teams is easier than within-organization influence. So if we estimate that Ought can influence other alignment teams with 50% probability, and the DM / OpenAI / etc alignment team can influence the corresponding org with 20% probability, then the overall probability of Ought influencing the org that builds AGI is 10%. Your estimate of 1% seems too low to me unless you are a lot more pessimistic about alignment researchers influencing their organization from the inside.
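The chained estimate above can be sketched explicitly. The two probabilities below are the illustrative numbers from this comment, not established figures, and the calculation assumes the two steps are independent:

```python
# Hedged sketch of the BOTEC chain described above.
# Both probabilities are illustrative estimates, not measured values.
p_ought_influences_alignment_teams = 0.50  # Ought influences alignment teams at other orgs
p_team_influences_own_org = 0.20           # an internal alignment team influences its org's leadership

# Under an independence assumption, the chained probability is the product.
p_ought_influences_agi_org = (
    p_ought_influences_alignment_teams * p_team_influences_own_org
)
print(p_ought_influences_agi_org)  # 0.1
```

On these numbers, the overall probability is 10%, which is the figure quoted above; the contested quantity is the 20% estimate for the within-organization step.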
Good point, and you definitely have more expertise on the subject than I do. I think my updated view is ~5% on this step.
I might be underconfident about my pessimism on the first step (competitiveness of process-based systems) though. Overall I’ve updated to be slightly more optimistic about this route to impact.
All good, thanks for clarifying.