Thanks again for writing this. A few thoughts:

I think that the release of GPT-3 and the OpenAI API led to significantly increased focus and somewhat of a competitive spirit around large language models… I don’t think OpenAI predicted this in advance, and believe that it would have been challenging, but not impossible, to foresee this.
Do you believe any general lessons have been learned from this? Specifically, it seems a highly negative pattern if [we can’t predict concretely how this is likely to go badly] translates to [we don’t see any reason not to go ahead].
I note that there’s an asymmetry here: [states of the world we like] are a small target. To the extent that we can’t predict the impact of a large-scale change, we should bet on negative impact.
OpenAI leadership tend to put more likelihood on slow takeoff, are more optimistic about the possibility of solving alignment, especially via empirical methods that rely on capabilities, and are more concerned about bad actors developing and misusing AGI...
Questions:
If we’re in a scenario with [slow takeoff], [alignment is fairly easy], and [empirical, capabilities-reliant approaches work well], wouldn’t we expect alignment to get solved by default without OpenAI? Why is this the scenario to focus on?
If the concern is with bad actors, and OpenAI is serious about avoiding race conditions, why not enact the merge-and-assist clause now? Why not join forces with DeepMind now? Would this be negative? If so, why? Would it simply be impractical? Then on what basis would we expect it to be practical when it matters?
OpenAI’s particular research directions are driven in large part by researchers
If an organisation’s research efforts are largely driven by researchers, then the key question becomes: what is the organisation doing to ensure the creation/selection of the right kinds of alignment researchers? (the actually-likely-to-help-solve-the-problem kind)
If there were talented enough researchers who wanted to lead new alignment efforts at OpenAI, I would expect them to be enthusiastically welcomed by OpenAI leadership.
To the extent that the alignment problem is hard this seems negative: it’s possible to be extremely talented, yet heading in a predictably wrong direction—orthogonality isn’t just for AIs. In order to be net-useful, an organisation would need to select strongly for [is likely to head in an effective direction].
DM has to deal with Alphabet management, who are significantly less alignment-aware than DM or OAI leadership. Merging wouldn’t solve the race dynamics and would make ownership/leadership issues worse.
Sure, that makes sense to me. I suppose my main point is: why would we expect this to be different in the future? (Perhaps there are reasons to think things would be different, but I’ve heard no argument to this effect.)