I don’t think I’m concerned by moving up a level in abstraction. For one, I don’t expect any specific developer to suddenly get access to 5-9 OOMs more compute than any previous developer. For another, it seems clear that we’d want the AIs being built to be misaligned with whatever “values” correspond to the outer selection signals associated with the outer optimizer in question (i.e., “the people doing the best on benchmarks will get their approaches copied, get more funding, etc.”). An AI aligned to, like, impressing its developers? doing well on benchmarks? getting more funding? becoming the best architecture it can be? IDK exactly what, but whatever it is, it would probably be bad.
So, I don’t see a reason to expect either a sudden capabilities jump (Edit: deriving from the same mechanism as the human sharp left turn), or (undesirable) misalignment.
I wrote:
(2) Is it plausible that one group’s training run will have importantly new and different capabilities from the best relevant previous one? (I say yes—consider grokking, or algorithmic improvements, or autonomous learning per my other comment.)
And then you wrote:
I don’t expect any specific developer to suddenly get access to 5-9 OOMs more compute than any previous developer.
Isn’t that kinda a strawman? I can imagine a lot of scenarios where a training run results in a qualitatively better trained model than any that came before—I mentioned three of them—and I think “5-9 OOMs more compute than any previous developer” is a much, much less plausible scenario than any of the three I mentioned.
This post mainly argues that evolution does not provide evidence for the sharp left turn. Sudden capabilities jumps from other sources, such as the ones you mention, are more likely, IMO. My first reply to your comment argues that the mechanisms behind the human sharp left turn wrt evolution probably still won’t arise in AI development, even if you go up an abstraction level. One of those mechanisms is a 5-9 OOM jump in usable optimization power, which I think is unlikely.
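(For scale, here is a minimal, purely illustrative sketch of what a 5-9 OOM jump means multiplicatively. It is just arithmetic against an arbitrary baseline of 1 unit, not an estimate of any developer’s actual compute or optimization power.)

```python
# Illustrative only: the multiplicative meaning of a 5-9 order-of-magnitude
# (OOM) jump in usable optimization power, relative to an arbitrary baseline
# of 1 unit. No real-world compute figures are assumed.
baseline = 1.0
for ooms in range(5, 10):
    jump = baseline * 10 ** ooms
    print(f"{ooms} OOMs above baseline -> {jump:,.0f}x the baseline")
```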