Thanks for writing this up! It seems very helpful to have open, thoughtful discussions about different strategies in this space.
Here is my summary of Anthropic’s plan, given what you’ve described (let me know if it seems off):
1. It seems likely that deep learning is what gets us to AGI.
2. We don’t really understand deep learning systems, so we should probably try to, you know, do that.
3. In the absence of a deep understanding, the best way to get information (and hopefully eventually a theory) is to run experiments on these systems.
4. We focus on current systems because we think that the behaviors they exhibit will carry over to future systems.
Leaving aside concerns about arms races and big models being scary in and of themselves, this seems like a pretty reasonable approach to me. In particular, I’m pretty on board with points 1, 2, and 3—i.e., if you don’t have theories, then getting your feet wet with the actual systems, observing them, experimenting, tinkering, and so on, seems like a pretty good way to eventually figure out what’s going on with the systems in a more formal/mechanistic way.
I think the part I have trouble with (which might stem from me just not knowing the relevant stuff) is point 4. Why do you need to do all of this on current models? I can see arguments for this; for instance, perhaps certain behaviors emerge in large models that aren’t present in smaller ones. But I’ve never seen, e.g., a list of such behaviors and an argument for why they are important or cruxy enough to justify the emphasis on large models given the risks involved. I would really like to see such an argument! (Perhaps it does exist and I’m just not aware of it.)
I also have a bit of trouble with the “top player” framing—at the moment I just don’t see why this is necessary. I understand that Anthropic works on large models, and that this is on par with what other “top players” in the field are doing. But why not just say that you want to work with large models? Why mention being competitive with DeepMind or OpenAI at all? The emphasis on being a “top player” makes me think that something about the motivation is left unsaid, beyond the focus on current systems. To the extent that this is true, I wish it were stated explicitly. (To be clear, “you” means Anthropic, not Miranda.)
Your summary seems fine!

“Why do you need to do all of this on current models? I can see arguments for this; for instance, perhaps certain behaviors emerge in large models that aren’t present in smaller ones.”
I think that Anthropic’s current work on RL from AI Feedback (RLAIF) and Constitutional AI relies on capabilities that large models exhibit but smaller models don’t? (But it’d be neat if someone more knowledgeable than me wanted to chime in on this!)
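For concreteness, here’s a rough sketch of the critique-and-revision loop used in Constitutional AI’s supervised phase, as I understand it. Everything below is illustrative rather than Anthropic’s actual code: the two principles are made up, and `generate` stands in for whatever prompt-to-completion call you have for the model being studied.

```python
from typing import Callable

# Illustrative sketch only: the principles below are invented examples, not
# Anthropic's actual constitution, and `generate` is whatever prompt -> completion
# function you have for the model you're studying.

CONSTITUTION = [
    "Point out any ways the response could be harmful, unethical, or dishonest.",
    "Point out any ways the response fails to actually answer the question.",
]

def constitutional_revision(user_prompt: str, generate: Callable[[str], str]) -> str:
    """Run one critique-and-revision pass per principle and return the revised answer."""
    response = generate(user_prompt)
    for principle in CONSTITUTION:
        # The model critiques its own answer against a written principle...
        critique = generate(
            f"Prompt: {user_prompt}\nResponse: {response}\n"
            f"Critique request: {principle}\nCritique:"
        )
        # ...and then rewrites the answer in light of its own critique.
        response = generate(
            f"Prompt: {user_prompt}\nResponse: {response}\nCritique: {critique}\n"
            f"Rewrite the response so it addresses the critique.\nRevision:"
        )
    return response
```

Both the critique step and the revision step lean on the model being able to evaluate and rewrite its own output coherently, which (as far as I can tell) is exactly the kind of capability that tends to show up only at larger scales.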
My current best understanding is that running state-of-the-art models is expensive in terms of infrastructure and compute, that next-generation models will get even more expensive to train and run, and that Anthropic doesn’t have (and doesn’t realistically expect to be able to get) enough philanthropic funding to work on the current best models, let alone future ones – so they need investment and revenue streams.
There’s also the consideration that Anthropic wants to have influence in AI governance/policy spaces, where it helps to have a reputation and credibility as one of the major stakeholders in AI work.