Basically because the strengths of LLMs can be leveraged to do most of the work in each case.
OpenAI’s o1 shows how straightforward adding reasoning turned out to be once LLMs reached roughly GPT-4-level capability.
Reflection is merely allowing the system to apply its intelligence to its own cognition; it’s questionable whether that’s worth categorizing as a different ability at all.
Autonomy already exists: existing LLMs do effectively choose subgoals. They need to get better at it to be useful or dangerous, but the basic capability is already there.
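To make “reflection” and “subgoal selection” concrete, here’s a minimal sketch of the kind of loop I mean. The llm() helper, the prompts, and the stopping condition are stand-ins invented for this comment, not anyone’s actual implementation.

```python
# Purely illustrative sketch of "reflection" and "subgoal selection" in a
# language model agent. `llm` is a hypothetical stand-in for any
# chat-completion call; the prompts and stop conditions are made up here.

def llm(prompt: str) -> str:
    """Placeholder for a call to some language model (assumed, not implemented)."""
    raise NotImplementedError

def run_agent(goal: str, max_steps: int = 5) -> list[str]:
    notes: list[str] = []  # running record the agent can look back over
    for _ in range(max_steps):
        # Autonomy: the model itself proposes the next subgoal.
        subgoal = llm(
            f"Goal: {goal}\nProgress so far: {notes}\n"
            "Propose the single most useful next subgoal."
        )
        attempt = llm(f"Work on this subgoal and report the result: {subgoal}")

        # Reflection: the same model critiques its own output...
        critique = llm(f"Point out errors or gaps in this attempt:\n{attempt}")
        # ...and revises it using that critique before recording the result.
        notes.append(llm(
            "Revise the attempt below using the critique.\n"
            f"Attempt: {attempt}\nCritique: {critique}"
        ))

        if "goal complete" in notes[-1].lower():
            break
    return notes
```

The point of the sketch is just that both “abilities” are the same model called on different prompts about its own work; no new machinery is required.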
I gave the rough scheme for all of the types of learning except episodic memory; I won’t say more because I don’t want to improve capabilities. I wouldn’t even say this much if I didn’t strongly suspect that language model agents/cognitive architectures are our best shot at aligned AGI. Suffice it to say that I expect people to be working on the improvement to episodic memory I’m thinking of now, and to release it soon.
I happen to have a little inside information that one of the other types of learning is working very well in at least one project.
If you don’t believe me, I ask you: can you be sure they won’t? Shouldn’t we have some plans together just in case I’m right?
I’m not sure they won’t turn out to be easy relative to inventing LLMs, but under my model of cognition there’s a lot of work remaining. Certainly we should plan for the case that you’re right, though that is probably an unwinnable situation, so it may not matter.
The chances of this conversation advancing capabilities are probably negligible: there are thousands of engineers pursuing the plausible-sounding approaches. But if you have a particularly specific or obviously novel idea, I respect keeping it to yourself.
Let’s revisit the o1 example after people have had some time to play with it. Currently I don’t think there’s much worth updating strongly on.
You don’t think a 70% reduction in error on problem solving is a major advance? Let’s see how it plays out. I don’t think this will quite get us to REAL AGI, but it’s going to be close.
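To spell out the arithmetic, with a baseline error rate that is purely a made-up number for illustration:

```python
# Illustrative only: what a 70% error reduction does to a made-up baseline.
baseline_error = 0.40                    # model gets 40% of problems wrong
new_error = baseline_error * (1 - 0.70)  # 70% of those errors eliminated
print(round(new_error, 2))               # 0.12 -> now wrong on 12% of problems
```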
I couldn’t disagree more with your comment that this is an unwinnable scenario if I’m right. It might be our best chance. I’m really worried that many people share the sentiment you’re expressing, and that’s why they’re not interested in considering this scenario closely. I have yet to find any decent argument for why this scenario isn’t quite possible; it’s probably the single likeliest concrete AGI scenario we can predict right now. It makes sense to me to spend some real effort on the biggest possibility we can see relatively clearly.

It’s far from unwinnable. We have promising alignment plans with low taxes. Instruction-following AGI is easier and more likely than value-aligned AGI, and the easier part is really good news. There’s still a valid question of If we solve alignment, do we die anyway?, but I think the answer is probably that we don’t: it becomes a political issue, but a solvable one.

More in the other threads here on your intuition that integrating the other systems will be really hard.
Okay, I notice these two things are in tension with each other:
I gave the rough scheme for all of the types of learning except episodic memory; I won’t say more because I don’t want to improve capabilities.
Suffice it to say that I expect people to be working on the improvement to episodic memory I’m thinking of now and to release it soon.
If you think they’re already planning to do this, why do you think that keeping the ideas to yourself would hold capabilities back by more than a few months?
More generally, I think there’s a conflict between believing that other people will soon do the thing you fear and not saying what you think they’re doing: if your idea worked, people would implement it anyway, and you’d only have bought yourself a few months, which usually isn’t enough time for interventions.
I think you’re right that there’s an implied tension here.
If you think they’re already planning to do this, why do you think that keeping the ideas to yourself would hold capabilities back by more than a few months?
Because my biggest fear is proliferation, as we’ve been discussing in If we solve alignment, do we die anyway?. The more AGIs that come online around the same time, the worse off we are. My hope, after the discussion in that post, is that whoever is ahead in the AGI race realizes that wide proliferation of fully RSI-capable AGI is certain doom, and uses their AGI for a soft pivotal act, with many apologies and assurances to the rest of the world that it will be used for the betterment of all. They’ll then have to follow through on that, at least to some degree, so the results could be really good.
As for only buying a few months of time: I agree. But it would tighten the race and so create more proliferation. As for interventions, I don’t actually think any intervention that has been discussed, or that will become viable, would actually help. I could easily be wrong on this; it’s not my focus.
I see a path to human survival and flourishing that needs no interventions and no work, except that it would be helpful to make sure the existing schemes for aligning language model cognitive architectures (and similar proto-AGIs) to personal intent are actually going to work. They look to me like they will, but I haven’t gotten enough people to actually consider them in detail.
I think a crux here is that, even conditional on your ideas working IRL, in practice there will only be 3-4 AGIs, built by the big labs, and plausibly limited to Deepmind, OpenAI, and Anthropic, because the power and energy cost of training future AIs will become a huge bottleneck by 2030.
Another way to say it: I think open source will just totally lose the race to AGI, so for the next several years I only consider Meta, Deepmind, Anthropic, and OpenAI to be relevant actors. These are the only actors that could actually execute on your ideas even if those ideas proliferated.