This approach appears to be easier than I’d thought. I’ve been expecting this type of self-prompting to imitate the advantages of human thought, but I didn’t expect the cognitive capacities of GPT-4 to make it so easy to do useful multi-step thinking and planning. The ease of initial implementation (something like 3 days, with all of the code for BabyAGI also written by GPT-4) implies that improvements may also come easier than we would have guessed.
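For anyone who hasn’t looked inside these projects, the wrapper really is small. Here’s a minimal sketch of a BabyAGI-style self-prompting loop; the prompts, the `llm` helper, the objective, and the loop cap are my own illustrative choices, not the actual BabyAGI code, and it assumes the pre-1.0 `openai` client with an `OPENAI_API_KEY` set:

```python
# Minimal sketch of a BabyAGI-style execute/create/prioritize loop.
# Illustrative only: prompts and task format are placeholders, not BabyAGI's.
import openai

OBJECTIVE = "Write a short report on LLM-based agents."  # placeholder objective

def llm(prompt: str) -> str:
    """One completion call; model choice is arbitrary here."""
    resp = openai.ChatCompletion.create(
        model="gpt-4",
        messages=[{"role": "user", "content": prompt}],
    )
    return resp.choices[0].message.content

tasks = ["Draft an outline for the report."]
results = []

for _ in range(5):  # capped so the sketch terminates
    if not tasks:
        break
    task = tasks.pop(0)
    # Execution step: do the task in the context of the objective and prior results.
    result = llm(
        f"Objective: {OBJECTIVE}\nCompleted so far: {results}\n"
        f"Your task: {task}\nResult:"
    )
    results.append(result)
    # Task-creation step: the model proposes follow-up tasks from its own output.
    new_tasks = llm(
        f"Objective: {OBJECTIVE}\nLast result: {result}\n"
        "List any new tasks needed, one per line, or reply DONE."
    )
    if new_tasks.strip() != "DONE":
        tasks.extend(t.strip() for t in new_tasks.splitlines() if t.strip())
    # Prioritization step: reorder the remaining queue against the objective.
    if tasks:
        reordered = llm(
            f"Objective: {OBJECTIVE}\nReorder these tasks by priority, "
            "one per line:\n" + "\n".join(tasks)
        )
        tasks = [t.strip() for t in reordered.splitlines() if t.strip()]
```

The whole “agent” is three prompts and a task queue; everything else is the model.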
Having played with both BabyAGI and AutoGPT over the past few days, I’m actually surprised at how hard it is to get them to do useful multi-step thinking and planning. Even on tasks I’d think an LLM would be good at, like writing a series of blog posts from a list, or book chapters from an outline, the LLM tends to get off track in a way I wouldn’t expect given the coherence I see in chat interactions, where I’m constantly giving the LLM hints about the topic and can reroll or rewrite if it misunderstands. I think I was underestimating how much work that constant feedback and correction from me was doing.
Idk, I feel about this stuff the way I felt about GPT-J. What scares me is not how well it works, but that it kinda/sorta works a bit. It’s a bunch of garbage Python code wrapped around an API, and it kinda works. I expect people will push on this stuff hard, and am worried that DeepMind, OpenAI, and Google will be doing so in a much more principled way than the wild-west LLM enthusiast crowd.
I think it was wrong for people to take comfort in the meme that “GPT-N is not an agent,” and I expect this will become very clear to everyone in the next 18 months.
I agree that it isn’t very impressive out of the box. I think these techniques will improve over time. I’m not sure it’s going to be the next big thing, but I do think it’s worth thinking about the impact on alignment in case it is. As I think more about it, I see several other useful human cognitive capacities that can be emulated in the same way. They’re not arcane, so I expect the people hacking away on Auto-GPT to be working on them right now. Time will tell, but we need to get ahead of the curve to have alignment solutions ready. My prediction is that wrappers will definitely add cognitive capacity, and that they might easily add a lot.
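To make “emulated in the same way” concrete: self-criticism is one such capacity, and adding it is just one more prompt around the execution step. A hypothetical sketch, reusing the `llm()` helper from the loop sketch above (the prompt wording and function name are my assumptions, not anything shipped in Auto-GPT):

```python
# Hypothetical critique-then-revise wrapper around a single task execution.
# Reuses the illustrative llm() helper defined in the earlier sketch.
def execute_with_reflection(task: str, objective: str) -> str:
    draft = llm(f"Objective: {objective}\nTask: {task}\nResult:")
    # Reflection step: the model critiques its own draft before the loop moves on.
    critique = llm(
        f"Objective: {objective}\nDraft result:\n{draft}\n"
        "Point out the biggest flaw in this draft, in one sentence."
    )
    # Revise once in light of the critique; more rounds are possible.
    return llm(
        f"Objective: {objective}\nDraft:\n{draft}\nCritique: {critique}\n"
        "Rewrite the draft to address the critique:"
    )
```

The point is that no new model capability is needed; the wrapper just spends more inference calls per task.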
Agreed, and note that there’s substantial economic incentive for people to keep improving it, since a more independently-capable LLM-based agent is useful for more purposes. There are a whole host of startups right now looking for ways to enhance LLM-based systems, and a host of VCs wanting to throw money at them (examples on request, but I’m guessing most people have been seeing it online already).
There are probably thousands of semi-entrepreneurial hackers working on this now. And a hundred thousand in a month. Many of them will share their best ideas. This will move fast, and we will see some of the potential quickly.
This is an excellent point.