Thank you! I’m really glad you found it relatively easy to follow. That was certainly the goal, but it’s impossible to know until others read it.
I think that almost certainly is at least one aspect of Microsoft’s strategy. There was some accidental revealing of an extra internal prompt when Bing Chat was first released; unfortunately, I didn’t note down the source. And now that you mention it, that would be a great way to apply Microsoft’s existing pool of skilled coders to improving performance.
That might also be a big thing within OpenAI. Sam Altman recently said (again, I don’t remember where) that the age of big models is already over (I’m sure they’ll still make them larger, but he’s saying that’s less of the focus). This could be one of the alternate approaches they’re taking up.
I don’t know how optimistic I am about the alignment prospects. But I do think it has more upsides than, and the same downsides as, most other suggestions, particularly the more practical ones with a low enough alignment tax to be realistically implementable.
My degree is in cognitive science, so that might have given me a leg up, but while the post was technical, I found it got the good points across (with relevant citations) without getting too deep into the weeds, which is always a sweet spot to hit in cases like these :)
And word, I had written out a short list of (complete ass-pull, gut-instinct) reasons why I think this will be OpenAI/Microsoft’s strategy, then deleted it and reframed it as that open wondering instead. But now that you bring it up:
The GPT-4 + plugins launch makes it look like they had that same kind of system internally for quite some time. I think it was part of their testing plan for a while: they built their own, better versions of AutoGPT, decided it was all still pretty dumb and safe, and released it to see what the public would do with it, just like their safety plan states.
The slow roll of GPT-4.2+ also points in that same direction to me. They could be using GPT-4.x as the central LLM reasoning hub in their own version of AutoGPT, and figure (maybe correctly??) that bottlenecking the performance of the central hub is the right way to handicap any layman approaches to AutoGPT. I expect they’re testing the performance of each GPT-4.x increment in stages for this very purpose before rolling out any updates to the public.
Sam A’s comment that the age of large models is over suggests to me that they might be going with this approach: refining the central LLM hub to the point where it can orchestrate human-level+ STEM work across a distributed system of other models.
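For concreteness, here’s a minimal toy sketch of what that hub-and-workers pattern could look like. Everything in it is a made-up stand-in (the worker names, the keyword-based planner, the function names); it’s not OpenAI’s or Microsoft’s actual architecture, just the shape of the idea, where a real system would have the hub itself be an LLM call.

```python
# Hypothetical sketch of the "central LLM hub" pattern: one strong model
# plans, then delegates subtasks to specialized workers (smaller models,
# plugins, tools). All names here are illustrative stand-ins, not any
# real OpenAI/Microsoft API.

from typing import Callable

# Specialized "workers": in a real system these would be other models or
# plugins; here they are stubs so the sketch actually runs.
WORKERS: dict[str, Callable[[str], str]] = {
    "math": lambda task: f"[math worker] solved: {task}",
    "code": lambda task: f"[code worker] wrote code for: {task}",
    "search": lambda task: f"[search worker] looked up: {task}",
}

def hub_plan(goal: str) -> list[tuple[str, str]]:
    """Stand-in for the central LLM: decompose a goal into (worker, subtask)
    pairs. A real hub would do this with an LLM call, not keyword matching."""
    plan = []
    if "compute" in goal:
        plan.append(("math", goal))
    if "implement" in goal:
        plan.append(("code", goal))
    plan.append(("search", goal))  # always gather background context
    return plan

def run(goal: str) -> list[str]:
    """Orchestration loop: the hub plans, the workers execute. A real system
    would feed results back to the hub for further rounds of planning; only
    one round is shown here."""
    return [WORKERS[name](subtask) for name, subtask in hub_plan(goal)]

if __name__ == "__main__":
    for line in run("implement and compute a fib(30) benchmark"):
        print(line)
```

The point of the pattern is that the hub is the only component that needs strong general reasoning, which is why bottlenecking it (as speculated above) would throttle the whole system.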
And re: alignment, agreed.