Sammy Martin comments on “Scaling Laws for Autoregressive Generative Modeling”, Henighan et al 2020 {OA}

Sammy Martin 30 Oct 2020 14:28 UTC
1 point
It may well be a crux. An efficient ‘tree search’, or a similar goal-directed wrapper around a GPT-based system, that could play a role in real-world open-ended planning (presumably planning for an agent that effects outcomes in the real world via its text generation) would have to do two things. First, it would have to cover continuous action spaces and states in which the set of possible actions is unknown and shifting, unlike Go, whose action space is discrete and small relative to the real universe and so is perfect for a tree search. Second, it would have to run (or approximate running) millions of primitive steps (individual text generations and exchanges) into the future, for long-term planning towards e.g. a multi-decade goal of the kind humans are capable of pursuing.
That sounds like a problem that’s at least as hard as building a language-model ‘success probability predictor’ GPT-N (probably with reward-modelling help, so it can optimize for a specific goal with its text generation). Such a system would still be highly transformative, though, if it were human-level at prediction.
To clarify, this is Transformative, not ‘Radically Transformative’: transformative like Nuclear Power/Weapons, not like a new Industrial Revolution or an intelligence explosion.
I would expect tree search powered by GPT-6 to be probably pretty agentic.
I could imagine (if you found a domain with a fairly constrained set of actions and states that nonetheless involved text prediction somehow) that you could get agentic behaviour out of a tree search like the ones we currently have + GPT-N + an RL wrapper around the GPT-N. That might well be quite transformative; I could imagine it being very good for persuasion, for example.
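To make the "tree search + GPT-N" idea a bit more concrete, here is a rough sketch of the shape such a wrapper might take in a small, discrete domain. It uses beam search (a pruned breadth-first tree search) rather than anything like MCTS, and everything in it is hypothetical: `propose` stands in for a language model suggesting next text actions, `score` stands in for a ‘success probability predictor’, and the toy versions below are plain Python stubs, not real models.

```python
from typing import Callable, List, Tuple


def beam_search(
    start: str,
    propose: Callable[[str], List[str]],
    score: Callable[[str], float],
    depth: int,
    beam_width: int,
) -> Tuple[str, float]:
    """Return the highest-scoring text state found within `depth` steps.

    `propose` and `score` are hypothetical stand-ins for a language model
    proposing continuations and a learned success predictor, respectively.
    """
    beam = [start]
    best = (start, score(start))
    for _ in range(depth):
        # Expand every state in the beam by every proposed action.
        candidates = [state + action for state in beam for action in propose(state)]
        if not candidates:
            break
        # Keep only the `beam_width` most promising states.
        candidates.sort(key=score, reverse=True)
        beam = candidates[:beam_width]
        top = (beam[0], score(beam[0]))
        if top[1] > best[1]:
            best = top
    return best


# Toy stand-ins (assumptions for illustration only, not real models).
def toy_propose(state: str) -> List[str]:
    # A fixed, tiny action set: the "constrained domain" case.
    return ["a", "b"]


def toy_score(state: str) -> float:
    # Pretend the goal is to emit as many "a" tokens and as few "b" tokens as possible.
    return float(state.count("a") - state.count("b"))


result = beam_search("", toy_propose, toy_score, depth=3, beam_width=2)
```

The point of the sketch is how much it relies on the action set being small and enumerable. With continuous or shifting action sets and millions of steps of lookahead, the expansion step is exactly where this kind of wrapper stops being straightforward.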