Page 3 of the PDF has a graph of prediction loss on the OpenAI codebase dataset. It’s hard to link directly to the graph; it’s Figure 1 under the Predictable Scaling section.
You can just append #page=3 to the URL. This works in most PDF viewers. (Adobe supports many other query parameters, but that’s really the only one you need to know about.)
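For example, with a placeholder URL:

    https://example.com/gpt-4-technical-report.pdf#page=3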
OpenAI is, apparently[0], already using GPT-4 as a programming assistant, which means it may have been contributing to its own codebase. I think recursive self-improvement is a continuous multiplier, and I think we’re beyond zero at this point. The multiplier is mostly coming from reducing serial bottlenecks: decreasing the iteration time it takes to make improvements to the model and its supporting codebases. I don’t expect (many?) novel theoretical contributions from GPT-4 yet.
However, it could also be prompted with items from the Evals dataset and asked to come up with novel problems to further fine-tune the model against. Humans have been raising challenges for ourselves (e.g. the Millennium Prize Problems) for a long time, and I think LLMs probably have the ability to self-improve by inventing machine-checkable problems that they can’t yet solve directly.
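A toy sketch of that loop in Python, just to make the shape concrete: random guessing stands in for the model, modular arithmetic stands in for the invented problems, and none of this is OpenAI’s actual Evals API. The point is only generate → attempt → machine-verify → collect:

    import random

    def generate_problem(rng):
        # "Invented" problem: find x with (a * x) % m == r.
        # The checker is pure arithmetic, so it is machine-checkable.
        a, m = rng.randrange(2, 97), rng.randrange(5, 101)
        r = (a * rng.randrange(m)) % m
        def check(x):
            return x is not None and (a * x) % m == r
        return (a, m, r), check

    def attempt(problem, rng):
        # Stand-in for the model's direct attempt: a single guess.
        a, m, r = problem
        guess = rng.randrange(m)
        return guess if (a * guess) % m == r else None

    def external_solve(problem):
        # Stand-in for brute force / tool use: exhaustive search.
        a, m, r = problem
        return next((x for x in range(m) if (a * x) % m == r), None)

    rng = random.Random(0)
    training_set = []
    for _ in range(200):
        problem, check = generate_problem(rng)
        if check(attempt(problem, rng)):
            continue  # already solvable directly; no training signal
        solution = external_solve(problem)
        if check(solution):
            training_set.append((problem, solution))  # verified example
    print(f"collected {len(training_set)} machine-checked examples")

Problems the model fails but a checker can verify are exactly the ones worth fine-tuning against.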
RSI doesn’t require a self-authored codebase anyway. RSI can be as simple as “edit a text file to control an existing framework”, where that framework is authored by humans (who used GPT-4 or Codex to accelerate authoring it).
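A minimal sketch of that framing, assuming a hypothetical agent_prompt.txt that the framework reads on startup; propose_edit stands in for the model call and the scoring is a toy:

    from pathlib import Path

    CONFIG = Path("agent_prompt.txt")  # hypothetical file the framework reads

    def evaluate(config_text: str) -> float:
        # Stand-in for running the framework's eval suite on this config.
        return float(len(set(config_text.split())))  # toy score

    def propose_edit(config_text: str) -> str:
        # Hypothetical model call ("here is the current prompt and its
        # score; suggest a better one"), stubbed as a trivial mutation.
        return config_text + "\nBe concise."

    best = CONFIG.read_text() if CONFIG.exists() else "You are a helpful agent."
    best_score = evaluate(best)
    for _ in range(10):
        # Hill-climb: keep an edit only if the framework scores it higher.
        candidate = propose_edit(best)
        score = evaluate(candidate)
        if score > best_score:
            best, best_score = candidate, score
    CONFIG.write_text(best)

The model never touches the framework’s code; all of the “self-improvement” lives in the text file it is allowed to edit.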
[0]: “We’ve also been using GPT-4 internally, with great impact on functions like support, sales, content moderation, and programming.”—https://openai.com/research/gpt-4#capabilities