Yes, it’s a great topic. The aspect that seems to be missing from “AI capabilities can be significantly improved without expensive retraining” (https://arxiv.org/abs/2312.07413) is that post-training is particularly fertile ground for rapid-turnaround self-modification and recursive self-improvement: post-training tends to be lightweight and usually does not involve the delay of training a new large model from scratch.
Recent capability work in this direction includes, for example:
“Self-Taught Optimizer (STOP): Recursively Self-Improving Code Generation”, https://arxiv.org/abs/2310.02304
“Language Agents as Optimizable Graphs”, https://arxiv.org/abs/2402.16823
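To make the flavor of these methods concrete, here is a minimal, runnable sketch of a STOP-style loop: an “improver” program queries an LLM to revise a candidate program, and the same improver is then pointed at its own source code. Everything here (the prompt text, the toy seed improver, the function names) is my illustrative stand-in, not the paper’s actual scaffolding or benchmarks:

```python
from typing import Callable

Utility = Callable[[str], float]   # scores a candidate program's source text
LLM = Callable[[str], str]         # prompt -> completion

def run_improver(improver_src: str, program: str, utility: Utility, llm: LLM) -> str:
    """Execute improver source and apply its improve() function.
    (A real system would have to sandbox this exec call.)"""
    ns: dict = {}
    exec(improver_src, ns)
    return ns["improve"](program, utility, llm)

# Toy seed improver: ask the LLM for n revisions, keep the best by utility.
SEED_IMPROVER = '''
def improve(program, utility, llm, n=3):
    best, best_score = program, utility(program)
    for _ in range(n):
        candidate = llm("Improve this program:\\n" + program)
        if utility(candidate) > best_score:
            best, best_score = candidate, utility(candidate)
    return best
'''

def stop_loop(task_program: str, task_utility: Utility, llm: LLM, rounds: int = 2) -> str:
    """Alternate between improving the task solution and improving the improver."""
    improver_src = SEED_IMPROVER

    def meta_utility(candidate_src: str) -> float:
        # An improver is scored by how well the solution it produces performs.
        try:
            return task_utility(
                run_improver(candidate_src, task_program, task_utility, llm))
        except Exception:
            return float("-inf")   # broken candidate improvers score worst

    for _ in range(rounds):
        # 1. improve the task solution with the current improver
        task_program = run_improver(improver_src, task_program, task_utility, llm)
        # 2. the recursive step: improve the improver itself, judged by meta-utility
        improver_src = run_improver(improver_src, improver_src, meta_utility, llm)
    return task_program
```

The load-bearing design choice, as in the paper, is the meta-utility: because an improver is itself just a program scored by the downstream performance of its outputs, the same machinery that improves the task solution can improve the improver, with no retraining of the underlying model.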
People who are specifically concerned with rapid foom risks might want to focus on this aspect of the situation. These self-improvement methods currently saturate in a reasonably safe zone, but they are getting stronger, both through novel research and through improvements in the underlying LLMs they rely on.