Another possible inflection point, before self-improvement, could be when an AI acquires a set of capabilities that lets it gain new capabilities at inference time.
Some things like that have already happened: bigger models are better at using tools such as in-context learning and chain-of-thought reasoning. But again, whenever people plot such reasoning capabilities as a function of model compute or size (e.g., in the BIG-bench paper), the X axis is always logarithmic. For specific tasks, the dependence on log compute is often sigmoid-like (flat for a long time, then rising sharply as log compute increases), but as mentioned above, when you average over many tasks you get this type of roughly linear dependence on log compute.
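To make the averaging point concrete, here is a toy sketch. It is not fitted to BIG-bench or any real data. Everything in it is an assumption made for illustration: the logistic shape of each per-task curve, the `sharpness` value, the 200 tasks, and the choice to spread their thresholds evenly over log10(compute) between 18 and 26.

```python
import math

def task_score(log_compute, threshold, sharpness=4.0):
    """Hypothetical per-task score: a sigmoid in log10(compute)."""
    return 1.0 / (1.0 + math.exp(-sharpness * (log_compute - threshold)))

def average_score(log_compute, thresholds):
    """Mean score over all tasks at a given log10(compute)."""
    return sum(task_score(log_compute, t) for t in thresholds) / len(thresholds)

# Assumption: 200 tasks whose sharp-rise thresholds are evenly spread
# over log10(compute) in [18, 26]. These numbers are made up.
NUM_TASKS = 200
thresholds = [18.0 + 8.0 * (i + 0.5) / NUM_TASKS for i in range(NUM_TASKS)]

# Evaluate at log10(compute) = 19.0, 19.5, ..., 25.0.
xs = [19.0 + 0.5 * i for i in range(13)]
single = [task_score(x, 22.0) for x in xs]          # one task, threshold at 22
avg = [average_score(x, thresholds) for x in xs]    # mean over all tasks

for x, s, a in zip(xs, single, avg):
    print(f"log10(compute)={x:4.1f}  single task={s:.3f}  average={a:.3f}")
```

Under these assumptions the single-task column stays near zero, jumps around its threshold, and then saturates, while the average column climbs by about the same amount at every step. Each step is still a constant multiple of compute rather than a constant addition, so the straight line is in log compute.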