The calculations I mean are something like the following (a toy sketch of the whole pipeline follows the list):
1. Activate a representation of the text of the spec for its data-cleaning process. This should probably be the same representation it could use to quote from the spec; if it needed some special pre-digested representation, it would be much worse at generalizing to untested parts of the spec.
2. Predict the spec's consequences. For this to be simple, it should be using some kind of general consequence-predictor rather than learning how to predict consequences from scratch. Specifically, it needs some kind of sub-process that accepts inputs and has outputs ready at a broad range of layers in the network, and that, to be generally useful, can predict the consequences of many things in parallel. If such a sub-process doesn't exist in some LLM, that's bad news for that LLM's ability to calculate consequences at runtime.
3. Represent the current situation in a way that's checkable against the patterns predicted by the spec.
4. Still have time left over to do different processing depending on the results of the check: the LLM has to have a general abstraction for what kind of processing it should be doing from here, so that it can generalize to untested implications of the spec.
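To make the shape of those four steps concrete, here is a deliberately toy Python sketch that writes them out as explicit functions. It only illustrates the kind of computation that would have to fit inside a forward pass, not how a transformer would implement it; the spec rules, patterns, and branching below are all invented placeholders.

```python
# Toy sketch of the hypothesized four-step pipeline, written out as ordinary
# sequential Python for clarity. The spec rules, patterns, and branching
# below are invented placeholders, not a real data-cleaning spec.

SPEC_TEXT = [
    "Remove documents containing the canary string.",      # hypothetical rule
    "Remove documents that are mostly boilerplate text.",  # hypothetical rule
]

def activate_spec():
    """Step 1: recall the spec in roughly the form it could be quoted in."""
    return SPEC_TEXT

def predict_consequences(spec):
    """Step 2: turn each rule into a concrete, checkable pattern.

    A general consequence-predictor would do this for arbitrary rules; here
    the two patterns are simply hard-coded for the toy rules above.
    """
    return {
        "canary": lambda situation: "canary string" in situation,
        "boilerplate": lambda situation: situation.count("lorem ipsum") > 3,
    }

def represent_situation(document):
    """Step 3: describe the document in terms the patterns can inspect."""
    return document.lower()

def process(document):
    """Step 4: branch on the result of the check and process differently."""
    spec = activate_spec()
    patterns = predict_consequences(spec)
    situation = represent_situation(document)
    if any(matches(situation) for matches in patterns.values()):
        return "act as if this document had been filtered out of training"
    return "act as if trained on this document as usual"

if __name__ == "__main__":
    print(process("lorem ipsum " * 5))               # matches the boilerplate pattern
    print(process("an ordinary paragraph of text"))  # matches nothing
```

Written out sequentially the steps look trivial; the point of the list above is that an LLM would need analogues of all four, general enough to cover untested parts of the spec, within a single limited-depth forward pass.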
So that’s all pretty ambitious. I think spreading the work out over multiple tokens requires those intermediate tokens to have good in-text reasons to contain intermediate results (as in “think step by step”).
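As a concrete (and entirely hypothetical) example of what giving intermediate tokens good in-text reasons to contain intermediate results might look like, a prompt can simply ask for each intermediate result explicitly, so that emitting it is part of the task rather than spare computation:

```python
# Hypothetical prompt whose structure gives each intermediate result an
# explicit, in-text reason to appear in the output tokens.
PROMPT_TEMPLATE = """Decide whether the document below would have been removed
by the data-cleaning spec. Think step by step:
1. Quote the relevant rule from the spec.
2. State the concrete pattern that rule implies.
3. Say whether the document matches that pattern.
4. Give a final keep/remove verdict.

Document:
{document}
"""

print(PROMPT_TEMPLATE.format(document="lorem ipsum " * 5))
```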
Agree on points 3 and 4. Disagree on point 1. Unsure about point 2.
On the final two points: I think those capabilities are already in place in GPT-3.5. Any capability or processing that seems necessary for general instruction following I'd expect to be in place by default. E.g., consider what processing is necessary for GPT-3.5 to follow instructions for turning a tweet into a haiku.
On the first point: we should expect text that occurs repeatedly in the dataset to be compressed while preserving meaning, and text regarding the data-cleaning spec is no exception.