The calculations I mean are something like the following (a toy sketch of the whole pipeline follows the list):
1. Activate a representation of the text of the spec for its data-cleaning process. This should probably be the same representation it could use to quote from the spec; if it needed some special pre-digested representation, it would be much worse at generalizing to untested parts of the spec.
2. Predict the spec's consequences. For this to be simple, it should be using some kind of general consequence-predictor rather than learning how to predict consequences from scratch. Specifically, it needs some kind of sub-process that accepts inputs and has outputs ready at a broad range of layers in the network, and that, to be generally useful, can predict the consequences of many things in parallel. If such a sub-process doesn't exist in some LLM, that's bad news for that LLM's ability to calculate consequences at runtime.
3. Represent the current situation in a way that's checkable against the patterns predicted by the spec.
4. Still have time left over to do different processing depending on the results of the check: the LLM has to have a general abstraction for what kind of processing it should be doing from here, so that it can generalize to untested implications of the spec.
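To make the shape of those four steps concrete, here is a deliberately toy Python sketch that writes them out as explicit functions. It only illustrates the kind of computation that would have to fit inside a forward pass, not how a transformer would implement it; the spec rules, patterns, and branching below are all invented placeholders.

```python
# Toy sketch of the hypothesized four-step pipeline, written out as ordinary
# sequential Python for clarity. The spec rules, patterns, and branching
# below are invented placeholders, not a real data-cleaning spec.

SPEC_TEXT = [
    "Remove documents containing the canary string.",      # hypothetical rule
    "Remove documents that are mostly boilerplate text.",  # hypothetical rule
]

def activate_spec():
    """Step 1: recall the spec in roughly the form it could be quoted in."""
    return SPEC_TEXT

def predict_consequences(spec):
    """Step 2: turn each rule into a concrete, checkable pattern.

    A general consequence-predictor would do this for arbitrary rules; here
    the two patterns are simply hard-coded for the toy rules above.
    """
    return {
        "canary": lambda situation: "canary string" in situation,
        "boilerplate": lambda situation: situation.count("lorem ipsum") > 3,
    }

def represent_situation(document):
    """Step 3: describe the document in terms the patterns can inspect."""
    return document.lower()

def process(document):
    """Step 4: branch on the result of the check and process differently."""
    spec = activate_spec()
    patterns = predict_consequences(spec)
    situation = represent_situation(document)
    if any(matches(situation) for matches in patterns.values()):
        return "act as if this document had been filtered out of training"
    return "act as if trained on this document as usual"

if __name__ == "__main__":
    print(process("lorem ipsum " * 5))               # matches the boilerplate pattern
    print(process("an ordinary paragraph of text"))  # matches nothing
```

Written out sequentially the steps look trivial; the point of the list above is that an LLM would need analogues of all four, general enough to cover untested parts of the spec, within a single limited-depth forward pass.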
So that’s all pretty ambitious. I think spreading the work out over multiple tokens requires those intermediate tokens to have good in-text reasons to contain intermediate results (as in “think step by step”).
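As a concrete (and entirely hypothetical) example of what giving intermediate tokens good in-text reasons to contain intermediate results might look like, a prompt can simply ask for each intermediate result explicitly, so that emitting it is part of the task rather than spare computation:

```python
# Hypothetical prompt whose structure gives each intermediate result an
# explicit, in-text reason to appear in the output tokens.
PROMPT_TEMPLATE = """Decide whether the document below would have been removed
by the data-cleaning spec. Think step by step:
1. Quote the relevant rule from the spec.
2. State the concrete pattern that rule implies.
3. Say whether the document matches that pattern.
4. Give a final keep/remove verdict.

Document:
{document}
"""

print(PROMPT_TEMPLATE.format(document="lorem ipsum " * 5))
```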
Agree on points 3 and 4. Disagree on point 1. Unsure about point 2.
On the final two points: I think those capabilities are already in place in GPT-3.5. Any capability or processing that seems necessary for general instruction following I'd expect to be in place by default. E.g., consider what processing is necessary for GPT-3.5 to follow instructions for turning a tweet into a haiku.
On the first point: we should expect text that occurs repeatedly in the dataset to be compressed while preserving meaning, and text regarding the data-cleaning spec is no exception.