Are there convergently-ordered developmental milestones for AI? I suspect there may be convergent orderings in which AI capabilities emerge. For example, it seems that LMs develop syntax before semantics, but maybe there’s an even more detailed ordering relative to a fixed dataset. And in embodied tasks with spatial navigation and recurrent memory, there may be an order in which enduring spatial awareness emerges (i.e. “object permanence”).

In A shot at the diamond-alignment problem, I wrote:
One paper’s abstract reports: “We report a series of robust empirical observations, demonstrating that deep Neural Networks learn the examples in both the training and test sets in a similar order. This phenomenon is observed in all the commonly used benchmarks we evaluated, including many image classification benchmarks, and one text classification benchmark. While this phenomenon is strongest for models of the same architecture, it also crosses architectural boundaries – models of different architectures start by learning the same examples, after which the more powerful model may continue to learn additional examples. We further show that this pattern of results reflects the interplay between the way neural networks learn benchmark datasets. Thus, when fixing the architecture, we show synthetic datasets where this pattern ceases to exist. When fixing the dataset, we show that other learning paradigms may learn the data in a different order. We hypothesize that our results reflect how neural networks discover structure in natural datasets.”
The authors state that they “failed to find a real dataset for which NNs differ [in classification order]” and that “models with different architectures can learn benchmark datasets at a different pace and performance, while still inducing a similar order. Specifically, we see that stronger architectures start off by learning the same examples that weaker networks learn, then move on to learning new examples.”
Similarly, crows (and other smart animals) reach developmental milestones in basically the same order as human babies reach them. On my model, developmental timelines come from convergent learning of abstractions via self-supervised learning in the brain. If so, then the smart-animal evidence is yet another instance of important qualitative concept-learning retaining its ordering, even across significant scaling and architectural differences.
We might even end up in the world where AI also follows the crow/human/animal developmental milestone ordering, at least roughly up until general intelligence. If so, we could better estimate timelines to AGI by watching how far the AI progresses on the known developmental ordering.

If so, then a network which can act to retrieve partially hidden objects, but not fully hidden objects, should be expected to next learn to retrieve objects whose concealment it observed (and we may also expect some additional amount of goal-directedness).
To test this hypothesis, it would be sufficient (but not necessary[1]) to e.g. reproduce the XLAND results while saving policy-network checkpoints at regular intervals. We could then behaviorally prompt the checkpoints with tests similar to those administered to human children by psychologists.
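Here is a minimal sketch of what that evaluation loop might look like, assuming we can load each saved policy checkpoint and have a battery of behavioral probe tasks. All names here (`load_checkpoint`, the probe functions, the milestone labels) are hypothetical placeholders rather than any actual XLAND interface; the point is just the bookkeeping: record the first checkpoint at which each milestone appears in each run, then check whether independent runs (or architectures) agree on that ordering.

```python
# Hedged sketch of the proposed evaluation loop, not a working XLAND harness.
# Each probe takes a policy and returns True if the milestone behavior is shown,
# e.g. retrieving a partially vs. fully hidden object (Piaget-style tests).

from typing import Callable, Dict, List
from scipy.stats import kendalltau

MilestoneProbe = Callable[[object], bool]  # policy -> passed?


def milestone_onsets(checkpoint_paths: List[str],
                     probes: Dict[str, MilestoneProbe],
                     load_checkpoint: Callable[[str], object]) -> Dict[str, int]:
    """First checkpoint index at which each milestone probe passes (-1 if never).
    A fuller version would also require the milestone to keep passing afterwards."""
    onsets = {name: -1 for name in probes}
    for idx, path in enumerate(checkpoint_paths):
        policy = load_checkpoint(path)  # hypothetical loader
        for name, probe in probes.items():
            if onsets[name] == -1 and probe(policy):
                onsets[name] = idx
    return onsets


def ordering_agreement(onsets_a: Dict[str, int], onsets_b: Dict[str, int]) -> float:
    """Kendall's tau between two runs' milestone onset orderings
    (-1 entries, i.e. never-passed milestones, would need special handling)."""
    names = sorted(set(onsets_a) & set(onsets_b))
    return kendalltau([onsets_a[n] for n in names],
                      [onsets_b[n] for n in names]).correlation
```

If the convergent-ordering hypothesis holds, the onset ranking should be roughly stable across seeds and architectures, and could be compared against the known human/animal milestone sequence.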
The paper indicates that checkpoints were taken, so maybe the authors would be willing to share those for research purposes. If not, rerunning XLAND may be overkill and out of reach of most compute budgets. There are probably simpler experiments which provide evidence on this question.
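For instance, a much cheaper probe in the spirit of the classification-order result quoted above: take per-checkpoint predictions from two training runs (different seeds or architectures) on an ordinary supervised benchmark, assign each example a “learning time”, and rank-correlate those times across runs. This is a simplified sketch under my own assumptions (the stability-based learning time and the array layout are mine, not the quoted paper’s exact metric); `preds_a` and `preds_b` are hypothetical arrays of shape (num_checkpoints, num_examples) holding predicted labels.

```python
import numpy as np
from scipy.stats import spearmanr


def learning_times(preds: np.ndarray, labels: np.ndarray) -> np.ndarray:
    """Per example: index of the first checkpoint from which the model stays
    correct at every later checkpoint; num_checkpoints if never stably learned."""
    correct = preds == labels[None, :]  # (num_checkpoints, num_examples) bools
    # suffix_correct[t, i] is True iff example i is correct at every step s >= t
    suffix_correct = np.flip(
        np.cumprod(np.flip(correct, axis=0), axis=0), axis=0
    ).astype(bool)
    ever_stable = suffix_correct.any(axis=0)
    first_stable = suffix_correct.argmax(axis=0)  # first True along time axis
    return np.where(ever_stable, first_stable, preds.shape[0])


def order_agreement(preds_a: np.ndarray, preds_b: np.ndarray,
                    labels: np.ndarray) -> float:
    """Spearman rank correlation between the two runs' per-example learning times."""
    return spearmanr(learning_times(preds_a, labels),
                     learning_times(preds_b, labels)).correlation
```

A persistently high correlation across architectures would be the supervised-learning analogue of the milestone-ordering claim, and the same bookkeeping carries over to RL checkpoints by replacing “classified correctly” with “passes probe task”.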