Gotcha, and thank you so much for writing this post!
adamant
Ah! Were you the one who decorated the Rose Garden Inn? I’m really curious how you made the lighting that looks like the sun coming through a cracked door / coming through the cracks between bricks.
Picture below—I took a ton of pictures when I was there to steal your interior decoration ideas.
An aside the original author may be interested in: there has been some work on reducing the scaling of the context window below O(n^2), e.g. the Sparse Transformer (https://arxiv.org/pdf/1904.10509v1.pdf). I'm also reminded of OpenAI's Jukebox, which uses a hierarchical strategy in addition to factorized self-attention when generating tokens, effectively increasing the context window (https://openai.com/blog/jukebox/).
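(For concreteness, here is a toy Python sketch, not code from either link, of the kind of strided/factorized attention pattern that brings the per-query cost down; `strided_attention_mask` is just a name I made up for illustration.)

```python
import numpy as np

def strided_attention_mask(n: int, stride: int) -> np.ndarray:
    """Boolean mask: mask[q, k] = True if query position q may attend to key k.

    Each query attends to its local causal window of `stride` positions plus
    every `stride`-th earlier position, so it sees roughly
    stride + q / stride keys instead of all q + 1. With stride ~ sqrt(n),
    total work is O(n * sqrt(n)) rather than O(n^2).
    """
    mask = np.zeros((n, n), dtype=bool)
    for q in range(n):
        # local window: the previous `stride` positions (causal)
        mask[q, max(0, q - stride + 1): q + 1] = True
        # strided "summary" positions: q, q - stride, q - 2*stride, ...
        mask[q, np.arange(q, -1, -stride)] = True
    return mask

mask = strided_attention_mask(n=16, stride=4)
print(mask.sum(axis=1))  # keys attended per query stays far below n
```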
Very informative toy examples. Regarding this point:
> Some kind of failure of spatial reasoning (wandering items, whatever was going on with some of the sliding square chain-of-thoughts where pieces vanished)
I would strongly agree with this. I actually think the sliding block puzzle might just be easy for humans on account of our strong spatial priors. In the physical world, things move with spatial locality, and two objects cannot occupy the same place. The LLM, by contrast, sees orders of magnitude less data from which to learn to represent spatial locality in text-based drawings of 2D scenes. In other words, basic physical reasoning priors that are second nature to us (thanks to many thousands of hours of training data) may not be fully instilled in the model.
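(To make those two priors concrete, here is a toy Python check, with names I invented rather than anything from the original post, encoding exactly the constraints the sliding-square chain-of-thoughts seemed to violate: a piece may only move to an adjacent cell, and no two pieces can share a cell.)

```python
from typing import Dict, Tuple

Coord = Tuple[int, int]

def is_legal_move(board: Dict[str, Coord], piece: str, dest: Coord) -> bool:
    """Check the two physical priors for a sliding puzzle drawn as text."""
    src = board[piece]
    adjacent = abs(src[0] - dest[0]) + abs(src[1] - dest[1]) == 1  # spatial locality
    occupied = dest in board.values()                              # no overlapping pieces
    return adjacent and not occupied

board = {"A": (0, 0), "B": (0, 1)}
print(is_legal_move(board, "A", (1, 0)))  # True: adjacent, empty cell
print(is_legal_move(board, "A", (0, 1)))  # False: cell already occupied
print(is_legal_move(board, "A", (2, 2)))  # False: non-local "teleport"
```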
I emphasize this because I worry that any test of generality which invokes 2D space might actually be a test of the strength of spatial priors.
I would love to hear thoughts on (a) whether spatial reasoning is needed to automate coding / get self-improving AI, and (b) examples of clear LLM failures on math/logic reasoning that don't invoke a spatial prior.
(+ hope this is in line with community norms—haven’t really posted on LW much!)