Gotcha, and thank you so much for writing this post!
adamant
Ah! Were you the one who decorated the Rose Garden Inn? I’m really curious how you made the lighting that looks like the sun coming through a cracked door / coming through the cracks between bricks.
Picture below—I took a ton of pictures when I was there to steal your interior decoration ideas.
An aside the original author may be interested in: there has been some work on reducing the scaling of the context window below O(n^2), e.g. the Sparse Transformer (https://arxiv.org/pdf/1904.10509v1.pdf). I'm also reminded of OpenAI's Jukebox, which uses a hierarchical strategy in addition to factorized self-attention when generating tokens, effectively increasing the context window (https://openai.com/blog/jukebox/).
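(For concreteness, here is a toy Python sketch, not code from either link, of the kind of strided/factorized attention pattern that brings the per-query cost down; `strided_attention_mask` is just a name I made up for illustration.)

```python
import numpy as np

def strided_attention_mask(n: int, stride: int) -> np.ndarray:
    """Boolean mask: mask[q, k] = True if query position q may attend to key k.

    Each query attends to its local causal window of `stride` positions plus
    every `stride`-th earlier position, so it sees roughly
    stride + q / stride keys instead of all q + 1. With stride ~ sqrt(n),
    total work is O(n * sqrt(n)) rather than O(n^2).
    """
    mask = np.zeros((n, n), dtype=bool)
    for q in range(n):
        # local window: the previous `stride` positions (causal)
        mask[q, max(0, q - stride + 1): q + 1] = True
        # strided "summary" positions: q, q - stride, q - 2*stride, ...
        mask[q, np.arange(q, -1, -stride)] = True
    return mask

mask = strided_attention_mask(n=16, stride=4)
print(mask.sum(axis=1))  # keys attended per query stays far below n
```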
Very informative toy examples. Regarding this point:
> Some kind of failure of spatial reasoning (wandering items, whatever was going on with some of the sliding square chain-of-thoughts where pieces vanished)
I would strongly agree with this. I actually think the sliding block puzzle might just be easy for humans on account of our strong spatial priors. In the physical world, things move with spatial locality, and two objects cannot occupy the same place. The LLM, by contrast, sees orders of magnitude less data from which to learn to represent spatial locality in text-based drawings of 2D scenes. In other words, basic physical reasoning priors that are second nature to us (thanks to many thousands of hours of training data) may not be fully instilled in the model.
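(To make those two priors concrete, here is a toy Python check, with names I invented rather than anything from the original post, encoding exactly the constraints the sliding-square chain-of-thoughts seemed to violate: a piece may only move to an adjacent cell, and no two pieces can share a cell.)

```python
from typing import Dict, Tuple

Coord = Tuple[int, int]

def is_legal_move(board: Dict[str, Coord], piece: str, dest: Coord) -> bool:
    """Check the two physical priors for a sliding puzzle drawn as text."""
    src = board[piece]
    adjacent = abs(src[0] - dest[0]) + abs(src[1] - dest[1]) == 1  # spatial locality
    occupied = dest in board.values()                              # no overlapping pieces
    return adjacent and not occupied

board = {"A": (0, 0), "B": (0, 1)}
print(is_legal_move(board, "A", (1, 0)))  # True: adjacent, empty cell
print(is_legal_move(board, "A", (0, 1)))  # False: cell already occupied
print(is_legal_move(board, "A", (2, 2)))  # False: non-local "teleport"
```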
I emphasize this because I worry that any test of generality which invokes 2D space might actually be a test of the strength of spatial priors.
I would love to hear thoughts on (a) whether spatial reasoning is needed to automate coding / get self-improving AI, and (b) examples of clear LLM failures on math/logic reasoning that don't invoke a spatial prior.
(+ hope this is in line with community norms—haven’t really posted on LW much!)