I mean, IIUC the speed prior still cuts against this, since instead of thinking about training, the network could just allocate more capacity to doing the actual task it’s trained to do. That doesn’t seem to change with additional filler thinking tokens.
Agreed that the speed prior still pushes against using a constant fraction of tokens to think about irrelevant stuff.
However, I stand by “it would remove some of the pressure against deceptive alignment coming from speed prior”. In particular, pure forward-pass reasoning has capped serial reasoning time with current architectures, while tokens don’t, which seems potentially important depending on various other considerations.
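To make the serial-depth asymmetry concrete, here’s a minimal sketch (the layer and token counts are made up, not taken from any particular model): a single forward pass has serial depth fixed by the number of layers, while each additional chain-of-thought token adds another forward pass’s worth of serial steps.

```python
# Minimal sketch of the serial-depth asymmetry (all numbers are illustrative).
n_layers = 80        # serial depth of one forward pass, fixed by the architecture
n_cot_tokens = 1000  # extra reasoning tokens generated before the final answer

serial_depth_single_pass = n_layers                 # capped, no matter the task
serial_depth_with_tokens = n_layers * n_cot_tokens  # grows with every extra token

print(serial_depth_single_pass)   # 80
print(serial_depth_with_tokens)   # 80000
```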
Doesn’t this already apply partially to the current work? Sure, you constrain the input embeddings to the valid token embedding vectors, but the model also attends to the previous residual streams (and not just the tokens they output).
As in “current transformers already think in dense vector spaces”? The main thing is that the serial cognition time prior to some text bottleneck is bounded. I totally agree that transformers do a bunch of unseen cognition in a forward pass, but it seems possible to constrain/bound this within-forward-pass cognition in a bunch of useful ways. We probably can’t meaningfully constrain the overall cognitive labor an AI lab needs to do if that cognitive labor were done in an entirely uninterpretable way.
(Let me know if this isn’t what you were trying to talk about; I’m not super confident what you’re pointing at here.)