I suspect there’s a cleaner way to make this argument that doesn’t talk much about the number of “token-equivalents”, but instead contrasts “total FLOP spent on inference” with some combination of:
- “FLOP until human-interpretable information bottleneck”. While models still think in English and don’t know how to do steganography, this should be FLOP per forward pass. But it could be much larger in the future, e.g. if models get trained to think in non-interpretable ways and just output a paper written in English once per week.
- “FLOP until feedback” — how many FLOP does the model do before it outputs an answer and gets feedback on it?
  - Models will probably be trained on a mixture of different regimes here. E.g. “FLOP until feedback” might be proportional to model size during pre-training (because the model gets feedback after each token) and also proportional to chain-of-thought length during post-training.
  - So if you want to collapse it to one metric, you’d want to somehow weight by the number of data points and the sample efficiency of each type of training (see the sketch after this list for one way this could look).
- “FLOP until outcome-based feedback” — same as above, except only counting outcome-based feedback rather than process-based feedback, in the sense discussed in this comment.
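To make the distinction between these quantities concrete, here is a toy numeric sketch. It is my own illustration rather than anything established above: the parameter count, chain-of-thought length, data-point counts, and sample-efficiency weights are all made-up placeholders, and the only substantive assumption is the standard ~2 FLOP per parameter per token approximation for a forward pass.

```python
# Toy sketch (illustrative only): rough FLOP accounting for a hypothetical model,
# using the standard ~2 FLOP per parameter per token approximation for a forward pass.
# All concrete numbers below are made up.

N_PARAMS = 100e9                         # hypothetical 100B-parameter model
FLOP_PER_FORWARD_TOKEN = 2 * N_PARAMS    # ~2 FLOP per parameter per token

# "FLOP until human-interpretable information bottleneck": while the model thinks
# in English, each forward pass ends in a token we can read, so this is one forward pass.
flop_until_bottleneck = FLOP_PER_FORWARD_TOKEN

# "Total FLOP spent on inference" for one answer with a long chain of thought.
COT_TOKENS = 10_000
total_inference_flop = FLOP_PER_FORWARD_TOKEN * COT_TOKENS

# "FLOP until feedback" differs by training regime:
#  - pre-training: feedback (the next-token loss) arrives after every token
#  - outcome-based post-training: feedback arrives only after the whole chain of thought
flop_until_feedback_pretrain = FLOP_PER_FORWARD_TOKEN
flop_until_feedback_posttrain = FLOP_PER_FORWARD_TOKEN * COT_TOKENS

# Collapsing this to one number (as suggested above) needs some weighting by how many
# data points each regime contributes and how sample-efficient it is. The counts and
# efficiencies here are arbitrary placeholders, just to show the shape of the calculation.
regimes = [
    # (flop_until_feedback, num_datapoints, relative_sample_efficiency)
    (flop_until_feedback_pretrain,  10e12, 1.0),    # pre-training tokens
    (flop_until_feedback_posttrain, 1e6,   100.0),  # post-training episodes
]
weights = [n * eff for _, n, eff in regimes]
weighted_flop_until_feedback = (
    sum(f * w for (f, _, _), w in zip(regimes, weights)) / sum(weights)
)

print(f"{flop_until_bottleneck=:.2e}")
print(f"{total_inference_flop=:.2e}")
print(f"{weighted_flop_until_feedback=:.2e}")
```

With these made-up numbers, “total FLOP spent on inference” can grow by orders of magnitude (longer chains of thought) while “FLOP until human-interpretable information bottleneck” stays fixed at one forward pass, which is the contrast the list above is pointing at.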
Having higher “FLOP until X” (for each of the X in the 3 bullet points above) seems to increase danger, while increasing “total FLOP spent on inference” seems to have a much better ratio of increased usefulness to increased danger.
In this framing, I think:
- Based on what we saw of o1’s chains of thought, I’d guess that o1 hasn’t changed “FLOP until human-interpretable information bottleneck”, but I’m not sure about that.
- It seems plausible that o1/o3 use RL, and that the models think for much longer before getting feedback. This would increase “FLOP until feedback”.
- I’m not sure what type of feedback they use. I’d guess that the most outcome-based thing they do is “executing code and seeing whether it passes tests”.