I’ve argued multiple times that Evan was not intending to make a counting argument in function space:
- In discussion with Alex Turner (TurnTrout) when commenting on an earlier draft of this post.
- In discussion with Quintin after sharing some comments on the draft. (Also shared with you TBC.)
- In this earlier comment.
(Fair enough if you never read any of these comments.)
As I’ve noted in all of these comments, people making counting-style arguments (except perhaps in Joe’s report) consistently use terminology that rules out their intending the argument to be about function space. (E.g., people say things like “bits” and “complexity in terms of the world model”.)
(I also think these written-up arguments (Evan’s talk in particular) are very hand-wavy and just provide a vague intuition. So regardless of what he was intending, the actual words of the argument aren’t very solid IMO. Further, using words that rule out the function-space intention doesn’t necessarily imply there is an actually good model behind those words. To actually get anywhere with this reasoning, I think you’d have to reinvent the full argument and think it through in more detail yourself. I also think Evan is substantially wrong in practice, though my current guess is that he isn’t too far off about the bottom line (maybe a factor of 3 off). I think Joe’s report is much better in that it’s very clear what level of abstraction and rigor it’s talking about. From reading this post, it doesn’t seem like you came into this project from the perspective of “is there an interesting recoverable intuition here; can we recover or generate a good argument?”, which would have been considerably better IMO.)
> AFAICT Joe also thought this in his report

Based on my conversations with Joe about the report and his comments here, I think he was just operating from a much vaguer counting-argument perspective. That is, he was talking about the broadly construed counting argument, which can be applied to a wide range of possible inductive biases: for any specific formal model of the situation, a counting-style argument will be somewhat applicable. (Though in practice, we might be able to have much more specific intuitions.)
Note that Joe and Evan have very different perspectives on the case for scheming.
(From my perspective, the correct intuition underlying the counting argument is something like “you only need to compute something which nearly exactly correlates with predicted reward once, while you’ll need to compute many long-range predictions to perform well in training”. See this comment for a more detailed discussion.)
> As I’ve noted in all of these comments, people making counting-style arguments (except perhaps in Joe’s report) consistently use terminology that rules out their intending the argument to be about function space. (E.g., people say things like “bits” and “complexity in terms of the world model”.)

Aren’t these arguments about simplicity, not counting?
> Fair enough if you never read any of these comments.

Yeah, I never saw any of those comments. I think it’s obvious that the most natural reading of the counting argument is that it’s an argument over function space (specifically, over equivalence classes of functions which correspond to “goals”). I also think counting arguments for scheming over parameter space, or over Turing machines, or circuits, or whatever, are all much weaker. So from my perspective, I’m attacking a steelman rather than a strawman.