I found the beginning of this post very confusing because you don’t seem to acknowledge at all that the Speed Prior is a specific idea introduced in 2000, long before AI alignment was a field. (It doesn’t seem like you even reference this paper in the post?) Early in the post, right under the heading “What is the speed prior and why do we care about it?” you say,
“The speed prior is a potential technique for combating formation of deceptive alignment.”
This is a true statement about the Speed Prior, but it doesn’t say what the Speed Prior is, and it’s emphatically not why it was conceived; rather, it states why we (the alignment community) care about it.
My guess about what happened here would be something like:
Paul and others talked a bunch about the Solomonoff prior and its implications for alignment, occasionally mentioning the Speed Prior as a close cousin to the Solomonoff prior.
Over time, people mostly talked about the Speed Prior because it penalizes computation time (an idea that is generally useful for alignment), not because of its formal specification.
Evan picked up on this generalized usage.
Evan mentored you and transferred the phrase “speed prior” as referring to that general concept.
I think this is a great idea for the alignment community to be developing, but we should do so under a term that doesn’t already refer to something specific outside our field. (I think most of my objection would be ameliorated if you consistently use “a speed prior” and “speed priors”.) I’m not too much of a stickler for freezing the usage of terms, but I was genuinely confused by this usage, and I suspect that other alignment researchers would be too.
I just remembered that we can tag users now; I’ll try tagging @evhub to get his opinion.
I agree that there are many speed priors and that “a speed prior” is probably better than “the speed prior.” That being said, the dovetailing speed prior (which is what Schmidhuber is talking about) is usually what I imagine as the default speed prior (e.g. as in the starting point here).