Yeah, I think things like this are reasonable. I think they're maybe too hard and high-level for a lot of the things I care about, though. I'm really interested in questions like “how much less reliable is the model at repeating names when the names are 100 tokens in the past instead of 50”, which are much simpler and lower-level.