Meta: are you republishing this piece from somewhere else? I subscribe to LW (and EAF) with RSS and over the past few days I’ve had all of your previous posts inserted into my feed three times. Is this likely to be some issue with LW, or an integration with your personal blog?
It’s an issue with LW. Since RSS doesn’t provide unique IDs for posts, we currently determine whether a post in an RSS feed is new based on its link field, which seems to have changed twice on Katja’s blog in the past two days for some reason (a bunch of WordPress settings influence this, so she likely changed a setting somewhere).
This definitely isn’t Katja’s fault, and we should improve our algorithm to figure out whether a post in an RSS feed has already been marked as imported.
I notice that I am confused, as RSS has a guid field for precisely this purpose. Is it that LW’s RSS generation does not include it, or is it some other site producing the RSS?
Oh yeah, I remember experimenting with that, though I ended up running into problems similar to those with comparing links in the WordPress case. I remember the ID changing depending on some kind of context, though I don’t remember the exact details (this was some of the first code I wrote for the new LessWrong, so it’s been a while).
I do think this is a pretty straightforwardly solvable problem; we just haven’t put much effort into it, since it hasn’t been much of a problem in the past.
Is it that LW’s RSS generation does not include it, or is it some other site producing the RSS?
This is about RSS imports: we are consuming an RSS feed from an external site and parsing it into a LessWrong post, so we don’t really have control over what data is available.
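For anyone curious what a more forgiving "already imported?" check could look like, here is a minimal Python sketch. It is not LessWrong's actual import code. The function names (`normalize_link`, `candidate_keys`, `find_new_items`) and the idea of matching on any of several keys are assumptions made for illustration. Since both the link and the guid have apparently changed on real feeds, the sketch treats an item as already imported if its guid, its normalized link, or a hash of its title plus publication date has been seen before.

```python
import hashlib
import xml.etree.ElementTree as ET
from urllib.parse import urlsplit


def normalize_link(link):
    """Reduce a URL to host + path (+ query), ignoring scheme, 'www.' and trailing slash."""
    parts = urlsplit(link.strip())
    host = parts.netloc.lower()
    if host.startswith("www."):
        host = host[4:]
    path = parts.path.rstrip("/")
    # Keep the query string: plain WordPress permalinks look like '?p=123'.
    query = "?" + parts.query if parts.query else ""
    return host + path + query


def candidate_keys(item):
    """Collect every identifier we can derive from one RSS <item> element."""
    keys = set()
    guid = (item.findtext("guid") or "").strip()
    if guid:
        keys.add("guid:" + guid)
    link = (item.findtext("link") or "").strip()
    if link:
        keys.add("link:" + normalize_link(link))
    title = (item.findtext("title") or "").strip()
    pub_date = (item.findtext("pubDate") or "").strip()
    if title and pub_date:
        digest = hashlib.sha256((title + "\n" + pub_date).encode("utf-8")).hexdigest()
        keys.add("fingerprint:" + digest)
    return keys


def find_new_items(feed_xml, seen_keys):
    """Return items whose keys are all unseen, and record every key in seen_keys."""
    root = ET.fromstring(feed_xml)
    new_items = []
    for item in root.iter("item"):
        keys = candidate_keys(item)
        # Items with no usable identifier are skipped rather than imported blindly.
        if keys and keys.isdisjoint(seen_keys):
            new_items.append(item)
        seen_keys |= keys
    return new_items
```

The trade-off is that matching on any key makes false "already imported" results more likely (for example, two different posts with the same title and timestamp), in exchange for fewer duplicate imports when a blog's link or guid format changes.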