“I have a list of postings sans dates. Every X days cron runs and the head of the list is popped off into the RSS feed.”
I have a list of postings with dates. Whenever somebody tries to read an RSS feed, I return the entries within the appropriate time window.
IOW, my approach doesn’t store any server-side state. All the state is in the feed URL (specifying the start date). The query is something like:
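-- feed_url_date: the start date given in the feed URL
-- first_post_date: the original date of the earliest post
SELECT (original_post_date - first_post_date + feed_url_date) AS feed_date, title -- plus whatever other columns the feed needs
FROM posts
WHERE original_post_date < (now() - feed_url_date + first_post_date)
ORDER BY original_post_date DESC
LIMIT size_of_feed -- a constant, like 20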
Et voila. No cron. No “list”. No “feed” to have things “popped into”. If ten thousand people subscribe, there is no additional data added to a database or written to disk anywhere. And since the database is read-only, you can replicate and load-balance the service to your heart’s content.
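For the curious, here is a minimal runnable sketch of that query using Python's built-in sqlite3 module. The table layout, the sample rows, and the names build_db and feed_entries are my own assumptions for illustration. SQLite has no native date type, so the date arithmetic goes through julianday(), and "now" is passed in as a parameter to keep the example deterministic.

```python
import sqlite3
from datetime import date

FEED_SIZE = 20  # stands in for the size_of_feed constant


def build_db():
    """Create a throwaway in-memory posts table (hypothetical sample data)."""
    db = sqlite3.connect(":memory:")
    db.execute("CREATE TABLE posts (original_post_date TEXT, title TEXT)")
    db.executemany(
        "INSERT INTO posts VALUES (?, ?)",
        [("2008-01-01", "First"), ("2008-01-08", "Second"), ("2008-01-15", "Third")],
    )
    return db


# Same shape as the query above: shift every post by (start - first), and
# only return posts whose shifted date is already in the past.
QUERY = """
SELECT date(julianday(original_post_date) - julianday(:first) + julianday(:start)) AS feed_date,
       title
FROM posts
WHERE julianday(original_post_date) < julianday(:now) - julianday(:start) + julianday(:first)
ORDER BY original_post_date DESC
LIMIT :size
"""


def feed_entries(db, feed_start, now, size=FEED_SIZE):
    """Return (feed_date, title) rows for a feed whose URL carries feed_start."""
    first = db.execute("SELECT min(original_post_date) FROM posts").fetchone()[0]
    params = {
        "first": first,
        "start": feed_start.isoformat(),
        "now": now.isoformat(),
        "size": size,
    }
    return db.execute(QUERY, params).fetchall()
```

A subscriber who started on 2020-06-01 and fetches on 2020-06-10 would get "Second" (dated 2020-06-08) and "First" (dated 2020-06-01), newest first. Nothing is written anywhere in the process.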
In addition, my approach can be trivially extended to use an ETag or a Last-Modified date that contains the date of the next post, and then avoid doing the query at all if that date hasn’t been reached yet. (Most RSS clients echo the information from the last response back in an If-None-Match or If-Modified-Since header so that they can skip reparsing, and this would allow the system to simply say, “nah, nothing’s changed” and not re-run the query.)
And it’s still scalable via replication—you can have as many clones running as you want, and they’ll all answer the same thing about the given feed URL (within the accuracy of their clock synchronization, of course).
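A rough sketch of the conditional-request idea, assuming the ETag variant: the server encodes the feed-time of the next unpublished post into the ETag, and when the client echoes it back, the server can decide on a 304 without touching the database. The "next-" tag format and the function names make_etag and unchanged are hypothetical, and wiring this into a real web framework is left out.

```python
from datetime import datetime, timezone


def make_etag(next_post_at):
    """Encode the feed-time of the next unpublished post as a quoted ETag."""
    return '"next-%d"' % int(next_post_at.timestamp())


def unchanged(if_none_match, now):
    """True if the echoed ETag says the next post is not due yet (answer 304)."""
    if not if_none_match:
        return False  # first request, or a client that sends no validator
    tag = if_none_match.strip().strip('"')
    if not tag.startswith("next-"):
        return False  # not one of ours; fall back to running the query
    try:
        due = int(tag[len("next-"):])
    except ValueError:
        return False
    return now.timestamp() < due


# Usage: the tag sent with one response decides the fate of the next request.
due = datetime(2020, 6, 8, tzinfo=timezone.utc)
tag = make_etag(due)
early = unchanged(tag, datetime(2020, 6, 7, tzinfo=timezone.utc))  # skip the query
late = unchanged(tag, datetime(2020, 6, 9, tzinfo=timezone.utc))   # run the query
```

Since the tag is derived purely from the feed URL and the post dates, every replica computes the same one.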
Et voila.
Actually, this approach is so simple that you don’t even need a real SQL database—Google App Engine’s simple database API would suffice. Heck, the “database” itself is probably small enough to be embedded entirely within the source code, if you did a titles-only feed. ;-)
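To illustrate the embedded version, here is a sketch of a titles-only feed where the whole "database" is a tuple in the source. The sample posts, the choice to store day offsets from the first post, and the name titles_feed are all assumptions made for the example.

```python
from bisect import bisect_left
from datetime import date, timedelta

# (days after the first post, title), sorted by offset; hypothetical sample data
POSTS = ((0, "First"), (7, "Second"), (14, "Third"))
OFFSETS = [offset for offset, _ in POSTS]


def titles_feed(feed_start, today, size=20):
    """Return (feed_date, title) pairs, newest first, for a feed begun on feed_start."""
    elapsed = (today - feed_start).days
    # Number of posts whose offset is strictly less than the elapsed days
    visible = bisect_left(OFFSETS, elapsed)
    newest_first = reversed(POSTS[max(0, visible - size):visible])
    return [(feed_start + timedelta(days=offset), title) for offset, title in newest_first]
```

Storing offsets rather than dates means the first_post_date subtraction is done once, ahead of time, and the lookup is a binary search over a list.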
Easier? Hm?