Tuesday, June 29, 2004

RSS, Shrook and Distributed checking

I have been using an RSS aggregator/reader for about two months, and I really like it. It seems that RSS could be the next way that information is distributed. For example, an RSS aggregator/reader can pull timely news from many different news sources. I can also see RSS feeds replacing email newsletters. Email newsletters have a problem: messages get missed, either because people receive too much spam or because their spam filters are too aggressive and filter out the newsletters. RSS feeds can also help a web publisher save bandwidth, since the RSS XML can be smaller than the formatted HTML for the same pages.

Now it seems that RSS feeds will become easier for everyone to use. Apple is including an RSS reader in Safari 2.0 for Mac OS X 10.4, so I think that more and more people will be using RSS feeds.

This does cause a problem though. Most aggregator/readers check for new
content on a fixed schedule, and most people leave them running, so
this could increase the traffic to web servers. For example, if 100
people read the same RSS feed, and each has their reader set up to
check the feed every 30 minutes, there will be 200 hits every hour.
That is more traffic than necessary, especially if the content does not
get updated all that often, and it grows in step with the number of
readers, so it does not scale well at all.
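
To make the arithmetic above concrete, here is a small Python sketch. The function name is my own, and it assumes every reader polls independently on the same interval, which is a simplification.

```python
def hits_per_hour(subscribers: int, interval_minutes: float) -> float:
    """Requests per hour for one feed when every reader polls on its own."""
    polls_per_reader = 60 / interval_minutes
    return subscribers * polls_per_reader


# The example from the paragraph above: 100 readers, 30 minute interval.
print(hits_per_hour(100, 30))     # 200.0

# The load grows linearly with the audience.
print(hits_per_hour(10_000, 30))  # 20000.0
```

The hits happen whether or not the feed has changed, which is why the number of readers, not the number of updates, drives the load.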

I am currently using Shrook as my RSS reader. It has an interesting
feature called Distributed Checking. With this feature, not every
client has to check each RSS feed. Only one client at a time checks a
feed for new content, and the other clients are then told to reload it.
Here is the text from their FAQ:

So how does it work?

To oversimplify: A central server maintains a database of when each
channel was last updated. To keep it up to date, every so often, the
server chooses a computer to check for new items and report back. The
frequency of this varies from every 5 minutes for popular channels, to
every half hour for channels with only one online subscriber, and it
tries to use a different computer each time. At the other end, each
copy of Shrook checks in with the server every 5 minutes, and if any of
its channels are out of date, it reloads them.
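
Here is a rough Python sketch of how I read that description. This is not Shrook's actual code or protocol: the class, the method names, the in-memory dictionaries and the `popular_threshold` cutoff are all my own assumptions, and the FAQ does not say how intervals between 5 and 30 minutes are chosen.

```python
class CheckServer:
    """Toy stand-in for the central server described in the FAQ."""

    def __init__(self, popular_threshold=10):
        # Hypothetical cutoff for what counts as a "popular" channel.
        self.popular_threshold = popular_threshold
        self.last_updated = {}  # channel URL -> time of newest known item
        self.subscribers = {}   # channel URL -> list of online client ids
        self._turn = {}         # channel URL -> index of next client to ask

    def subscribe(self, channel, client_id):
        self.subscribers.setdefault(channel, []).append(client_id)
        self.last_updated.setdefault(channel, 0)

    def check_interval_minutes(self, channel):
        # FAQ: 5 minutes for popular channels, half an hour for a channel
        # with one subscriber. Everything in between is guessed here.
        n = len(self.subscribers.get(channel, []))
        return 5 if n >= self.popular_threshold else 30

    def pick_checker(self, channel):
        # Round-robin, so a different computer is asked each time.
        clients = self.subscribers[channel]
        i = self._turn.get(channel, 0) % len(clients)
        self._turn[channel] = i + 1
        return clients[i]

    def report(self, channel, newest_item_time):
        # The chosen client fetched the feed and reports what it saw.
        if newest_item_time > self.last_updated[channel]:
            self.last_updated[channel] = newest_item_time

    def stale_channels(self, known):
        # Client check-in: `known` maps channel -> time the client last
        # loaded it. Returns the channels the client should reload.
        return [c for c, t in known.items()
                if self.last_updated.get(c, 0) > t]


feed = "http://example.com/feed.xml"  # placeholder URL
server = CheckServer()
for client_id in ("a", "b", "c"):
    server.subscribe(feed, client_id)

picked = [server.pick_checker(feed) for _ in range(4)]  # a, b, c, then a again
server.report(feed, 100)                   # one client found a newer item
stale = server.stale_channels({feed: 50})  # this client is behind, so it reloads
```

If this reading is right, a popular feed is polled about 12 times an hour (once every 5 minutes) no matter how many people subscribe, plus one reload per client when something actually changes. The every-5-minutes check-ins go to the Shrook server, not to the publisher.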

I hope that they will open source this server so that other RSS readers can use it too.