Cook Computing

RSS and Bandwidth

June 3, 2002 Written by Charles Cook

Jason Kottke makes a couple of points about the new use of the <link> element for automatic discovery of RSS feeds.

(1) I just tested with IE6 and Mozilla, and it appears that use of the <link> element does not result in the linked document being downloaded when the main document is displayed in the browser. However, this may be a quirk of the browser configurations on my machine, so it does not prove anything definitively.

(2) Once the aggregator has extracted the RSS link from the web page I would imagine it would only access the RSS file from then on. Displaying the web page would only happen if the user wanted to look at this after reading the text contained in the RSS file. But in a wider sense the point about minimising bandwidth usage is a valid one. Many people host their web sites on a provider's server and they have to pay for this. In most cases there is a limit on monthly bandwidth usage and if you have a popular blog this could soon be reached, which results in extra cost.
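As a rough illustration of the extraction step in (2), here is a minimal Python sketch of how an aggregator might pull the feed URL out of a page. It assumes the page advertises its feed with rel="alternate" and type="application/rss+xml" on the <link> element; the FeedLinkFinder and find_feeds names and the example.com page are made up for the example.

```python
from html.parser import HTMLParser


class FeedLinkFinder(HTMLParser):
    """Collects the href of every <link> element that advertises an RSS feed."""

    def __init__(self):
        super().__init__()
        self.feed_urls = []

    def handle_starttag(self, tag, attrs):
        if tag != "link":
            return
        # attrs is a list of (name, value) pairs; value is None for bare attributes.
        a = {name: (value or "") for name, value in attrs}
        rel = a.get("rel", "").lower().split()
        # Assumed convention: rel="alternate" plus the RSS media type.
        if "alternate" in rel and a.get("type", "").lower() == "application/rss+xml":
            if a.get("href"):
                self.feed_urls.append(a["href"])


def find_feeds(html):
    """Returns the feed URLs advertised in an HTML document, in document order."""
    finder = FeedLinkFinder()
    finder.feed(html)
    return finder.feed_urls


# Hypothetical page, for illustration only.
page = """<html><head>
<link rel="stylesheet" href="/style.css">
<link rel="alternate" type="application/rss+xml" title="RSS" href="http://example.com/index.xml">
</head><body>...</body></html>"""

print(find_feeds(page))  # ['http://example.com/index.xml']
```

Once the URL is known, the aggregator only needs to remember it and never has to fetch the HTML page again for discovery.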

RSS at first sight seems to offer a way round this. The RSS file contains a summary of each item in the <description> element and an associated URL for the content of the item in the <link> element. The content is only downloaded when the user is interested by the description. So you have small RSS files pointing to possibly large files containing content, i.e. low bandwidth usage for accessing RSS files.
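To make the summary-plus-link arrangement concrete, here is a small Python sketch that reads the <description> and <link> of each item from a feed. The feed text, its URLs and the summaries function are invented for the example; a real feed would be fetched over HTTP.

```python
import xml.etree.ElementTree as ET

# Hypothetical feed, for illustration only: a short summary plus a URL per item.
rss = """<rss version="0.92"><channel>
<title>Example blog</title>
<link>http://example.com/</link>
<description>Illustration only</description>
<item>
<title>First post</title>
<link>http://example.com/posts/1.html</link>
<description>A one-line summary of the post.</description>
</item>
</channel></rss>"""


def summaries(rss_text):
    """Returns a (description, link) pair for each <item> in the feed."""
    root = ET.fromstring(rss_text)
    return [
        (item.findtext("description", ""), item.findtext("link", ""))
        for item in root.iter("item")
    ]


for description, link in summaries(rss):
    # The aggregator shows the summary; the linked page is fetched only on request.
    print(description, "->", link)
```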

But in reality the <description> element is increasingly being used to contain the complete content of each item, so that people can read a blog in their aggregator in preference to reading the content in the blog's web pages. Therefore the bandwidth usage is pretty much the same as if the original web pages were viewed, i.e. high bandwidth usage.

It would be preferable if the RSS did not contain the content, but people seem to like reading blogs via an aggregator. Therefore if each RSS item could be identified uniquely and contained a URL to the content in a format suitable for viewing in the aggregator (not the same as what the <link> element is currently used for), the aggregator could cache the content for each item and we could return to small RSS files and lower bandwidth usage: the content would still be downloaded by the aggregator but only once. The <description> element could also revert to its proper usage.
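The caching idea can be sketched in a few lines of Python. This is only an illustration of the proposal, not a description of any existing aggregator: the ContentCache class, the item identifier and the content URL are all hypothetical, and the fetch function is passed in so the example runs without a network.

```python
class ContentCache:
    """Downloads the content for an item at most once, keyed by the item's unique id."""

    def __init__(self, fetch):
        self._fetch = fetch  # callable taking a URL and returning the content
        self._store = {}     # unique item id -> cached content

    def get(self, item_id, content_url):
        # Only go to the network the first time this item id is seen.
        if item_id not in self._store:
            self._store[item_id] = self._fetch(content_url)
        return self._store[item_id]


# Stand-in for an HTTP download that records how often it is called.
downloads = []


def fake_fetch(url):
    downloads.append(url)
    return "content of " + url


cache = ContentCache(fake_fetch)
first = cache.get("item-1", "http://example.com/content/1.xml")
second = cache.get("item-1", "http://example.com/content/1.xml")
print(len(downloads))  # 1
```

However often the small RSS file is re-read, the larger content for each item crosses the wire once per aggregator.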

(And, regardless of the above, all aggregators should be using ETags and the If-Modified-Since HTTP header to reduce the number of times an RSS file is downloaded.)
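For what it is worth, a conditional request is not much code. The Python sketch below assumes the aggregator has saved the ETag and Last-Modified values from its previous download; the ETag is sent back in the If-None-Match header, and the server answers 304 Not Modified with no body if the feed is unchanged. The function names are made up for the example.

```python
import urllib.error
import urllib.request


def conditional_headers(etag=None, last_modified=None):
    """Builds the request headers from the validators saved on the last download."""
    headers = {}
    if etag:
        headers["If-None-Match"] = etag
    if last_modified:
        headers["If-Modified-Since"] = last_modified
    return headers


def fetch_feed(url, etag=None, last_modified=None):
    """Returns (body, etag, last_modified); body is None if the feed is unchanged."""
    request = urllib.request.Request(
        url, headers=conditional_headers(etag, last_modified)
    )
    try:
        with urllib.request.urlopen(request) as response:
            return (
                response.read(),
                response.headers.get("ETag"),
                response.headers.get("Last-Modified"),
            )
    except urllib.error.HTTPError as err:
        # urllib reports 304 Not Modified as an HTTPError; nothing was re-sent.
        if err.code == 304:
            return None, etag, last_modified
        raise
```

The aggregator would store the returned ETag and Last-Modified values alongside the feed URL and pass them to fetch_feed on the next poll.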