LANGUAGES: VB.NET | C#
ASP.NET VERSIONS: 2.0+
Turn RSS Feeds into Standard Data Sources
By Steve C. Orr
The World Wide Web is overflowing with potentially useful data. In addition to the HTML output with which we are all familiar, many Web sites expose content and data in other consumable formats as well. In the following paragraphs you'll learn how to use the components contained within the free, open source RSS Toolkit to manipulate RSS data feeds.
What Is RSS?
Really Simple Syndication (RSS) is an XML-based format that provides a simple way to publish new content notifications along with summaries about that content. There's a nearly infinite supply of RSS feeds available on the Internet. When a user visits a Web site and sees a symbol similar to the one in Figure 1, it's an indication that an RSS feed is available to which they may want to subscribe. Figure 2 shows a sample RSS feed.
Figure 1: This little symbol indicates to users that the Web site they are viewing has an RSS feed available.
Figure 2: An example RSS feed.
These days many companies, software applications, and Web sites also utilize RSS feeds from around the Web to provide topical content to their users. In fact, several of today's most visited Web sites rely almost entirely on the content of others. For example, successful newcomers like Feedburner, Digg, and Reddit continue to prove that Google doesn't have a monopoly on useful perspectives of the Web we all share.
RSS has been the de facto standard content distribution format since the early days of XML. Its star status has recently been propelled to a whole new level with the emergence of Service Oriented Architectures (SOA) and Cloud Computing. The latest Web service APIs on this front commonly support the option of having resulting data returned in RSS format. Such recent developments are pushing the RSS format beyond its original syndication purposes and into a more general purpose data fulfillment role.
Consuming RSS Feeds
Now that RSS feeds are becoming a common data source, questions begin to surface about the best ways to work with them programmatically.
Because RSS is essentially just text that has been formatted in very specific ways, basically any brute force parsing technique could be used to extract needed bits of data. However, this is rarely an optimal approach.
RSS is an XML-based format. XML has been ubiquitous for so long now that there are almost too many options for processing XML data. Even the most obscure platforms and programming languages are bound to have a variety of time-saving XML libraries at their disposal. The .NET Framework contains most such functionality inside the System.Xml and System.Data namespaces.
While nearly any generic XML library is likely to help with processing RSS data, in most cases custom code must still be added to handle the RSS-specific elements (channels, items, and their fields) layered on top of plain XML.
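To make that point concrete, here is a minimal sketch of the kind of custom code a generic XML library still requires. It uses only System.Xml; the RssTitles class and its FromXml method are hypothetical names invented for this illustration, and the XPath assumes a plain RSS 2.0 document with no namespaced extensions.

```csharp
using System;
using System.Collections.Generic;
using System.Xml;

// Hypothetical helper for illustration; not part of the RSS Toolkit.
public static class RssTitles
{
    // Returns the text of every <item>/<title> element in an RSS 2.0 document.
    public static List<string> FromXml(string rssXml)
    {
        XmlDocument doc = new XmlDocument();
        doc.LoadXml(rssXml);

        List<string> titles = new List<string>();
        // Core RSS 2.0 elements carry no namespace, so a plain XPath works.
        foreach (XmlNode node in doc.SelectNodes("/rss/channel/item/title"))
        {
            titles.Add(node.InnerText);
        }
        return titles;
    }
}
```

Even this small helper has to know where items live and which child elements matter, and it covers only one field of one format. Links, dates, images, and other feed variations would each need similar hand-written handling.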
In last month's column I demonstrated that XSLT can be used for transforming raw RSS data into other formats (such as HTML) that are more attractive to end users (see XML Transformations). However, in most cases, XSLT still is a fairly laborious undertaking when the goal is to simply work with RSS data as if it were coming from any other standard data source (like a database).
ASP.NET includes several controls that provide an excellent model for working with data originating from disparate sources. For example, the SqlDataSource, ObjectDataSource, and SiteMapDataSource controls all process varying data structures into more standard ones. This useful encapsulation allows developers to consistently employ common data processing techniques (like data binding, DataSets, etc.), regardless of the original source of the data.
This analysis leads to the conclusion that a carefully designed RssDataSource control would be the optimal solution for ASP.NET developers working with RSS data. Unfortunately, Microsoft has not provided such a control. Fortunately, an enterprising developer from their ASP.NET development team took the initiative to develop one on his own.
The RSS Toolkit
Dmitry Robsman is the creator of the free RSS Toolkit, which holds the RssDataSource Web control as its centerpiece. He generously donated the toolkit's code to the open source community via http://CodePlex.com. Its freely available C# source code is compatible with ASP.NET 2.0 and above. Several code samples are included to help you get started. Because this tool was not officially provided by Microsoft, it also is not officially supported by Microsoft.
After downloading the RSS Toolkit from http://www.codeplex.com/ASPNETRSSToolkit, you need only add the included RSSToolkit.dll to your Visual Studio toolbox. This can be accomplished via the Choose Items option on the toolbox's right-click menu, which enables you to then browse to the DLL (which can be found in the toolkit project's bin folder).
While the DLL can optionally be registered in the Global Assembly Cache (GAC), it can just as easily be placed in a project's bin folder, which makes XCopy deployment a snap. The RSS Toolkit was designed to work in medium trust scenarios, so using it on a shared host won't typically be an issue. Such hosts need only support remote outbound HTTP requests, which is a relatively common find.
The RssDataSource Control
As you'd expect, the RssDataSource control has a familiar design that is consistent with the data source controls included with ASP.NET. A basic ASPX declaration is all that's needed to specify the source location of the RSS data feed (via its Url property). The only other property is MaxItems, which was recently added to provide an ability to limit the number of items retrieved from the specified feed:
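A declaration along these lines should do it. Treat it as a sketch rather than the definitive markup: the cc1 tag prefix and the control ID are assumptions (Visual Studio generates the prefix and the matching Register directive when the control is dragged from the toolbox), and the MaxItems value is purely illustrative.

```aspx
<%-- Sketch only: "cc1" is an assumed tag prefix and "RssDataSource1" an assumed ID.
     Url and MaxItems are the two properties described in the text. --%>
<cc1:RssDataSource ID="RssDataSource1" runat="server"
    Url="http://SteveOrr.net/rss.aspx"
    MaxItems="5" />
```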
Standard data binding techniques can then be implemented to bind controls to this data source, such as the purely declarative example shown here that binds a GridView control to the RssDataSource control declared above:
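A minimal version of such a GridView might look like the following. The DataSourceID attribute is the standard ASP.NET mechanism for declarative binding; the ID it points to (RssDataSource1) is an assumed name for the data source control and must match whatever ID that control actually carries on your page.

```aspx
<%-- Sketch only: DataSourceID must match the ID of the RssDataSource on the page
     ("RssDataSource1" is assumed here). Columns are generated automatically. --%>
<asp:GridView ID="GridView1" runat="server"
    DataSourceID="RssDataSource1" />
```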
With just a splash of color applied, the two simple declarations above result in the output displayed in Figure 3.
Figure 3: Two simple ASPX declarations are all it takes to display potentially useful data from a remote RSS feed.
All the familiar Visual Studio data source wizards work just as you'd expect with this new data source control. Available data columns are automatically fetched from the remote data feed at design time. The resulting data fields can all be adjusted in any way imaginable, just as if they were coming from any other standard data source.
Figure 4: The RSS feed's available data fields are automatically retrieved and fully editable.
In addition to the declarative techniques mentioned above, RSS feeds also can be retrieved and bound programmatically. The following C# code snippet shows how the RSS Toolkit's RssDocument class can be used to accomplish this:
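// Assumption: RssDocument.Load(Uri) is the toolkit's URL-loading entry point.
string sLoc = "http://SteveOrr.net/rss.aspx";
RssToolkit.Rss.RssDocument rss =
    RssToolkit.Rss.RssDocument.Load(new System.Uri(sLoc));
Image1.ImageUrl = rss.Channel.Image.Url;
Repeater1.DataSource = rss.SelectItems();
Repeater1.DataBind();  // nothing is rendered until the data is actually bound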
This technique can be especially useful for non-Web applications, such as a command line or Windows Forms application.
RSS feeds can be retrieved via URL (as shown) or loaded directly from an XmlReader, or even a string. Once a data feed has been loaded into the RssDocument object, it can be converted directly to a DataSet using the ToDataSet method. It also can be exported to several supported XML formats using its ToXml method.
Because retrieving and processing remote XML files can be rather processor intensive, it's a good thing that an efficient caching mechanism has been built into the control. Instead of continually fetching the remote RSS feed upon each page request, this data can instead be retrieved from a local cache. This local cache is kept in memory and also is persisted to disk (so it can be utilized even after process restarts).
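The general idea behind such a cache can be sketched in a few lines of C#. To be clear, this is not the toolkit's implementation: FeedCache and all of its members are hypothetical names, the sketch covers only the in-memory half (no disk persistence), and the download function and clock are passed in so the logic stays self-contained.

```csharp
using System;
using System.Collections.Generic;

// Hypothetical illustration of a time-to-live cache; not the RSS Toolkit's code.
public sealed class FeedCache
{
    private sealed class Entry
    {
        public string Xml;
        public DateTime ExpiresUtc;
    }

    private readonly Dictionary<string, Entry> _entries = new Dictionary<string, Entry>();
    private readonly Func<string, string> _fetch;  // downloads the feed text for a URL
    private readonly Func<DateTime> _utcNow;       // injectable clock
    private readonly TimeSpan _ttl;                // how long a cached copy stays fresh

    public FeedCache(Func<string, string> fetch, TimeSpan ttl, Func<DateTime> utcNow)
    {
        _fetch = fetch;
        _ttl = ttl;
        _utcNow = utcNow;
    }

    public string Get(string url)
    {
        // A single lock keeps the sketch simple; it also serializes downloads.
        lock (_entries)
        {
            DateTime now = _utcNow();
            Entry entry;
            if (_entries.TryGetValue(url, out entry) && entry.ExpiresUtc > now)
            {
                return entry.Xml;  // still fresh: no remote request is made
            }

            // Missing or expired: fetch once and remember the new expiry time.
            entry = new Entry { Xml = _fetch(url), ExpiresUtc = now + _ttl };
            _entries[url] = entry;
            return entry.Xml;
        }
    }
}
```

With a 30-minute time-to-live, any number of page requests inside that window trigger only one download. The first request after the window closes triggers the next one.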
Related configuration values can be adjusted in the appSettings section of the web.config file. Below, the time-to-live value is set to 30 minutes. This default will be used in cases where the RSS data does not explicitly provide such a value. If no value is specified, the default will be 1 minute:
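A web.config fragment along the following lines would express those settings. The rssTempDir key is the one named in this article; the defaultRssTtl key name is an assumption recalled from the toolkit and should be verified against its source or documentation, and the directory path is only a placeholder.

```xml
<appSettings>
  <!-- Assumed key name: default time-to-live in minutes, used when the feed supplies none -->
  <add key="defaultRssTtl" value="30" />
  <!-- Location of the on-disk cache; the path shown is a placeholder -->
  <add key="rssTempDir" value="C:\Temp" />
</appSettings>
```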
The second value (specified by the rssTempDir key) can be used to configure the location where the local disk cache should be stored. Cache files saved in this location can be identified by their .feed file extension. In most cases, this configuration value is not strictly necessary because the RssDataSource control contains logic that will automatically find a usable temp directory on the server.
The RssHyperlink Control
Aside from the RssDataSource control, the RssHyperlink is the only other Web control included with the RSS Toolkit. A hyperlink is displayed for each RssHyperlink control placed on a page, enabling users to view the RSS feed associated with that control.
Additionally, the existence of one or more RssHyperlink controls on a page also serves to inform modern Web browsers that RSS feeds are available for the current Web site. The browser's RSS symbol shown in Figure 1 then springs to life, allowing users to easily subscribe to the feed(s) in a consistent and familiar way. The RssHyperlink control implements this feature by automatically placing a link tag in the header section of the page's HTML. Such standardized RSS link tags look similar to this:
<link rel="alternate" type="application/rss+xml" title="Some Web Site's RSS Feed" href="..." />
The RssHyperlink control's optional ChannelName property can be used to specify a particular channel within the RSS feed, if applicable:
<cc1:RssHyperLink ID="RssHyperLink1" runat="server" ChannelName="FAQ" IncludeUserName="True" NavigateUrl="MyRss.xml">Click Here For RSS</cc1:RssHyperLink>
The optional IncludeUserName property can be used to pass the user's credentials to the RSS feed. The default value of the Boolean IncludeUserName property is false. When this property is set to true, any forms authentication credentials associated with the user will be passed to the RSS feed, allowing a custom view of the data based on the user's profile and/or preferences. More specifically, an encrypted, Base64 encoded version of the user's FormsAuthenticationTicket will be passed to the feed via querystring. Programmatically generated feeds can then utilize the RSS Toolkit's RssHttpHandlerBase class to extract the credentials and filter the data appropriately.
The RSS Toolkit provides a variety of other noteworthy features that cannot be covered here in much detail because of space limitations.
For example, a build provider is included that can automatically create a strongly typed object model of any RSS feed. Of course, the main benefits of strongly typed objects include design-time IntelliSense, improved performance, and the ability to catch many common errors at compile time instead of run time.
While consuming RSS feeds is certainly a valuable feature, publishing your own feeds can be just as valuable. With this in mind, the object model included within the RSS Toolkit can also assist in the programmatic creation of custom RSS feeds.
The latest (2.0) version of the RssDataSource control also supports a few variations of the RSS syndication format. These variations (Atom, RDF, and OPML) are automatically detected and encapsulated by the control. The RssDocument class described earlier can be used to convert between these formats.
Other new 2.0 features include automatic conversion from relative URLs to absolute URLs. Images and RSS extensions are also now fully supported.
RSS has long been the standard when it comes to syndication formats on the Web. As Service Oriented Architectures continue to exert their growing dominance in the software development world, RSS is becoming ever more prominent.
The ubiquity of RSS makes it a worthy investment of limited development resources. By mastering this standard data exchange format, it will be easier than ever to swap content and data with Web sites and companies scattered across the globe.
The free, open source RSS Toolkit makes this easier than ever with the code and controls contained within. The RssDataSource control makes it easy to bind standard ASP.NET controls to RSS feeds in familiar and intuitive ways. Its caching capabilities ensure that such functionality needn't become a performance bottleneck. The RSS Toolkit and its built-in RssHyperlink control also help to ease pains associated with publishing custom RSS feeds.
Now that you've had a thorough initiation into this realm, I encourage you to continue your journey by exploring the useful resources listed in the References sidebar.
Steve C. Orr is an ASPInsider, MCSD, Certified ScrumMaster, Microsoft MVP in ASP.NET, and author of Beginning ASP.NET 2.0 AJAX (Wrox). He's been developing software solutions for leading companies in the Seattle area for more than a decade. When he's not busy designing software systems or writing about them, he can often be found loitering at local user groups and habitually lurking in the ASP.NET newsgroup. Find out more about him at http://SteveOrr.net.
- RSS Toolkit home page: http://www.codeplex.com/ASPNETRSSToolkit
- Dmitry Robsman's blog: http://blogs.msdn.com/dmitryr/
- RSS Specification: http://validator.w3.org/feed/docs/rss2.html
- RSS & ASP.NET: http://SteveOrr.net/articles/RSS.aspx
- My RSS Feed: http://SteveOrr.net/rss.aspx
- MSDN RSS Feed: http://www.microsoft.com/feeds/msdn/en-us/rss.xml
- Yahoo RSS Feeds: http://news.yahoo.com/rss