Creating an Online RSS News Aggregator with ASP.NET Part 1

With the rise of always-on Internet connections in homes and businesses, and the continued explosive growth of the World Wide Web and Internet-accessible applications, it is becoming more and more important for applications to be able to share data with each other. Sharing data among disparate platforms requires a platform-neutral data format that can be easily transmitted via standard Internet protocols; this is where XML fits in. Since XML files are essentially simple text files with well-known encodings, and since XML parsers exist for all commonly used programming languages, XML data can be easily consumed by any platform.

A good example of data sharing using XML is Web site syndication, commonly found in news sites and Web logs. With Web site syndication, a Web site publishes its latest content in an XML-formatted, Web-accessible syndication file. There are a number of syndication formats in use, one of the more popular ones being RSS 2.0. (The RSS 2.0 Specification is published online at the Technology at Harvard Law site.) For example, MSDN Magazine has a syndication file, MSDN Magazine: Current Issue, which lists the most recent MSDN Magazine articles with links to the online versions.

Once a Web site has publicly published a syndication file, various clients may decide to consume it, and there are a number of ways to do so. Someone who runs a .NET resource Web site might want to add the latest MSDN Magazine article headlines to their Web site. Syndication files are also commonly consumed by news aggregator applications, which are applications designed specifically to retrieve and display syndication files from a variety of sources.

With this growing emphasis on XML data, being able to work with XML data in an ASP.NET Web page is more pertinent now than ever before. Since Web site syndication is becoming all the rage, in this article we'll build a Web site syndication file generator as well as an online news aggregator application.

As we work on these two mini-projects throughout this article, we'll examine how to access and display XML data from both a remote Web server and the local file system. We'll look at how to display XML data in a myriad of ways, such as using a Repeater control and using the ASP.NET XML Web control. Since I do not have limitless space for this article, I will assume that you are already familiar with XSLT and XPath. If this is not the case, consider reading the following resources before continuing with this article:

  • FAQ: What is XSLT and How Does it Relate to XML?
  • XSL Tutorial
  • XPath Tutorial
  • Syndicating Content with RSS 2.0

The first mini-application we will be building in this article is a syndication file generator. For this mini-application, imagine that you work as a Web developer for a large news site (like MSNBC.com) where all of the news stories are stored in a Microsoft SQL Server 2000 database. Specifically, the articles are stored in a table called Articles with the following germane fields:

  • ArticleID: an auto-increment primary key integer field uniquely identifying each article,
  • Title: a varchar(50), specifying the title of the news item,
  • Author: a varchar(50), specifying the author of the news item,
  • Description: a varchar(2000), providing a more in-depth description of the news item, and
  • DatePublished: a datetime indicating the date the news item was published.

Note that there might be other fields in the Articles table, but those listed above are the only fields we are interested in using for syndication. Furthermore, this is a very simplified data model; in a real-world setting you would likely have a more normalized database, such as having a separate table for authors, a many-to-many table joining authors and articles, and so on.

Our next step is to create an ASP.NET Web page that will display a list of the most recent news items as a properly formatted RSS 2.0 XML file. Before examining how to accomplish this transformation in an ASP.NET Web page, let's first take a moment to examine the RSS 2.0 specification. While looking over the specification, keep in mind that RSS is designed to provide a data model to syndicate content. Not surprisingly, then, it has a series of XML elements for information about the Web site syndicating the content, as well as a series of XML elements to describe a particular news item.

Finally, don't forget that RSS syndication files, like any XML-formatted file, must adhere to XML formatting guidelines, namely that:

  • All XML elements be properly nested,
  • All attribute values be quoted, and
  • All instances of <, >, &, " and ' be replaced with &lt;, &gt;, &amp;, &quot; and &apos;, respectively.
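
The escaping rule in the list above can be sketched in a few lines. This is only an illustration written in Python rather than the ASP.NET code this article builds; the helper name escape_xml and the sample string are assumptions made for the example, while escape is the standard library function from xml.sax.saxutils.

```python
from xml.sax.saxutils import escape


def escape_xml(text):
    """Hypothetical helper: replace the five special XML characters.

    escape() already handles &, < and > (ampersands first, so nothing is
    escaped twice); the extra mapping adds the two quote characters.
    """
    return escape(text, {'"': "&quot;", "'": "&apos;"})


# Sample text invented for this sketch.
escaped = escape_xml('Q&A: "Repeater" vs <DataGrid>')
print(escaped)  # Q&amp;A: &quot;Repeater&quot; vs &lt;DataGrid&gt;
```

Note that an XML library that builds the document for you will normally apply this escaping itself; escaping by hand matters mostly when the XML is assembled through string concatenation.
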

Furthermore, XML files are case-sensitive, meaning that the opening and closing tags for an XML element must match in case as well as in spelling.

The root element in an RSS 2.0 file is the <rss> element. You can provide the version number in this element like so: <rss version="2.0"> ... </rss>

The <rss> element has a single child element, <channel>, which describes the syndicated content. Inside the <channel> element there are three required child elements that describe the syndicating Web site. These three elements are:

  • title: the name of the syndication file, which typically includes the Web site's name,
  • link: the URL to the Web site, and
  • description: a short description of the Web site.
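
As a rough sketch of the structure just described, the following builds an <rss> root with a single <channel> containing the three required elements. It is written in Python with the standard xml.etree.ElementTree module purely for illustration, not as the ASP.NET page this article goes on to create; the function name build_channel is an assumption of this example.

```python
import xml.etree.ElementTree as ET


def build_channel(title, link, description):
    """Hypothetical helper: return an <rss> element wrapping one <channel>."""
    rss = ET.Element("rss", {"version": "2.0"})
    channel = ET.SubElement(rss, "channel")
    # The three required children that describe the syndicating Web site.
    for tag, text in (("title", title), ("link", link), ("description", description)):
        ET.SubElement(channel, tag).text = text
    return rss


feed = build_channel(
    "Latest DataWebControls.com FAQs",
    "http://datawebcontrols.com",
    "This is the syndication feed for the FAQs at DataWebControls.com",
)
# encoding="unicode" returns a str without an XML declaration.
xml_text = ET.tostring(feed, encoding="unicode")
print(xml_text)
```

ElementTree escapes special characters in element text when serializing, so the values passed in need no manual escaping here.
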

There are a number of optional elements to describe the Web site as well; see the RSS 2.0 Specification for more information on these elements.

Each news item is placed within an individual <item> element. The <channel> element can have an arbitrary number of <item> elements. Each <item> element can have a variety of child elements, the only requirement being that the <item> contain at minimum either the <title> element or the <description> element as a child. A list of the more germane <item> child elements follows:

  • title: the title of the news item,
  • link: the URL to the news item,
  • description: a brief synopsis of the news item,
  • author: the author of the news item, and
  • pubDate: the published date of the news item.

A very simple RSS 2.0 syndication file is shown below. You can see another example RSS 2.0 file from RSS generated by Radio UserLand.

    <rss version="2.0">
      <channel>
        <title>Latest DataWebControls.com FAQs</title>
        <link>http://datawebcontrols.com</link>
        <description>
          This is the syndication feed for the FAQs
          at DataWebControls.com
        </description>
        <item>
          <title>Working with the DataGrid</title>
          <link>http://datawebcontrols.com/faqs/DataGrid.aspx</link>
          <pubDate>Mon, 07 Jul 2003 21:00:00 GMT</pubDate>
        </item>
        <item>
          <title>Working with the Repeater</title>
          <description>
            This article examines how to work with the Repeater
            control.
          </description>
          <link>http://datawebcontrols.com/faqs/Repeater.aspx</link>
          <pubDate>Tue, 08 Jul 2003 12:00:00 GMT</pubDate>
        </item>
      </channel>
    </rss>

One important thing to note here is the <pubDate> element's formatting.
RSS requires that the date be formatted according to RFC 822, Date and Time Specification, which starts with an optional three-letter day abbreviation and comma, followed by a required day of the month, then the three-letter abbreviated month, and then the year, followed by a time with a time-zone name. Also notice that the <description> child element in the <item> element is optional: the first news item lacks a <description> element, while the second news item has one.

*This article originally appeared on the ASP.NET Dev Center at MSDN.

Scott Mitchell, author of five ASP/ASP.NET books and founder of 4GuysFromRolla.com, has been working with Microsoft Web technologies for the past five years. An active member in the ASP and ASP.NET community, Scott is passionate about ASP and ASP.NET and enjoys helping others learn more about these exciting technologies. For more on the DataGrid, DataList, and Repeater controls, check out Scott's book ASP.NET Data Web Controls Kick Start (ISBN: 0672325012). Read his blog at http://scottonwriting.net
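
The <item> rules covered in this part (at least a <title> or a <description>, and an RFC 822 formatted <pubDate>) can be sketched as follows. This is a minimal Python illustration under stated assumptions, not the ASP.NET implementation: build_item is a hypothetical helper, and it assumes the published date is passed in as a timezone-aware datetime. The format_datetime function comes from the standard email.utils module.

```python
from datetime import datetime, timezone
from email.utils import format_datetime
import xml.etree.ElementTree as ET


def build_item(title=None, link=None, description=None, published=None):
    """Hypothetical helper: build one RSS 2.0 <item> element.

    `published` is assumed to be a timezone-aware datetime.
    """
    if title is None and description is None:
        raise ValueError("an <item> needs at least a <title> or a <description>")
    item = ET.Element("item")
    for tag, text in (("title", title), ("link", link), ("description", description)):
        if text is not None:
            ET.SubElement(item, tag).text = text
    if published is not None:
        # Convert to UTC first; usegmt=True then writes "GMT" as the zone name.
        as_utc = published.astimezone(timezone.utc)
        ET.SubElement(item, "pubDate").text = format_datetime(as_utc, usegmt=True)
    return item


item = build_item(
    title="Working with the DataGrid",
    link="http://datawebcontrols.com/faqs/DataGrid.aspx",
    published=datetime(2003, 7, 7, 21, 0, 0, tzinfo=timezone.utc),
)
print(item.find("pubDate").text)  # Mon, 07 Jul 2003 21:00:00 GMT
```

Leaving out both the title and the description raises a ValueError, which mirrors the one hard requirement the specification places on an <item>.
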
