rakaz

about standards, webdesign, usability and open source

Feedparser

Feedparser is a parsing library for Atom 0.3, 1.0 and most RSS flavours, including RSS 0.9.x, 1.0, 1.1 and 2.0. It also supports some basic extensions.

Feedparser 0.4

The whole idea behind this library is to take as much complexity as possible away from its user. You don’t need to know the differences between the competing formats and extensions. All you need to do is make one simple call and process the data it returns.

$parser = new FeedParserURL();
$result = $parser->Parse('http://rakaz.nl/index.atom');

echo $result['feed']['title']['value'];
echo $result['feed']['entries'][0]['link']['href'];
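Because every feed ends up in the same shape, walking through all entries could look roughly like the sketch below. It runs against a hand-written stand-in array rather than a real parse result. Only the keys used in the example above are taken from the library; the entry-level 'title' key, the listEntryLinks() helper and the example.org URLs are assumptions made for illustration.

```php
<?php
// Hand-written stand-in for a parsed result. The feed title and the
// entry 'link' => 'href' keys mirror the example above; the entry-level
// 'title' key is an assumption.
$result = array(
    'feed' => array(
        'title'   => array('value' => 'Example feed'),
        'entries' => array(
            array(
                'title' => array('value' => 'First post'),
                'link'  => array('href' => 'http://example.org/first-post'),
            ),
            array(
                // No link at all: this entry is skipped below.
                'title' => array('value' => 'Draft without a link'),
            ),
            array(
                // No title: a placeholder is used instead.
                'link'  => array('href' => 'http://example.org/second-post'),
            ),
        ),
    ),
);

// Hypothetical helper, not part of FeedParser: build one "title: url"
// line for every entry that has a link.
function listEntryLinks(array $result) {
    $lines   = array();
    $entries = isset($result['feed']['entries']) ? $result['feed']['entries'] : array();

    foreach ($entries as $entry) {
        if (!isset($entry['link']['href'])) {
            continue;
        }
        $title   = isset($entry['title']['value']) ? $entry['title']['value'] : '(untitled)';
        $lines[] = $title . ': ' . $entry['link']['href'];
    }

    return $lines;
}

echo implode("\n", listEntryLinks($result)), "\n";
```

The isset() checks are there because a feed may leave out elements; whether the library fills in defaults for missing elements is not covered here.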

Instead of giving back the ‘raw’ elements as other libraries do, FeedParser converts everything to an Atom-based internal structure. The advantages of this approach are enormous and simplify development considerably. Of course there is also a downside: the library needs to know about every single extension, otherwise it will simply ignore it, and the user of the library has no way to retrieve that information. Fortunately it is very easy to add a new extension. All you need to do is let the library know about the namespace and create a simple class that parses the elements and attributes of the extension.

$parser = new FeedParserURL();
$parser->addCustomNamespace('http://backend.userland.com/creativeCommonsRssModule', 'creativeCommons');
$result = $parser->Parse('http://rakaz.nl/index.atom');

class FeedParserExtensionCreativeCommons extends FeedParserHelper {

	function parseElementLicense(& $context, & $tag) {
		$link = array ();
		$link['rel'] = 'license';

		if (isset($tag['value']))
			$link['href'] = $this->_parseUrl($tag['value']);

		$context['links'][] = $link;
	}
}
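
The extension above stores the license as an ordinary link with rel set to 'license', so reading it back comes down to searching a list of links. The sketch below shows one way to do that. findLinkByRel() is a hypothetical helper, not part of FeedParser, and the $links array is a hand-written stand-in; where exactly the 'links' array sits in a real parse result is an assumption that should be checked against the library.

```php
<?php
// Hypothetical helper, not part of FeedParser: return the href of the
// first link whose 'rel' matches, or null when there is none.
function findLinkByRel(array $links, $rel) {
    foreach ($links as $link) {
        // The extension only sets 'href' when the element has a value,
        // so links without one are skipped.
        if (isset($link['rel'], $link['href']) && $link['rel'] === $rel) {
            return $link['href'];
        }
    }
    return null;
}

// Stand-in for the 'links' array the extension appends to.
$links = array(
    array('rel' => 'alternate', 'href' => 'http://example.org/first-post'),
    array('rel' => 'license',   'href' => 'http://creativecommons.org/licenses/by/2.5/'),
);

$license = findLinkByRel($links, 'license');
echo $license !== null ? $license : 'no license given', "\n";
```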