Using RSS to Retrieve News Feeds

 


Overview

Rich Site Summary (RSS) is an XML format for distributing news headlines on the Web. The following procedure describes the idiosyncratic way I set up setgetweb to fetch RSS feeds and display them on a web page. The instructions assume a Linux system running an Apache web server.

 

Procedure

  1. Verify you have Perl v5.003 or greater on your system...

    perl -v

  2. Install modules XML::Parser and XML::RSS...

    perl -MCPAN -e "install XML::Parser"
    perl -MCPAN -e "install XML::RSS"

  3. Create the Perl script, rsshtml.pl.

    rsshtml.pl generates HTML from the RDF type of RSS feed. Here are some usage examples...

    rsshtml.pl http://slashdot.org/slashdot.rdf > slashdot.html
    rsshtml.pl http://freshmeat.net/backend/fm.rdf > freshmeat.html

  4. Create the Python script rss.py.

    rss.py can handle several different flavors of RSS feed. Here are some usage examples:

    python ./rss.py http://rss.news.yahoo.com/rss/topstories > yahoo.out
    python ./rss.py http://xml.metafilter.com/rss.xml > metafilter.out
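
    The source of rss.py is not reproduced here, so the following is only a minimal sketch of how a script might cope with more than one flavor of feed. It assumes the standard-library xml.etree.ElementTree parser; the names local_name, parse_items, RSS2_SAMPLE, and RDF_SAMPLE are hypothetical, and the sample feeds are made up for illustration. The idea is to ignore XML namespaces, so that an <item> in a plain RSS 2.0 feed and an <item> in an RDF (RSS 1.0) feed are treated alike.

```python
import xml.etree.ElementTree as ET


def local_name(tag):
    """Strip any '{namespace}' prefix that ElementTree puts on a tag."""
    return tag.rsplit("}", 1)[-1]


def parse_items(xml_text):
    """Return a list of (title, link) pairs, one per <item> element.

    Namespaces are ignored, so RSS 2.0 and RDF-style RSS 1.0 feeds
    are both handled by the same loop.
    """
    root = ET.fromstring(xml_text)
    items = []
    for elem in root.iter():
        if local_name(elem.tag) != "item":
            continue
        # Map each child's bare tag name to its (stripped) text.
        fields = {local_name(child.tag): (child.text or "").strip()
                  for child in elem}
        items.append((fields.get("title", ""), fields.get("link", "")))
    return items


# Made-up sample feeds (hypothetical data, not real headlines).
RSS2_SAMPLE = """<rss version="2.0"><channel><title>Example</title>
<item><title>First story</title><link>http://example.com/1</link></item>
<item><title>Second story</title><link>http://example.com/2</link></item>
</channel></rss>"""

RDF_SAMPLE = """<rdf:RDF
 xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#"
 xmlns="http://purl.org/rss/1.0/">
<channel rdf:about="http://example.com/"><title>Example</title></channel>
<item rdf:about="http://example.com/a">
<title>RDF story</title><link>http://example.com/a</link>
</item>
</rdf:RDF>"""
```

    Calling parse_items(RSS2_SAMPLE) or parse_items(RDF_SAMPLE) returns the headline and link pairs for each sample. A real script would first download the feed text from the URL given on the command line.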

  5. Format the output.

    I use the rss.sh shell script to format output.
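
    The contents of rss.sh are not shown, so what follows is not that script. It is a rough Python sketch of what a formatting step could look like, assuming the headlines are already available as (title, link) pairs. The function name render_fragment and its limit parameter are hypothetical.

```python
from html import escape


def render_fragment(items, limit=10):
    """Build an HTML <ul> fragment from (title, link) pairs.

    Feed text is escaped so that a stray '&' or '<' in a headline
    cannot break the page that includes the fragment.
    """
    lines = ["<ul>"]
    for title, link in items[:limit]:
        lines.append('  <li><a href="%s">%s</a></li>'
                     % (escape(link, quote=True), escape(title)))
    lines.append("</ul>")
    return "\n".join(lines)
```

    The output is deliberately a bare fragment with no <html> or <body> wrapper, since it is meant to be pulled into an existing page rather than served on its own.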

  6. Configure rss.sh to run from cron every hour.

    For example...

    01 *  *  *  *  /var/www/tech/perl/rss.sh > /tmp/rss.log
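
    One thing to watch with an hourly job is that a visitor may load the page while a feed file is being rewritten. The sketch below is an assumption about how one might guard against that, not something rss.sh is known to do: it writes to a temporary file in the same directory and then renames it over the target, so a reader sees either the old file or the new one. The function name write_atomically is hypothetical.

```python
import os
import tempfile


def write_atomically(path, text):
    """Write text to a temp file beside path, then rename it into place."""
    directory = os.path.dirname(os.path.abspath(path))
    fd, tmp_path = tempfile.mkstemp(dir=directory, suffix=".tmp")
    try:
        with os.fdopen(fd, "w", encoding="utf-8") as handle:
            handle.write(text)
        # mkstemp creates the file with mode 0600; loosen it so the
        # web server can read the result.
        os.chmod(tmp_path, 0o644)
        # The rename is atomic when source and target share a filesystem.
        os.replace(tmp_path, path)
    except BaseException:
        # Clean up the temp file if anything went wrong before the rename.
        if os.path.exists(tmp_path):
            os.remove(tmp_path)
        raise
```

    For example, write_atomically("slashdot.html", fragment) would replace slashdot.html in one step, where fragment is whatever HTML text the formatting step produced.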
    

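  7. Enable your web site to use Server Side Includes in httpd.conf. Because step 8 relies on the execute bit, this typically means setting Options +Includes and XBitHack on for the relevant directory.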

  8. Enable your web pages to use Server Side Includes by making them executable. For example...

    chmod 750 index.html

  9. Aggregate and render your RSS content on a web page.

    For example:

    <!--#include virtual="tech/perl/slashdot.html" -->
    <!--#include virtual="tech/perl/metafilter.html" -->

 


Last update: Sunday, 04-Oct-2020 07:12:40 PDT