Bloglines2HTML (generate static HTML from Bloglines)

About

Bloglines2HTML is a Python script that fetches your subscription list and feeds from Bloglines and generates either a single HTML page or a directory of pages. It logs into Bloglines through its web services API, then fetches the subscription list as OPML and each individual feed as RSS. It then generates the HTML from a template; the result can be browsed locally, pasted into another page, and so on.

Requirements

To use Bloglines2HTML, you'll need:

Download

Usage

Run the script with "-h" to display the help message.

$ bloglines2html.py -h
Usage:
    bloglines2html.py [options]

Options:
    -D var=val  Define variable equals to value.
    -f folder   List only feeds inside this folder.
    -h          Display this help message.
    -i template Index template file to be used
    -m          Mark the feeds as read.
    -M          Multi-file support.
                -o must be present and will be a directory.
    -o file     Write the output to this file, instead of stdout.
    -p password Password to log into Bloglines.
    -r          List feeds that have already been read.
    -t template Feed template file to be used
    -u username Username to log into Bloglines. Usually your email address.
    -v          Turn up verbose level.
                -v = info.  -vv = debug.
    -V          Show version information.

"-u username" and "-p password" are required to use the Bloglines web service, unless you have edited the Python source and set default_username and default_password near the beginning of the file. The HTML result is printed to standard output unless "-o filename" is given, in which case it is written to that file.

After the script has fetched the subscription list, it fetches each feed that has at least one unread item. With "-r", all feeds are fetched, including those that have already been read. You can also limit Bloglines2HTML to the feeds inside a specific folder with the "-f folder" argument. You might also want to use "-m" to mark the unread items as read; otherwise they stay unread on Bloglines.
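
The selection rules above can be sketched in a few lines of Python. This is not the script's actual code: the function name select_feeds, its parameters, and the dictionary keys "folder" and "unread" are all hypothetical, chosen only to illustrate how "-r" and "-f" narrow or widen the set of feeds.

```python
def select_feeds(subscriptions, include_read=False, folder=None):
    """Return the subscriptions that would be fetched.

    include_read mirrors "-r", folder mirrors "-f folder".
    The dictionary keys used here are illustrative assumptions.
    """
    selected = []
    for sub in subscriptions:
        # "-f folder": skip feeds outside the requested folder.
        if folder is not None and sub.get("folder") != folder:
            continue
        # Without "-r": skip feeds that have no unread items.
        if not include_read and sub.get("unread", 0) < 1:
            continue
        selected.append(sub)
    return selected


subs = [
    {"title": "A", "folder": "News", "unread": 3},
    {"title": "B", "folder": "News", "unread": 0},
    {"title": "C", "folder": "Tech", "unread": 5},
]

print([s["title"] for s in select_feeds(subs)])
print([s["title"] for s in select_feeds(subs, include_read=True, folder="News")])
```

The first call keeps only the feeds with unread items; the second keeps everything inside the "News" folder, read or not.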

Multi-File Mode

Since Bloglines2HTML 0.2, it is possible to split the output into multiple pages. This is useful if you have hundreds of feeds to chew through every day, when merging all the results into one single HTML page is just not practical.

Multi-file mode saves each feed and its entries into a separate file, and then creates an index file that links to them all. To turn it on, pass "-M" when invoking Bloglines2HTML; "-o directory" must also be supplied. If the target directory does not exist, it will be created.

After the feeds have been retrieved, "index.html" is created inside the target directory, and for each feed a file named "feed<bloglines sub ID>.html" is created in the same directory.
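
As a rough illustration of that layout, here is a minimal sketch in modern Python 3 (the original script predates it). The helper names feed_filename and write_multi_file, and the idea of passing already-rendered HTML in a dictionary, are assumptions made for this example and are not taken from the script.

```python
import os
import tempfile


def feed_filename(sub_id):
    # File name pattern described above: feed<bloglines sub ID>.html
    return "feed%s.html" % sub_id


def write_multi_file(target_dir, pages, index_html):
    """Write one file per feed plus index.html into target_dir.

    pages maps a Bloglines subscription ID to rendered HTML (hypothetical shape).
    """
    # Create the target directory if it does not exist yet.
    os.makedirs(target_dir, exist_ok=True)
    written = []
    for sub_id, html in pages.items():
        name = feed_filename(sub_id)
        with open(os.path.join(target_dir, name), "w", encoding="utf-8") as fh:
            fh.write(html)
        written.append(name)
    with open(os.path.join(target_dir, "index.html"), "w", encoding="utf-8") as fh:
        fh.write(index_html)
    written.append("index.html")
    return written


with tempfile.TemporaryDirectory() as tmp:
    out_dir = os.path.join(tmp, "out")  # does not exist yet
    write_multi_file(out_dir, {"12345": "<html></html>"}, "<html></html>")
    print(sorted(os.listdir(out_dir)))
```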

Templates

Bloglines2HTML uses a simple template system that lets you customise the HTML result. The default templates are hard-coded inside the script itself, but you can supply an alternative template file with the "-i template" or "-t template" argument.

"-i" provides an alternative "index template", which is only used in multi-file mode. "-t" provides an alternative "feed template", which iterates over the feeds and outputs all of their entries.

Default Templates

Here are examples of default templates.

Default index template:

<!DOCTYPE html PUBLIC "-//W3C//DTD XHTML 1.0 Transitional//EN" "http://www.w3.org/TR/xhtml1/DTD/xhtml1-transitional.dtd">
<html xmlns="http://www.w3.org/1999/xhtml">
  <head>
    <meta http-equiv="Content-Type" content="text/html; charset=<blines:var name="charset" />" />
    <title><blines:var name="title" /></title>
  </head>
  <body>
    <div>
      <div><h1><blines:var name="title" /></h1></div>
      <div>
        <ul>
          <blines:feeds order="folder">
          <li>
            <blines:var name="feed_folder" />: 
            <a href="<blines:feed_link />"><blines:var name="feed_title" /></a> 
            (<blines:var name="feed_BloglinesUnread" />)
          </li>
          </blines:feeds>
        </ul>
      </div>
      <div>
        Generated on <blines:now /> 
        by <a href="/code/bloglines2html/">Bloglines2HTML</a> 
        <blines:var name="version" />
      </div>
    </div>
  </body>
</html>

Default feed template:

<!DOCTYPE html PUBLIC "-//W3C//DTD XHTML 1.0 Transitional//EN" "http://www.w3.org/TR/xhtml1/DTD/xhtml1-transitional.dtd">
<html xmlns="http://www.w3.org/1999/xhtml">
  <head>
    <meta http-equiv="Content-Type" content="text/html; charset=<blines:var name="charset" />" />
    <title><blines:var name="title" /></title>
  </head>
  <body>
    <div>
      <div><h1><blines:var name="title" /></h1></div>
      <div>
        <blines:feeds>
        <div class="feed">
          <h2><a href="<blines:var name="feed_htmlUrl" />"><blines:var name="feed_title" /></a></h2>
          <blines:entries>
          <div class="entry">
            <h3><a href="<blines:var name="entry_link" />"><blines:var name="entry_title" /></a></h3>
            <div class="entrybody"><blines:var name="entry_summary" /></div>
            <p class="posted">Posted by <blines:var name="entry_author" /> on <blines:var name="entry_modified" /></p>
          </div>
          </blines:entries>
          <hr />
        </div>
        </blines:feeds>
      </div>
      <div>
        Generated on <blines:now /> 
        by <a href="/code/bloglines2html/">Bloglines2HTML</a> 
        <blines:var name="version" />
      </div>
    </div>
  </body>
</html>

Template Tags

"Tags" inside the template are replaced at run time to inject feed and entry values into the final HTML file. Tags look like XML elements in the blines namespace, and they can have attributes. Some attributes are required, and some are optional.

The following tags can be used. I means the tag can be used in the index template; F means it can be used in the feed template.

<blines:now [format="..."] /> (IF)
Insert the current date/time. Optional attribute format specifies the format string passed to Python's time.strftime.
<blines:feeds>...</blines:feeds> (IF)
Iterate through the feeds. Inside the feed loop, you can insert feed-related variables and entries.
<blines:entries>...</blines:entries> (F)
Iterate through the entries. It must be called inside a feed loop. Inside the entry loop, you can insert entry-related variables.
<blines:var name="..." [format="..."] /> (IF)
Insert the value of the variable named by the attribute name. If the variable is a date/time, the optional attribute format can be used to change how it is formatted.
<blines:if_var name="...">...</blines:if_var> (IF)
Output the code between the tags if the variable given by name exists.
<blines:if_not_var name="...">...</blines:if_not_var> (IF)
Output the code between the tags if the variable given by name does not exist.
<blines:feed_link /> (I)
In the index template, it creates the feed output for the current feed and returns the corresponding filename.
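
Since the format attribute is passed to Python's time.strftime, it may help to see what a few format strings produce. The sketch below is only an illustration: the helper render_now and its default format string are assumptions, and the script's real default format is not documented here. A fixed date is used so the example is reproducible.

```python
import time


def render_now(fmt="%Y-%m-%d %H:%M:%S", when=None):
    """Format a time the way a format="..." attribute would.

    The default fmt is an assumption for this sketch, not the script's default.
    """
    if when is None:
        when = time.localtime()
    return time.strftime(fmt, when)


# A fixed moment (2006-05-25 14:30:00, a Thursday, day 145 of the year).
fixed = time.struct_time((2006, 5, 25, 14, 30, 0, 3, 145, -1))

print(render_now("%Y-%m-%d", fixed))   # 2006-05-25
print(render_now("%H:%M", fixed))      # 14:30
print(render_now(when=fixed))          # 2006-05-25 14:30:00
```

In a template, the first of these would correspond to something like <blines:now format="%Y-%m-%d" />.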

Variables are categorised into general variables, feed-related variables and entry-related variables. You can add extra general variables with the "-D foo=bar" command line argument. For example, the template file can contain:

  The answer is <blines:var name="answer" />.

After running bloglines2html.py -D answer=42, the output contains:

  The answer is 42.
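
To make the mechanism concrete, here is a minimal sketch of how "-D" definitions and <blines:var> substitution could work together. It is not the script's implementation: the regular expression, the function names parse_defines and substitute, and the choice to replace unknown variables with an empty string are all assumptions for illustration. The format attribute is ignored here.

```python
import re

# Matches <blines:var name="..." /> without a format attribute (simplified).
VAR_TAG = re.compile(r'<blines:var\s+name="([^"]+)"\s*/>')


def parse_defines(pairs):
    """Turn ["answer=42", ...] (the values given to -D) into a dictionary."""
    variables = {}
    for pair in pairs:
        name, _, value = pair.partition("=")
        variables[name] = value
    return variables


def substitute(template, variables):
    """Replace each var tag with its value; unknown names become ''
    (an assumption made for this sketch)."""
    return VAR_TAG.sub(lambda m: str(variables.get(m.group(1), "")), template)


variables = parse_defines(["answer=42"])
print(substitute('The answer is <blines:var name="answer" />.', variables))
# The answer is 42.
```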

Feed related variables:

feed_title
Title of the feed.
feed_htmlUrl
URL of the blogsite/website.
feed_type
Type of the feed (rss, rdf, atom, etc.).
feed_xmlUrl
URL of the feed.
feed_BloglinesUnread
Number of entries that have not been read.
feed_BloglinesSubId
Bloglines subscription ID for this feed.
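
The feed variable names look like OPML outline attributes with a "feed_" prefix. Assuming that is indeed where they come from, the mapping could be sketched as below. The sample OPML is made up for this example and the real Bloglines response may differ; the function name feed_variables is hypothetical.

```python
import xml.etree.ElementTree as ET

# Made-up sample; the attribute names follow the variable list above.
SAMPLE_OPML = """<opml version="1.0">
  <body>
    <outline title="News">
      <outline title="Example Blog" htmlUrl="http://example.com/"
               type="rss" xmlUrl="http://example.com/rss.xml"
               BloglinesUnread="3" BloglinesSubId="12345" />
    </outline>
  </body>
</opml>"""


def feed_variables(opml_text):
    """Return one dict of feed_* variables per feed outline."""
    root = ET.fromstring(opml_text)
    feeds = []
    for outline in root.iter("outline"):
        # Outlines without xmlUrl are treated as folders, not feeds.
        if "xmlUrl" not in outline.attrib:
            continue
        feeds.append({"feed_" + key: value
                      for key, value in outline.attrib.items()})
    return feeds


for feed in feed_variables(SAMPLE_OPML):
    print(feed["feed_title"], feed["feed_BloglinesUnread"])
```

Note that the attribute values are strings, so a count such as feed_BloglinesUnread would need converting with int() before any arithmetic.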

Entry related variables:

entry_title
Title of the entry.
entry_author
Author of the entry.
entry_link
URL of the permalink to the entry.
entry_summary
Text of the entry summary in sanitised HTML.
entry_id
Entry ID if it exists.
entry_modified
When the entry was last modified.
entry_category
Primary category of the entry.

History

  • 0.3 (2006-05-25)
    • Make it compatible with feedparser 4.x.
  • 0.2 (2005-11-17)
    • Add multi-file mode.
  • 0.1 (2004-10-07)
    • Initial release.