Thursday, January 15, 2009


I don't understand why so many government sites fail to provide some sort of feed for their daily bulletins. Specifically, I am venting about the Canadian CRTC and the FCC sites. Every day I have to go to the website, and when I reach the content it usually isn't even HTML but a Word or PDF file. So today I finally decided to do something about it and wrote a little app that scrapes their daily releases and posts the information to Blogger, so I can just add the feeds to my reader.

The CRTC site especially disappoints me because whoever developed it either had the common sense to use Dublin Core in the meta tags or is using a CMS, and either one obviously makes it easy to publish machine-readable content.
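To give an idea of why Dublin Core meta tags make a page easy to turn into a feed, here is a minimal sketch of pulling them out of a page. It is not the code behind the app: it uses Python's standard-library html.parser instead of Beautiful Soup so that it runs without extra packages, and the sample HTML, the class name DublinCoreParser and the function extract_dublin_core are all made up for illustration rather than taken from the CRTC site.

```python
from html.parser import HTMLParser


class DublinCoreParser(HTMLParser):
    """Collects <meta name="dc.*" content="..."> pairs from an HTML page."""

    def __init__(self):
        super().__init__()
        self.fields = {}

    def handle_starttag(self, tag, attrs):
        if tag != "meta":
            return
        attr = dict(attrs)
        name = (attr.get("name") or "").lower()
        # Dublin Core names are conventionally prefixed "dc." or "dcterms."
        if name.startswith(("dc.", "dcterms.")) and attr.get("content"):
            self.fields[name] = attr["content"]


def extract_dublin_core(html):
    """Return a dict of Dublin Core meta fields found in the given HTML."""
    parser = DublinCoreParser()
    parser.feed(html)
    return parser.fields


# Hypothetical sample page, not real CRTC markup.
SAMPLE = """
<html><head>
<meta name="dc.title" content="Daily Releases">
<meta name="dc.date.created" content="2009-01-15">
<meta name="description" content="ignored: not Dublin Core">
</head><body></body></html>
"""

fields = extract_dublin_core(SAMPLE)
print(fields)
```

With a title and a date already labelled like this, each page maps almost directly onto a feed entry, which is what makes the missing feed so frustrating.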

I wrote this system using Python, Beautiful Soup and Google App Engine. I will allow comments on both blogs, and if there is demand I will switch to a Pligg instance instead of Blogger.

As a legal note, I make absolutely no claims or warranties that this application actually works or that the following blogs display accurate data from the CRTC's or the FCC's site.

Here they are:

for the CRTC:
for the FCC:
