Console-based podcatcher

A few days ago I installed a new disk in my laptop to replace the existing small one. Being the careful person I am, I backed up the important data and then migrated the contents from the old disk to the new one via rsync. Everything worked well and I was able to boot into the new disk right away without issue. That is, until I went to update the podcasts I listen to.

I had forgotten that I keep my podcasts (and the script I use to grab them) on a ZFS partition that I access via zfs-fuse (since I access it from Linux). Being the careful person I am, I had forgotten to back up that partition 🙁

As such I was faced with either reconstructing the script from memory or rewriting it from scratch – I chose option 2.

I remembered finding a simple bash-based podcatcher called BashPodder a while back that I figured would make a good base. The problem was that BashPodder is designed to be run from a daily cronjob and offers no real status display and no parallelism, but it did know how to parse an RSS feed, and that was what I most wanted.
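
BashPodder’s parsing trick is a tiny XSLT stylesheet run through xsltproc that pulls the enclosure URLs out of the feed. A minimal sketch of the idea – not the actual code, and here the stylesheet is held in a shell variable rather than written out as a separate file:

    #!/bin/bash
    # Sketch of BashPodder-style feed parsing: print each enclosure URL
    # in an RSS feed, one per line.
    feed_urls() {
        local feed="$1" xsl
        xsl='<?xml version="1.0"?>
    <xsl:stylesheet version="1.0" xmlns:xsl="http://www.w3.org/1999/XSL/Transform">
      <xsl:output method="text"/>
      <xsl:template match="/">
        <xsl:for-each select="/rss/channel/item/enclosure">
          <xsl:value-of select="@url"/><xsl:text>&#10;</xsl:text>
        </xsl:for-each>
      </xsl:template>
    </xsl:stylesheet>'
        # Stylesheet via process substitution, feed XML on stdin.
        wget -q "$feed" -O - | xsltproc <(printf '%s' "$xsl") -
    }

    feed_urls "http://example.com/podcast.rss"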

I took that script and redesigned it for use as a console application. It now forks a sub-process for each feed and downloads all the feeds in parallel, while providing a status screen that shows how much has been downloaded, the percentage finished, the download rate, the estimated time remaining and the filename for each download.
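
The overall shape is the classic fork-and-wait pattern. Roughly (download_feed is a stand-in name for the per-feed worker, not the script’s actual function):

    # One sub-process per feed, all downloading at once.
    while read -r feed; do
        case "$feed" in ""|"#"*) continue ;; esac   # skip blanks/comments
        download_feed "$feed" &                     # fork a worker
    done < ~/.bashpodder
    wait                                            # until all workers exit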

Here is a sample output:

       Down|Perc| Rate| Remain|Filename
    -----------------------------------------------------------------
    195600K| 70%| 110K| 23m12s|tekzilla--0080--itunes--large.xvid.avi
    154550K| 33%| 254K|  1h48m|diggnation--0194--SXSW2009--large.xvid.avi
    188050K| 34%| 165K|  1h41m|trs--0104--2years--large.xvid.avi
    163100K| 77%| 167K| 15m32s|scamschool--0053--strapped--large.xvid.avi
    150700K| 66%| 113K|  27m6s|systm--0095--benchmark--large.xvid.avi
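
Those columns mirror what wget’s dot-style progress log already prints (cumulative size, percent, rate, ETA), so one way to build the display – and this is my reconstruction, including the .log suffix, not necessarily what the script does – is to tail the last progress line of each active log:

    # The last line of each wget log carries every column but the filename.
    for log in ~/.bashpodder_logs/*.log; do
        tail -n 1 "$log" |
            awk '{ printf "%7s|%4s|%5s|%7s|\n", $1, $(NF-2), $(NF-1), $NF }'
    done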

Since I don’t run this from a cronjob, and I can’t guarantee that all the downloads will finish before I have to go offline, the script also lets you press ‘q’ to quit – at which point it tells each sub-process (via a control file) to stop downloading. The next time you run the script it resumes each download from wherever it was up to. As such I no longer download the podcasts into a different directory each day – instead they all end up in a common directory, so that downloads can be continued the next day.
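
In sketch form (the control-file name and PODCAST_DIR are my inventions; feed_urls is the parser sketched earlier):

    STOP=~/.bashpodder_logs/stop        # hypothetical control-file name
    PODCAST_DIR=~/podcasts              # hypothetical download directory

    # UI side: poll for a keypress once a second; 'q' raises the stop flag.
    read -rsn1 -t 1 key && [ "$key" = "q" ] && touch "$STOP"

    # Downloader side: check the flag between files; wget -c means any
    # partially fetched episode resumes from where it left off next run.
    for url in $(feed_urls "$feed"); do
        [ -e "$STOP" ] && break
        wget -c -P "$PODCAST_DIR" "$url"
    done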

Thanks to another control file – which contains a count of the active downloads – the script doesn’t exit until all the sub-processes are done or have exited. This control file lives in ~/.bashpodder_logs (which also contains the wget logs for the active downloads and the lists of files already downloaded).
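
The counting itself can be as simple as this (the file name is my guess, and the sketch glosses over the read-modify-write race):

    COUNT=~/.bashpodder_logs/active_count   # hypothetical file name

    # Each sub-process decrements the count as it finishes...
    echo $(( $(cat "$COUNT") - 1 )) > "$COUNT"

    # ...and the parent refuses to exit until the count reaches zero.
    while [ "$(cat "$COUNT")" -gt 0 ]; do
        sleep 1
    done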

The script reads the URL of each feed from a file called ‘.bashpodder’ in your home directory (comments are allowed) and records which files have been downloaded in a separate logfile for each feed – named after an md5 hash of the feed URL, which saves having to parse the URL to make it filename-safe – so that they aren’t re-downloaded later.
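
A sketch of that bookkeeping (feed_urls as above; the paths and feed URL are placeholders):

    feed="http://example.com/podcast.rss"   # placeholder feed URL
    mkdir -p ~/.bashpodder_logs

    # Name the feed's logfile after an md5 of its URL - always filename-safe.
    log=~/.bashpodder_logs/$(echo -n "$feed" | md5sum | cut -d' ' -f1)

    # Only fetch an enclosure if its URL isn't already in the log.
    for url in $(feed_urls "$feed"); do
        grep -qxF "$url" "$log" 2>/dev/null && continue   # already have it
        wget -c -P ~/podcasts "$url" && echo "$url" >> "$log"
    done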

Since I lost all my previous data there was no existing list of downloaded files, so the script would have tried to download everything in the feeds. To stop this I added a ‘catchup’ mode, accessed with the ‘-c’ option. In this mode the script doesn’t actually download anything but acts as if it did (and so logs the files as already downloaded). After running the script once in catchup mode I edited each of the logfiles and removed the last one or two entries. Then, when I finally ran the script normally, it downloaded just those episodes.
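
Catchup mode is then a one-line difference in the download loop – something like this (option parsing omitted; assume catchup gets set by ‘-c’):

    for url in $(feed_urls "$feed"); do
        if [ "$catchup" = "yes" ]; then
            echo "$url" >> "$log"           # log it, but don't fetch it
        else
            wget -c -P ~/podcasts "$url" && echo "$url" >> "$log"
        fi
    done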

And now I am finally returning to normal!

The script can be seen at bashpodder.sh.

Here is a sample .bashpodder file.
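
Going by the format described above, an illustrative file (with placeholder URLs) would be:

    # .bashpodder - one feed URL per line; comment lines start with '#'
    http://example.com/feeds/show-one.rss
    http://example.com/feeds/show-two.rss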

4 thoughts on “Console-based podcatcher”

  1. The links for downloading your version of Bashpodder don’t seem to work. I did download it once several days ago but didn’t keep the original when I made some modifications. I can’t believe I forgot to do that.

  2. I migrated hosts recently and lost the old file – but I have fixed up the link to my current version.
    It is mostly the same, but configuration is now in ~/.bashpodder, and in that file you can specify where you want the downloads to go.

  3. Nice work! The only problems I had with the previous versions were minor and you fixed them with this version. I never thought of putting the configuration into the podcast file like that. I like it! I immediately changed my script to match! I did have a couple of additions to the script, though: one adds a function for adding a new feed to ~/.bashpodder, and the other is a small change to the xsl function. I don’t know how well the comment box handles code so I put it on pastebin. http://www.pastebin.com/f624b4f9f

    • Awesome – I tested out your additions and liked them so much that I have added them to my version as well 🙂
      The download now includes the ‘-a’ option to add feeds and also no longer has to temporarily create the xslt file.

