podarchive
This is a simple script for archiving podcasts, including show notes.
I made this because I wanted to download and archive some podcasts I like, but wasn't satisfied with the solutions that already exist. None of them seemed to save the show notes or let me customize the names of the downloaded files to my liking, so I made podarchive to solve that.
Behaviour
The script takes the URL of an RSS feed and a (local) target directory as arguments. The feed will be downloaded to the specified directory (as feed.rss) and then used to download the episodes. Audio files are downloaded and not altered, only renamed. For each episode, an HTML file will be created. It embeds the audio file and also contains the show notes.
Whether an episode has already been downloaded is determined solely by its filename: the script works out the name it would give the file under the current options, then checks whether that file already exists. Existing episodes will not be downloaded again unless the -f|--force flag is set. As a consequence, the script will not recognize episodes that were downloaded with different naming options (like -d|--date or -e|--episode-number).
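The skip decision described above can be sketched as a small shell function. This is only an illustration of the logic, not code taken from podarchive itself; the function name should_download and its two arguments are hypothetical.

```shell
# Illustrative sketch only; should_download is a hypothetical helper,
# not part of podarchive.
# $1: path the episode would be saved under with the current options
# $2: "1" if -f|--force was given, anything else otherwise
# Returns 0 if the episode should be downloaded, 1 if it should be skipped.
should_download() {
    planned_path=$1
    force=$2
    if [ "$force" = "1" ]; then
        return 0    # --force: download again regardless of existing files
    fi
    if [ -e "$planned_path" ]; then
        return 1    # a file with this exact name exists: skip
    fi
    return 0        # no file under this name yet: download
}
```

Because only the planned name is checked, an episode saved under one name is not found when a different option set would produce another name for it.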
podarchive will also create an index.html file containing a list of all episodes, as well as links to their individual HTML files. This can be disabled using the --no-overview flag. Any index.html that already exists in the target directory will be overwritten.
Usage
[perl] ./podarchive.pl [options] rss_feed_url target_directory
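As a rough example, the following sketch wraps a "preview first, then archive" workflow using the --dry-run and --date flags. The function name archive_with_preview, the PODARCHIVE variable, and the URL and directory in the example call are placeholders. It also assumes the script exits with a non-zero status on failure, which is not stated here.

```shell
# Path to the script; set PODARCHIVE beforehand if it lives elsewhere.
PODARCHIVE=${PODARCHIVE:-./podarchive.pl}

# Hypothetical helper: preview with --dry-run, then archive for real
# only if the preview command succeeded.
# $1: RSS feed URL, $2: target directory
archive_with_preview() {
    "$PODARCHIVE" --dry-run --date "$1" "$2" &&
        "$PODARCHIVE" --date "$1" "$2"
}

# Example call (placeholder URL and directory):
# archive_with_preview "https://example.com/feed.rss" "$HOME/podcasts/example-show"
```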
Options
You can try out any of these options without changing anything on disk by adding the -n|--dry-run flag.
--date, -d            Prepend the publishing date to the filename for improved sorting.
                      This is useful for all podcasts that don't neatly include the
                      number in every episode's title.
--episode-number, -e  Prepend the episode number to the title for improved sorting. This
                      is less reliable than using the publishing date, but will look a lot
                      cleaner if successful. If -e and -d are used in conjunction, the
                      date will be placed in front of the number.
--date-behind         Append instead of prepend the date. Implies -d.
--no-overview         Don't create an index.html file containing an overview of all episodes.
--verbose, -v         Display more information about what's happening, e.g. the exact file
                      names being written to.
--quiet, -q           Only display errors. Use this when running from cron.
--dry-run, -n         Display what would happen without doing it.
--keep, -k            Don't refresh the feed file if it has already been downloaded.
--force, -f           Force re-downloading of all episodes regardless of whether they've already
                      been downloaded. This does not override -n|--dry-run and can be used in
                      conjunction with it.
--help, -h            Display this help.