tylantz/rm_ads

rm_ads

Remove advertisement sections from episodes of The Rest Is History podcast, then serve an RSS feed and the ad-free episodes using Caddy. This project was mostly an exploratory exercise, but it works well.

The methods employed are fast Fourier transform and gap analysis using scipy and librosa, respectively. This implementation is not particularly generalizable to other broadcasts.
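As a rough illustration of the FFT side of this, the sketch below locates a known jingle inside a longer recording with FFT-based cross-correlation from scipy. It is not taken from this repository: the function name `find_jingle_offset`, the synthetic signals, and the idea that jingles mark the ad boundaries are all assumptions made for the example.

```python
import numpy as np
from scipy.signal import correlate


def find_jingle_offset(audio: np.ndarray, jingle: np.ndarray, sample_rate: int) -> float:
    """Return the time (seconds) at which `jingle` best matches `audio`.

    Hypothetical helper for illustration only.
    """
    # FFT-based cross-correlation. With mode="valid", index i of the result
    # corresponds to the jingle starting at sample i of the audio.
    scores = correlate(audio, jingle, mode="valid", method="fft")
    return int(np.argmax(scores)) / sample_rate


# Synthetic demo data (assumed, not real podcast audio).
rate = 8000
rng = np.random.default_rng(0)
jingle = rng.standard_normal(rate // 2)              # 0.5 s stand-in "jingle"
audio = 0.1 * rng.standard_normal(rate * 5)          # 5 s of quiet noise
audio[rate * 2 : rate * 2 + jingle.size] += jingle   # embed the jingle at 2.0 s

print(find_jingle_offset(audio, jingle, rate))
```

Real episodes would likely need resampling, normalization, and a score threshold before a match can be trusted; none of that is shown here.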

This code is intended for personal use only and is not meant to harm any creators in any way.

Security through obscurity - 😅

Because my favorite podcast app does not support authorization, the following measures make the files extremely difficult to find for anyone but the intended audience. Note that the value of ${DL_PATH_PARENT} is effectively a password and should be treated as one.

  • The Caddy file server serves a root directory with only one subdirectory
  • The subdirectory is where the episodes are stored, and its name must match the value of the ${DL_PATH_PARENT} environment variable
  • The name of the episode directory is practically impossible to guess (e.g., a UUID)
  • Because the directory name is obscure, the file names can be normal

For example:

/ root served by Caddy
└── bingo-bango-bongo-uuid
    ├── feed.xml
    └── episode_1.mp3

The resulting feed url is http(s)://${DOMAIN}/${DL_PATH_PARENT}/feed.xml
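A minimal sketch of how such a directory name and feed URL could be produced. The function names `make_secret_dir_name` and `build_feed_url` are hypothetical and not part of rm_ads; only the URL pattern itself comes from the text above.

```python
import uuid


def make_secret_dir_name() -> str:
    """Generate a practically unguessable directory name (a random UUID4)."""
    return str(uuid.uuid4())


def build_feed_url(domain: str, dl_path_parent: str, https: bool = False) -> str:
    """Assemble http(s)://${DOMAIN}/${DL_PATH_PARENT}/feed.xml."""
    scheme = "https" if https else "http"
    return f"{scheme}://{domain}/{dl_path_parent}/feed.xml"


# Example with a placeholder domain (an assumption for illustration).
secret = make_secret_dir_name()
print(build_feed_url("podcasts.example.com", secret, https=True))
```

Generating the name once and storing it in ${DL_PATH_PARENT} keeps the feed URL stable across runs; regenerating it would break existing subscriptions.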

Do not use the browse directive in the Caddyfile. If you do, the super-secret directory name will be visible to all prying eyes 👀.

Usage

A command-line interface is provided to run the script, and each argument has a corresponding environment variable. See the docker-compose file for an example of how to set these variables.

Usage: python -m rm_ads [OPTIONS]

  Download and process the latest episodes from the podcast RSS feed.

Options:
  --log-level [DEBUG|INFO|WARNING|ERROR|CRITICAL]
                                  Set log level.
  --log-path PATH                 File path to write logs to. If a directory
                                  is provided, rotating log files will be
                                  created there.
  --processed DIRECTORY           Directory where episodes are saved.
                                  [required]
  --jingles DIRECTORY             [required]
  --max-episodes INTEGER          The maximum number of episodes to process
                                  right now. Episodes are processed in order
                                  of publish date (newest first).
  --max-on-disk INTEGER           The maximum number of episodes to have saved
                                  in the processed directory. Episodes are
                                  prioritized by publish date (newest first).
  --feed-url TEXT                 URL of the podcast RSS feed.  [required]
  --https                         Whether to use https for the replaced links.
                                  If not provided, http will be used.
  --domain TEXT                   Domain where you will be hosting the files.
                                  Do not include the path.  [required]
  --dl-path-parent TEXT           Parent directory of the download directory
                                  to use in feed urls. This is the super-
                                  secret path.
  --run-interval INTEGER          How often to run the script in minutes. If
                                  not provided or value <0, the script will
                                  run once.
  --version                       Show the version and exit.
  --help                          Show this message and exit.
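To illustrate how a limit like --max-on-disk could be applied, here is a small sketch that sorts episodes newest first and splits them into those to keep and those to prune. The `Episode` type and `split_keep_and_prune` function are hypothetical, and the actual rm_ads logic may differ.

```python
from datetime import datetime
from typing import NamedTuple, Optional


class Episode(NamedTuple):
    """Hypothetical record for an episode saved in the processed directory."""
    filename: str
    published: datetime


def split_keep_and_prune(
    episodes: list[Episode], max_on_disk: Optional[int]
) -> tuple[list[Episode], list[Episode]]:
    """Return (keep, prune), prioritizing the newest publish dates."""
    ordered = sorted(episodes, key=lambda e: e.published, reverse=True)
    if max_on_disk is None:
        # Assumed behavior: no limit given means nothing is pruned.
        return ordered, []
    return ordered[:max_on_disk], ordered[max_on_disk:]


episodes = [
    Episode("episode_1.mp3", datetime(2024, 1, 1)),
    Episode("episode_2.mp3", datetime(2024, 1, 8)),
    Episode("episode_3.mp3", datetime(2024, 1, 15)),
]
keep, prune = split_keep_and_prune(episodes, max_on_disk=2)
print([e.filename for e in keep], [e.filename for e in prune])
```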
