When the script encounters a WARC whose filename does not include a seed ID, it exits abruptly. Example error output:
Unable to get seed ID for ARCHIVEIT-7963-MISSING_URLS_PATCH_CRAWL-JOB2615915-20251010182814580-00000-h3.warc.gz
Unable to make seed dictionary and cannot check for completeness.
I believe lines 96-101 in https://github.com/uga-libraries/web-aip/blob/main/web_functions.py are the culprit.
Expected behavior:
When a WARC without a seed ID is encountered, it should be added to an "other" group so that it can still be downloaded and checked for completeness.
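A rough sketch of the fallback I have in mind is below. It is not the code from web_functions.py. The function name `group_warcs_by_seed`, the `-SEED<digits>-` filename pattern, and the first example filename are assumptions made for illustration. Only the second filename is the real one from the error above.

```python
import re
from collections import defaultdict

# Assumption: the seed ID appears in the filename as "-SEED<digits>-".
SEED_PATTERN = re.compile(r"-SEED(\d+)-")


def group_warcs_by_seed(warc_filenames):
    """Group WARC filenames by seed ID, using "other" when no seed ID is found."""
    groups = defaultdict(list)
    for name in warc_filenames:
        match = SEED_PATTERN.search(name)
        # Fall back to "other" instead of exiting when the seed ID is missing.
        seed_id = match.group(1) if match else "other"
        groups[seed_id].append(name)
    return dict(groups)


warcs = [
    # Hypothetical filename that includes a seed ID.
    "ARCHIVEIT-7963-TEST-JOB1234567-SEED2345678-20251010182814580-00000-h3.warc.gz",
    # Filename from the error output above, which has no seed ID.
    "ARCHIVEIT-7963-MISSING_URLS_PATCH_CRAWL-JOB2615915-20251010182814580-00000-h3.warc.gz",
]
groups = group_warcs_by_seed(warcs)
```

With this approach, the patch crawl WARC would end up under the "other" key and the rest of the download and completeness check could continue for it.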