How to Mirror from a Bandersnatch, now we can use SIMPLE API and not xmlrpc #2098

@alankeyes

Please forgive me if this is an inappropriate way to open a support request; I am happy to move this to the correct forum if redirected.

I have created a bandersnatch mirror (1), served by the banderx server (with an SSL certificate added), in a Rocky Linux 9 VM, and it works great. I can point my pip.conf at this mirror and install packages as normal. What I would really like to do, though, is set up a second bandersnatch mirror (2) that mirrors from bandersnatch mirror (1).

In the /srv/pypi/web folder on the first VM, I've created a symlink named pypi that points back to that same folder, since that is apparently the path a bandersnatch mirror requests (based on earlier errors I encountered). In another VM, I've installed bandersnatch and changed /etc/bandersnatch.conf to use master = https://<first VM's IP address>. The only other modifications to this file are the same filtering plugins I have in bandersnatch mirror 1. When I go to update mirror (2), however, I get this error:

[root@yelena ~]# bandersnatch mirror --force-check
2025-12-02 14:48:23,578 INFO: No status file to move (/srv/pypi/status) - Full sync will occur (main.py:187)
2025-12-02 14:48:23,578 INFO: Selected storage backend: filesystem (configuration.py:131)
2025-12-02 14:48:23,578 INFO: Selected compare method: hash (configuration.py:179)
2025-12-02 14:48:23,592 INFO: considering /root/requirements.txt (allowlist_name.py:120)
2025-12-02 14:48:23,593 INFO: Initialized project plugin project_requirements, filtering ['alabaster'] (allowlist_name.py:33)
2025-12-02 14:48:23,611 INFO: Initialized exclude_platform plugin with ['.win32', '-win32', 'win_amd64', 'win-amd64', 'macosx_', 'macosx-', '.freebsd', '-freebsd'] (filename_name.py:116)
2025-12-02 14:48:23,615 INFO: Status file /srv/pypi/status missing. Starting over. (mirror.py:566)
2025-12-02 14:48:23,616 INFO: Syncing with https://10.29.99.96. (mirror.py:57)
2025-12-02 14:48:23,616 INFO: Current mirror serial: 0 (mirror.py:278)
2025-12-02 14:48:23,617 INFO: Syncing all packages. (mirror.py:293)
Traceback (most recent call last):
  File "/usr/local/bin/bandersnatch", line 8, in <module>
    sys.exit(main())
  File "/usr/local/lib/python3.9/site-packages/bandersnatch/main.py", line 226, in main
    return asyncio.run(async_main(args, config))
  File "/usr/lib64/python3.9/asyncio/runners.py", line 44, in run
    return loop.run_until_complete(main)
  File "/usr/lib64/python3.9/asyncio/base_events.py", line 647, in run_until_complete
    return future.result()
  File "/usr/local/lib/python3.9/site-packages/bandersnatch/main.py", line 191, in async_main
    return await bandersnatch.mirror.mirror(config)
  File "/usr/local/lib/python3.9/site-packages/bandersnatch/mirror.py", line 986, in mirror
    changed_packages = await mirror.synchronize(
  File "/usr/local/lib/python3.9/site-packages/bandersnatch/mirror.py", line 65, in synchronize
    await self.determine_packages_to_sync()
  File "/usr/local/lib/python3.9/site-packages/bandersnatch/mirror.py", line 297, in determine_packages_to_sync
    all_packages = await self.master.all_packages()
  File "/usr/local/lib/python3.9/site-packages/bandersnatch/master.py", line 201, in all_packages
    all_packages_with_serial = await self.rpc("list_packages_with_serial")
  File "/usr/local/lib/python3.9/site-packages/bandersnatch/master.py", line 196, in rpc
    return await method()
  File "/usr/local/lib/python3.9/site-packages/aiohttp_xmlrpc/client.py", line 122, in __remote_call
    return self._parse_response((await response.read()), method_name)
  File "/usr/local/lib/python3.9/site-packages/aiohttp_xmlrpc/client.py", line 82, in _parse_response
    response = etree.fromstring(body, parser)
  File "src/lxml/etree.pyx", line 3428, in lxml.etree.fromstring
  File "src/lxml/parser.pxi", line 2066, in lxml.etree._parseMemoryDocument
  File "src/lxml/parser.pxi", line 1919, in lxml.etree._parseDoc
  File "src/lxml/parser.pxi", line 1944, in lxml.etree._parseDoc_bytes
  File "src/lxml/parser.pxi", line 1194, in lxml.etree._BaseParser._parseDoc
  File "src/lxml/parser.pxi", line 647, in lxml.etree._ParserContext._handleParseResultDoc
  File "src/lxml/parser.pxi", line 765, in lxml.etree._handleParseResult
  File "src/lxml/parser.pxi", line 689, in lxml.etree._raiseParseError
  File "<string>", line 12
lxml.etree.XMLSyntaxError: Opening and ending tag mismatch: hr line 12 and body, line 12, column 18

On further digging, I was able to resolve that error by customizing the index.htm served by nginx to get rid of the mismatched hr tags. But then I get Invalid Body errors from lxml, because the HTML doesn't conform to the XML schema expected by aiohttp_xmlrpc (the client library in the traceback above), which I believe is xmlrpc.rng. This exhausts my knowledge of XML. I tried inspecting pypi.org to see if I could further match my index.htm to what they have there, since that is what bandersnatch mirror 1 is mirroring without errors, but I saw no obvious XML there.
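
For illustration, here is a minimal sketch of why a static HTML page cannot satisfy this call. It uses only the Python standard library (xmlrpc.client and xml.etree), not bandersnatch's or aiohttp_xmlrpc's own code. It builds the kind of XML-RPC request and response bodies that the list_packages_with_serial call in the traceback implies, and contrasts them with an HTML page. The serial number 12345 and the HTML snippet are made-up sample values, and the helper name parses_as_xml is hypothetical.

```python
import xmlrpc.client
import xml.etree.ElementTree as ET

# The body an XML-RPC client would POST for the method named in the traceback.
request_body = xmlrpc.client.dumps((), methodname="list_packages_with_serial")

# The kind of body an XML-RPC server would send back: a <methodResponse>
# document. The package name and serial are made-up sample values.
response_body = xmlrpc.client.dumps(({"alabaster": 12345},), methodresponse=True)

# A stand-in for a static index page (assumed content). It is fine as HTML,
# but <hr> is never closed, so a strict XML parser rejects it.
html_body = "<html><body><h1>Index of /</h1><hr></body></html>"


def parses_as_xml(body: str) -> bool:
    """Return True if body is well-formed XML, False otherwise."""
    try:
        ET.fromstring(body)
    except ET.ParseError:
        return False
    return True


# Decode the sample response the way an XML-RPC client would.
params, _ = xmlrpc.client.loads(response_body)

print(request_body)
print("request is XML: ", parses_as_xml(request_body))
print("response is XML:", parses_as_xml(response_body))
print("HTML is XML:    ", parses_as_xml(html_body))
print("decoded result: ", params[0])
```

The point of the sketch is that the response to an XML-RPC call is generated per request by a server that understands the method name. Cleaning up the tags in index.htm can make it well-formed, but it still would not be a methodResponse document, which is consistent with the Invalid Body errors described above.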

So my questions are: is bandersnatching a bandersnatch something I should expect to work? Is this actually an easy fix that I just don't recognize, being an embedded developer with no experience in web development? If so, what should I do?

Thank you so much for your patience!
