pdftools merge fails for some PDFs #14

@cjfp

Description

When I try to merge a PDF of a Virgin Mobile phone bill, pdftools crashes on Windows 7 / Cygwin.

$ pdftools merge -o test.pdf virgin.pdf
Traceback (most recent call last):
File "/usr/local/lib/python3.8/site-packages/PyPDF2/generic.py", line 229, in __new__
return decimal.Decimal.__new__(cls, utils.str_(value), context)
decimal.InvalidOperation: [<class 'decimal.ConversionSyntax'>]

During handling of the above exception, another exception occurred:

Traceback (most recent call last):
File "/usr/local/bin/pdftools", line 8, in <module>
sys.exit(main())
File "/usr/local/lib/python3.8/site-packages/pdftools/_cli.py", line 274, in main
pdf_merge(ARGS.src, ARGS.output, ARGS.delete)
File "/usr/local/lib/python3.8/site-packages/pdftools/pdftools.py", line 42, in pdf_merge
writer.write(outputfile)
File "/usr/local/lib/python3.8/site-packages/PyPDF2/pdf.py", line 482, in write
self._sweepIndirectReferences(externalReferenceMap, self._root)
File "/usr/local/lib/python3.8/site-packages/PyPDF2/pdf.py", line 571, in _sweepIndirectReferences
self._sweepIndirectReferences(externMap, realdata)
File "/usr/local/lib/python3.8/site-packages/PyPDF2/pdf.py", line 547, in _sweepIndirectReferences
value = self._sweepIndirectReferences(externMap, value)
File "/usr/local/lib/python3.8/site-packages/PyPDF2/pdf.py", line 571, in _sweepIndirectReferences
self._sweepIndirectReferences(externMap, realdata)
File "/usr/local/lib/python3.8/site-packages/PyPDF2/pdf.py", line 547, in _sweepIndirectReferences
value = self._sweepIndirectReferences(externMap, value)
File "/usr/local/lib/python3.8/site-packages/PyPDF2/pdf.py", line 556, in _sweepIndirectReferences
value = self._sweepIndirectReferences(externMap, data[i])
File "/usr/local/lib/python3.8/site-packages/PyPDF2/pdf.py", line 571, in _sweepIndirectReferences
self._sweepIndirectReferences(externMap, realdata)
File "/usr/local/lib/python3.8/site-packages/PyPDF2/pdf.py", line 547, in _sweepIndirectReferences
value = self._sweepIndirectReferences(externMap, value)
File "/usr/local/lib/python3.8/site-packages/PyPDF2/pdf.py", line 547, in _sweepIndirectReferences
value = self._sweepIndirectReferences(externMap, value)
File "/usr/local/lib/python3.8/site-packages/PyPDF2/pdf.py", line 547, in _sweepIndirectReferences
value = self._sweepIndirectReferences(externMap, value)
File "/usr/local/lib/python3.8/site-packages/PyPDF2/pdf.py", line 586, in _sweepIndirectReferences
newobj = self._sweepIndirectReferences(externMap, newobj)
File "/usr/local/lib/python3.8/site-packages/PyPDF2/pdf.py", line 547, in _sweepIndirectReferences
value = self._sweepIndirectReferences(externMap, value)
File "/usr/local/lib/python3.8/site-packages/PyPDF2/pdf.py", line 577, in _sweepIndirectReferences
newobj = data.pdf.getObject(data)
File "/usr/local/lib/python3.8/site-packages/PyPDF2/pdf.py", line 1611, in getObject
retval = readObject(self.stream, self)
File "/usr/local/lib/python3.8/site-packages/PyPDF2/generic.py", line 66, in readObject
return DictionaryObject.readFromStream(stream, pdf)
File "/usr/local/lib/python3.8/site-packages/PyPDF2/generic.py", line 579, in readFromStream
value = readObject(stream, pdf)
File "/usr/local/lib/python3.8/site-packages/PyPDF2/generic.py", line 92, in readObject
return NumberObject.readFromStream(stream)
File "/usr/local/lib/python3.8/site-packages/PyPDF2/generic.py", line 271, in readFromStream
return FloatObject(num)
File "/usr/local/lib/python3.8/site-packages/PyPDF2/generic.py", line 231, in __new__
return decimal.Decimal.__new__(cls, str(value))
decimal.InvalidOperation: [<class 'decimal.ConversionSyntax'>]

$ pip list
Package    Version
---------- -------
pdftools   2.0.2
pip        21.3.1
PyPDF2     1.26.0
setuptools 59.1.1

If I open the PDF in Adobe, optimize it, and save it to a new file, the merge works without problems. Do you have any suggestions for handling this from the command line? I wish I had a sample PDF to send that did not contain tons of private information.
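
For anyone trying to narrow this down: the last frames of the traceback show PyPDF2's FloatObject handing a numeric token from the file to decimal.Decimal, which rejects it. The sketch below reproduces only that failure mode with the standard library. The sample tokens and the helper name is_valid_real are made up for illustration; the token that actually trips the parser in virgin.pdf is not known from this report.

```python
import decimal


def is_valid_real(token: str) -> bool:
    """Return True if decimal.Decimal accepts the token.

    Hypothetical helper, not part of PyPDF2 or pdftools. It mirrors the
    conversion seen in the traceback: Decimal raises InvalidOperation
    (ConversionSyntax) when the text is not a well-formed number.
    """
    try:
        decimal.Decimal(token)
    except decimal.InvalidOperation:
        return False
    return True


if __name__ == "__main__":
    # Made-up tokens: two well-formed numbers and two malformed ones.
    for token in ["12.5", "-3", "0.0.5", "--1"]:
        status = "ok" if is_valid_real(token) else "rejected"
        print(f"{token!r}: {status}")
```

If the bill contains a malformed number like the rejected examples, that would be consistent with the merge succeeding after Adobe rewrites the file, although this is a guess rather than a confirmed diagnosis.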
