Merged
2 changes: 1 addition & 1 deletion CHANGELOG.md
@@ -7,7 +7,7 @@
* Add optional overrides for command-line arguments passed to `pdf2htmlEX`.
* Patch and build `pdf2htmlEX` as part of this build process to use `libopenjp` instead of `libjpeg` for JPEG-2000 support.
* All patches are in this source tree, and are applied directly to the source of the upstream tag during build.
-* Patch issue with non-breaking spaces in `pdf2HTMLEX`.
+* Patch issue with non-breaking spaces and tab characters in `pdf2HTMLEX`.
* Convert complex SVG images to bitmaps.

## 0.1.0
1 change: 1 addition & 0 deletions src/Pdf2Html/Dockerfile
@@ -19,6 +19,7 @@ RUN patch ./buildScripts/versionEnvs ./patches/versionEnvs.patch
RUN patch ./buildScripts/buildPoppler ./patches/buildPoppler.patch
RUN patch ./buildScripts/getBuildToolsApt ./patches/getBuildToolsApt.patch
RUN patch ./buildScripts/getDevLibrariesApt ./patches/getDevLibrariesApt.patch
+RUN patch ./pdf2htmlEX/src/util/unicode.cc ./patches/unicode.cc.patch
RUN patch ./pdf2htmlEX/src/util/unicode.h ./patches/unicode.h.patch
RUN patch ./pdf2htmlEX/CMakeLists.txt ./patches/CMakeLists.patch

18 changes: 18 additions & 0 deletions src/Pdf2Html/pdf2htmlEX/patches/unicode.cc.patch
@@ -0,0 +1,18 @@
@@ -47,6 +47,8 @@ Unicode unicode_from_font (CharCode code, GfxFont * font)
         if(cname)
         {
             Unicode ou = globalParams->mapNameToUnicodeText(cname);
+            if(ou == '\t')
+                return ' ';
             if(!is_illegal_unicode(ou))
                 return ou;
         }
@@ -62,6 +64,8 @@ Unicode check_unicode(Unicode const * u, int len, CharCode code, GfxFont * font)

     if(len == 1)
     {
+        if(*u == '\t')
+            return ' ';
         if(!is_illegal_unicode(*u))
             return *u;
     }
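
Note on the hunks above: both add the same guard, rewriting a tab to a plain space before the legality check runs. The following is a minimal standalone sketch of that ordering, not the pdf2htmlEX code itself. The `Unicode` alias stands in for Poppler's typedef, and `is_illegal_unicode_sketch`, `map_to_private_sketch`, and `normalize_sketch` are hypothetical names with deliberately simplified bodies (assumes C++14 or later).

```cpp
#include <cstdint>

// Stand-in for Poppler's `Unicode` typedef (an assumption for this sketch).
using Unicode = std::uint32_t;

// Hypothetical, simplified legality check: only C0 control characters are
// rejected here. The real is_illegal_unicode covers more ranges.
constexpr bool is_illegal_unicode_sketch(Unicode c)
{
    return c < 0x20;
}

// Hypothetical fallback: move an unusable code into the Private Use Area.
// The offset 0xE000 is an illustrative choice, not taken from the patch.
constexpr Unicode map_to_private_sketch(Unicode code)
{
    return code + 0xE000;
}

// Mirrors the order of checks in the patch: a tab is rewritten to a space
// first, so it never reaches the legality test or the fallback.
constexpr Unicode normalize_sketch(Unicode u, Unicode code)
{
    if (u == '\t')
        return ' ';
    if (!is_illegal_unicode_sketch(u))
        return u;
    return map_to_private_sketch(code);
}

// Compile-time checks of the three paths.
static_assert(normalize_sketch('\t', 9) == ' ', "tab becomes a space");
static_assert(normalize_sketch('A', 65) == 'A', "legal characters pass through");
static_assert(normalize_sketch(0x01, 1) == 0xE001, "other controls use the fallback");
```

Placing the tab test ahead of the legality test matters: under this sketch's rules a tab (0x09) would otherwise count as illegal and be remapped instead of being emitted as whitespace.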
9 changes: 9 additions & 0 deletions src/Pdf2Html/pdf2htmlEX/patches/unicode.h.patch
@@ -1,3 +1,12 @@
@@ -27,7 +27,7 @@ namespace pdf2htmlEX {
* 00(NUL)--09(\t)--0A(\n)--0D(\r)--20(SP)--7F(DEL)--9F(APC)--A0(NBSP)--AD(SHY)--061C(ALM)--1361(Ethiopic word space)
* webkit: [--------------------------------) [------------------) [-]
* moz: [--------------------------------) [---------] [-]
- * p2h: [--------------------------------) [------------------] [-] [-] [-]
+ * p2h: [--------------------------------) [------------------) [-] [-] [-]
*
* 200B(ZWSP)--200C(ZWNJ)--200D(ZWJ)--200E(LRM)--200F(RLM)--2028(LSEP)--2029(PSEP)--202A(LRE)--202E(RL0)--2066(LRI)--2069(PDI)
* webkit: [-----------------------------------------------] [----------]
@@ -39,9 +39,6 @@ namespace pdf2htmlEX {
* moz:
* p2h: [------------------] [-] [-] [-----------------]
4 changes: 2 additions & 2 deletions tests/E2E.Tests/Resources/CS_cheat_sheet.html
Git LFS file not shown