Merged
6 changes: 6 additions & 0 deletions CHANGELOG.md
@@ -1,5 +1,11 @@
# Changelog

## 0.3.0

* Switch to `cfl3` CFL patched fork instead of patching as part of this build.
* This improves support for certain font CMaps
* Remove `--tounicode` in favour of `--ignore-tounicode`, as `force` is no longer required.

## 0.2.2

* Patch memory corruption bug due to PNG background images being the incorrect size.
2 changes: 1 addition & 1 deletion README.md
@@ -28,4 +28,4 @@ docker run corefiling/pdf2html pdf2htmlEX:$version --help

Since pdf2htmlEX is licensed under the GPL, this project is too (see the LICENSE.TXT file).

As you can see from the build process, pdf2htmlEX itself is patched by the patches within this project (see [src/Pdf2Html/patches](tree/src/Pdf2Html/patches)), based on a clone of the upstream project tag we are targeting. As such we have not repeated pdf2htmlEX's source code here; you can find it via the link above.
As you can see from the build process, pdf2htmlEX itself is acquired from our fork of pdf2htmlEX/pdf2htmlEX, found here: https://github.com/CoreFiling/pdf2htmlEX/tree/feature/cfl-patches
34 changes: 3 additions & 31 deletions src/Pdf2Html/Dockerfile
@@ -3,40 +3,12 @@ FROM ubuntu:noble AS build-pdf2htmlex

# Produces a patched pdf2htmlEX using libopenjp2-7 instead of libjpeg to get JPEG2000 support.

ENV PDF2HTMLEX_BRANCH=
ENV UNATTENDED="--assume-yes"
ENV MAKE_PARALLEL="-j 4"
ENV PDF2HTMLEX_PREFIX=/usr/local
ENV DEBIAN_FRONTEND=noninteractive

WORKDIR /source
RUN apt update && apt install -y git patch sudo
RUN git clone --depth=1 --branch v0.18.8.rc1 https://github.com/pdf2htmlEX/pdf2htmlEX
RUN git clone --depth=1 --branch 0.18.8.rc1-cfl3 https://github.com/CoreFiling/pdf2htmlEX
WORKDIR /source/pdf2htmlEX

COPY ./pdf2htmlEX/patches ./patches
RUN patch ./buildScripts/versionEnvs ./patches/versionEnvs.patch
RUN patch ./buildScripts/buildPoppler ./patches/buildPoppler.patch
RUN patch ./buildScripts/getBuildToolsApt ./patches/getBuildToolsApt.patch
RUN patch ./buildScripts/getDevLibrariesApt ./patches/getDevLibrariesApt.patch
RUN patch ./pdf2htmlEX/src/BackgroundRenderer/SplashBackgroundRenderer.cc ./patches/SplashBackgroundRenderer.cc.patch
RUN patch ./pdf2htmlEX/src/util/unicode.cc ./patches/unicode.cc.patch
RUN patch ./pdf2htmlEX/src/util/unicode.h ./patches/unicode.h.patch
RUN patch ./pdf2htmlEX/CMakeLists.txt ./patches/CMakeLists.patch

RUN ./buildScripts/versionEnvs
RUN ./buildScripts/reportEnvs
RUN ./buildScripts/getBuildToolsApt
RUN ./buildScripts/getDevLibrariesApt
RUN ./buildScripts/getPoppler
RUN patch ./poppler/glib/poppler-enums.c.template ./patches/poppler-enums.c.template.patch
RUN patch ./poppler/glib/poppler-private.h ./patches/poppler-private.h.patch
RUN ./buildScripts/buildPoppler
RUN ./buildScripts/getFontforge
RUN patch ./fontforge/fontforge/tottfgpos.c ./patches/fontforge-tottfgpos.c.patch
RUN ./buildScripts/buildFontforge
RUN ./buildScripts/buildPdf2htmlEX
RUN ./buildScripts/installPdf2htmlEX
RUN ./buildScripts/buildInstallLocallyApt
RUN git config user.name "CoreFiling"
RUN git config user.email "opensource@corefiling.com"
RUN ./buildScripts/createDebianPackage
@@ -51,7 +23,7 @@ RUN apt update && apt install -y wget
RUN wget http://archive.ubuntu.com/ubuntu/pool/main/libj/libjpeg-turbo/libjpeg-turbo8_2.0.3-0ubuntu1_amd64.deb
RUN apt install -y ./libjpeg-turbo8_2.0.3-0ubuntu1_amd64.deb
COPY --from=build-pdf2htmlex /source/pdf2htmlEX/imageBuild/*.deb /pdf2htmlEX/
RUN apt install -y libjpeg62 libopenjp2-7 /pdf2htmlEX/pdf2htmlEX-0.18.8.rc1-cfl2-*-x86_64.deb
RUN apt install -y libjpeg62 libopenjp2-7 /pdf2htmlEX/pdf2htmlEX-0.18.8.rc1-cfl3-*-x86_64.deb

WORKDIR /app
COPY --from=build /app ./
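The fork tag cloned in the build stage and the `.deb` filename installed in the runtime stage both embed the same `0.18.8.rc1-cfl3` string, so the two need to be bumped together (this PR changes `cfl2` to `cfl3` in the install line for that reason). A minimal shell sketch of deriving both from one value is shown here. `PDF2HTMLEX_TAG` and `deb_glob` are hypothetical names that do not appear in this Dockerfile, and the filename pattern is simply copied from the `apt install` line above.

```shell
#!/bin/sh
# Hypothetical single source of truth for the fork tag (not part of this PR).
PDF2HTMLEX_TAG="0.18.8.rc1-cfl3"

# Glob for the package built by createDebianPackage, following the pattern
# used in the runtime stage's `apt install` line.
deb_glob="pdf2htmlEX-${PDF2HTMLEX_TAG}-*-x86_64.deb"

# Print the clone command and the package path that go with this tag.
echo "git clone --depth=1 --branch ${PDF2HTMLEX_TAG} https://github.com/CoreFiling/pdf2htmlEX"
echo "/pdf2htmlEX/${deb_glob}"
```

In a Dockerfile the same idea would probably be expressed with an `ARG` shared by both stages; the script form is used here only so the string handling can be checked on its own.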
2 changes: 1 addition & 1 deletion src/Pdf2Html/Pdf2Html.csproj
@@ -4,7 +4,7 @@
<TargetFramework>net8.0</TargetFramework>
<Nullable>enable</Nullable>
<ImplicitUsings>enable</ImplicitUsings>
<Version>0.2.2</Version>
<Version>0.3.0</Version>
<AssemblyName>Pdf2Html</AssemblyName>
<RootNamespace>Pdf2Html</RootNamespace>
</PropertyGroup>
3 changes: 1 addition & 2 deletions src/Pdf2Html/appsettings.json
@@ -6,8 +6,7 @@
"Printing": false,
"BgFormat": "svg",
"SvgNodeCountLimit": 100,
"DecomposeLigature": true,
"Tounicode": true
"DecomposeLigature": true
},
"Logging": {
"LogLevel": {
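Because this hunk drops the `"Tounicode"` key from `appsettings.json`, deployments that override the settings file may still carry the old key. A small shell sketch for spotting that is shown here. `has_stale_tounicode` is a hypothetical helper that is not part of this repository, and it only detects the removed key; it makes no assumption about what setting, if any, replaces it.

```shell
#!/bin/sh
# Hypothetical check: succeeds (exit 0) when the given settings file still
# contains the "Tounicode" key that this change removes.
has_stale_tounicode() {
    # The leading quote keeps longer key names that merely end in
    # "Tounicode" from matching.
    grep -q '"Tounicode"[[:space:]]*:' "$1"
}
```

A typical use would be `has_stale_tounicode src/Pdf2Html/appsettings.json && echo "remove the Tounicode key"`.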
19 changes: 0 additions & 19 deletions src/Pdf2Html/pdf2htmlEX/patches/CMakeLists.patch

This file was deleted.

20 changes: 0 additions & 20 deletions src/Pdf2Html/pdf2htmlEX/patches/SplashBackgroundRenderer.cc.patch

This file was deleted.

9 changes: 0 additions & 9 deletions src/Pdf2Html/pdf2htmlEX/patches/buildPoppler.patch

This file was deleted.

13 changes: 0 additions & 13 deletions src/Pdf2Html/pdf2htmlEX/patches/fontforge-tottfgpos.c.patch

This file was deleted.

7 changes: 0 additions & 7 deletions src/Pdf2Html/pdf2htmlEX/patches/getBuildToolsApt.patch

This file was deleted.

5 changes: 0 additions & 5 deletions src/Pdf2Html/pdf2htmlEX/patches/getDevLibrariesApt.patch

This file was deleted.

18 changes: 0 additions & 18 deletions src/Pdf2Html/pdf2htmlEX/patches/poppler-enums.c.template.patch

This file was deleted.

15 changes: 0 additions & 15 deletions src/Pdf2Html/pdf2htmlEX/patches/poppler-private.h.patch

This file was deleted.

18 changes: 0 additions & 18 deletions src/Pdf2Html/pdf2htmlEX/patches/unicode.cc.patch

This file was deleted.

28 changes: 0 additions & 28 deletions src/Pdf2Html/pdf2htmlEX/patches/unicode.h.patch

This file was deleted.

9 changes: 0 additions & 9 deletions src/Pdf2Html/pdf2htmlEX/patches/versionEnvs.patch

This file was deleted.