Contributor

@jaaaaavier jaaaaavier commented Nov 24, 2025

Description

The file compressor had an issue with PDF compression: PDFs were only being compressed by 5% using basic pdf-lib optimization.

After investigating, I found that the compressor could easily be improved by adding:
- Real compression logic: the compressPDF function only used the useObjectStreams and addDefaultPage flags, which barely reduced file size.
- A real implementation: there was no dedicated PDF compression module, just a basic wrapper around pdf-lib's save function.

Proposed Improvement

  1. Content Analysis: Detect whether PDFs contain selectable text or are scanned images
  2. Dual Strategy Compression:
    • Text-based PDFs: Preserve document structure and text selectability
    • Image-based PDFs: Apply high-quality image compression with rendering

Implementation

Created a dedicated PDF compression module with multiple strategies:
1. Content Detection
- hasSelectableText(): Uses PDF.js to detect if PDF contains selectable text
- analyzePDFContent(): Analyzes PDF to determine optimal compression strategy
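
A minimal sketch of how the detection result could drive the strategy choice, assuming the per-page text has already been extracted (for instance by joining the `str` fields of the items returned by PDF.js `page.getTextContent()`). The name `analyzePageTexts` and the `minCharsPerPage` and `textPageRatio` thresholds are hypothetical and not taken from this PR.

```typescript
// Hypothetical sketch: names and thresholds are illustrative, not from the PR.
type CompressionStrategy = "basic-optimization" | "ultra-quality-rendering";

interface ContentAnalysis {
  totalPages: number;
  pagesWithText: number;
  hasSelectableText: boolean;
  strategy: CompressionStrategy;
}

// pageTexts[i] holds the text extracted from page i + 1 (empty for scanned pages).
function analyzePageTexts(
  pageTexts: string[],
  minCharsPerPage = 20, // assumed: fewer non-whitespace characters counts as "no text"
  textPageRatio = 0.5, // assumed: share of pages that must contain text
): ContentAnalysis {
  const totalPages = pageTexts.length;

  // Count pages whose non-whitespace text reaches the threshold.
  const pagesWithText = pageTexts.filter(
    (text) => text.replace(/\s+/g, "").length >= minCharsPerPage,
  ).length;

  const hasSelectableText =
    totalPages > 0 && pagesWithText / totalPages >= textPageRatio;

  return {
    totalPages,
    pagesWithText,
    hasSelectableText,
    // Text-based PDFs keep their structure; scanned PDFs go through rendering.
    strategy: hasSelectableText ? "basic-optimization" : "ultra-quality-rendering",
  };
}
```

Using a page ratio rather than a single "any text found" flag is one possible way to keep a scanned document with a stray text layer on one page from being treated as text-based.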
2. Compression Methods
1. Basic Optimization (for text-based PDFs):
- Removes metadata (title, author, subject, keywords, producer, creator)
- Enables object streams compression
- Preserves all text, links, and document structure
- Results in 5-15% compression while maintaining full functionality
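
As a rough illustration of the basic optimization path, the sketch below blanks the metadata fields and collects the save options. It is written against a small structural interface so it runs without dependencies; the setter names mirror pdf-lib's `PDFDocument` (`setTitle`, `setAuthor`, `setSubject`, `setKeywords`, `setProducer`, `setCreator`), while `MetadataWritable`, `stripMetadata`, `BASIC_SAVE_OPTIONS`, and `reductionPercent` are hypothetical names. It overwrites the fields with empty values; whether the PR removes the entries outright is not shown here.

```typescript
// Hypothetical sketch: the interface mirrors pdf-lib's PDFDocument metadata setters.
interface MetadataWritable {
  setTitle(title: string): void;
  setAuthor(author: string): void;
  setSubject(subject: string): void;
  setKeywords(keywords: string[]): void;
  setProducer(producer: string): void;
  setCreator(creator: string): void;
}

// Save options named in the PR description; both are pdf-lib SaveOptions fields.
const BASIC_SAVE_OPTIONS = {
  useObjectStreams: true,
  addDefaultPage: false,
} as const;

// Blank every metadata field listed in the description.
function stripMetadata(doc: MetadataWritable): void {
  doc.setTitle("");
  doc.setAuthor("");
  doc.setSubject("");
  doc.setKeywords([]);
  doc.setProducer("");
  doc.setCreator("");
}

// Size reduction as a percentage of the original size.
function reductionPercent(originalBytes: number, compressedBytes: number): number {
  if (originalBytes <= 0) return 0;
  return ((originalBytes - compressedBytes) * 100) / originalBytes;
}
```

With pdf-lib this would presumably be wired up along the lines of `const doc = await PDFDocument.load(bytes); stripMetadata(doc); const out = await doc.save(BASIC_SAVE_OPTIONS);`, with `reductionPercent(bytes.length, out.length)` reporting the figure quoted above.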
2. Ultra Quality Rendering (for scanned PDFs):
- Renders each page to canvas at high resolution
- Supports PNG (lossless) or JPEG (high quality) output
- Implements 2x super-sampling for better anti-aliasing
- Uses high-quality image smoothing during rendering
- Results in 60-70% compression with good visual quality
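
The super-sampling step can be sketched as a small planning helper: render each page at 2x the target size, then downscale onto the output canvas with high-quality smoothing. The name `planRender`, the 150 DPI default, the 0.92 JPEG quality, and the assumption that page sizes are given in points (72 per inch) are illustrative only; the canvas calls mentioned in the comments are standard browser APIs, not code from this PR.

```typescript
// Hypothetical sketch: names and defaults are illustrative, not from the PR.
type OutputFormat = "png" | "jpeg";

interface RenderPlan {
  renderWidth: number; // size of the super-sampled canvas
  renderHeight: number;
  outputWidth: number; // size of the final image
  outputHeight: number;
  mimeType: "image/png" | "image/jpeg";
  quality?: number; // only meaningful for JPEG
}

function planRender(
  pageWidthPt: number,
  pageHeightPt: number,
  targetDpi = 150, // assumed default
  format: OutputFormat = "jpeg",
  superSample = 2, // 2x super-sampling, as in the description
  jpegQuality = 0.92, // assumed default
): RenderPlan {
  const scale = targetDpi / 72; // PDF points are 1/72 inch
  const outputWidth = Math.round(pageWidthPt * scale);
  const outputHeight = Math.round(pageHeightPt * scale);

  return {
    renderWidth: outputWidth * superSample,
    renderHeight: outputHeight * superSample,
    outputWidth,
    outputHeight,
    mimeType: format === "png" ? "image/png" : "image/jpeg",
    quality: format === "jpeg" ? jpegQuality : undefined,
  };
}

// In a browser, the plan would be applied roughly like this:
//   ctx.imageSmoothingEnabled = true;
//   ctx.imageSmoothingQuality = "high";
//   ctx.drawImage(renderCanvas, 0, 0, plan.outputWidth, plan.outputHeight);
//   outputCanvas.toBlob(callback, plan.mimeType, plan.quality);
```

For a US Letter page (612 x 792 points) at 144 DPI, this plans a 2448 x 3168 render canvas that is downscaled to a 1224 x 1584 output image.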
3. Updated the main service to use the new module:

@jaaaaavier jaaaaavier requested a review from xabg2 as a code owner November 24, 2025 09:42

vercel bot commented Nov 24, 2025

The latest updates on your projects.

Project   Deployment   Preview   Comments   Updated (UTC)
website   Ready        Preview   Comment    Nov 24, 2025 9:48am

@jaaaaavier jaaaaavier changed the title from "[MKT-652]:feat/File compressor tool improve" to "[MKT-652]:feat/FileCompressor tool quality improve" Nov 24, 2025
@jaaaaavier jaaaaavier marked this pull request as draft November 24, 2025 09:44
@sonarqubecloud

Quality Gate failed

Failed conditions
0.0% Coverage on New Code (required ≥ 80%)


@jaaaaavier jaaaaavier marked this pull request as ready for review November 24, 2025 10:04
Collaborator

Check Sonar issues

Collaborator

@xabg2 xabg2 left a comment

Also, consider creating a folder called compressor (or something similar) and moving the file-compressor and pdf-compressor files into it. In addition, rename the pdfCompressor file to match the pattern pdf-compressor.service.ts.
