feat: support TIFF images in DOCX rendering #2284

Merged
caio-pizzol merged 12 commits into main from feat/tiff-image-support
Mar 4, 2026

Conversation

@caio-pizzol

@caio-pizzol caio-pizzol commented Mar 4, 2026

Summary

  • Converts TIFF images to PNG at import time using utif2, since browsers cannot render TIFF natively
  • Supports all TIFF compression types: PackBits, LZW, Deflate, JPEG, CCITT Group 3/4, and uncompressed
  • Handles all color modes: RGB, RGBA, grayscale, 16-bit, bilevel, palette, CMYK
  • Pre-decode dimension validation (100M pixel limit) to prevent DoS from malicious TIFF dimensions
  • Round-trip export preserves original TIFF via originalSrc/originalExtension attributes
  • Correct .tif → image/tiff MIME mapping in DocxZipper
  • Shared dataUriToArrayBuffer helper extracted to helpers.js
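
The round-trip bullet above can be pictured roughly as follows. This is a minimal sketch, not the PR's code: the attribute names `originalSrc` and `originalExtension` come from the summary, while `pickExportSource`, the `extension` attribute, and the shape of the attrs object are hypothetical.

```javascript
// Hypothetical helper: choose which image payload to write back into the DOCX.
// If the image was converted at import time (TIFF -> PNG), prefer the preserved
// original so the exported file still contains the TIFF.
function pickExportSource(attrs) {
  if (attrs.originalSrc && attrs.originalExtension) {
    return { src: attrs.originalSrc, extension: attrs.originalExtension };
  }
  // No conversion happened: export what is currently displayed.
  // `attrs.extension` is an assumed attribute used only for this sketch.
  return { src: attrs.src, extension: attrs.extension ?? null };
}

// Usage: a node that was converted at import time exports its original TIFF.
const exported = pickExportSource({
  src: 'data:image/png;base64,AAAA',
  originalSrc: 'data:image/tiff;base64,BBBB',
  originalExtension: 'tiff',
});
console.log(exported.extension);
```

Keeping the displayed source and the original side by side means the editor never has to re-encode PNG back to TIFF on export.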

Test plan

  • Unit tests for tiff-converter.js (14 tests: decode pipeline, dimension guard, edge cases)
  • Unit tests for DocxZipper.js (TIFF MIME mapping)
  • Unit tests for encode-image-node-helpers.js (TIFF routing and fallback)
  • Behavior test: load-doc-with-tiff.spec.ts (E2E import + PNG verification)
  • Manual testing with 24 TIFF variants (all compression types and color modes)

Based on the original implementation by @gpardhivvarma in #2193.

@caio-pizzol caio-pizzol self-assigned this Mar 4, 2026
@caio-pizzol caio-pizzol requested review from VladaHarbour and harbournick and removed request for harbournick March 4, 2026 13:29
@caio-pizzol caio-pizzol linked an issue Mar 4, 2026 that may be closed by this pull request
@github-actions

github-actions bot commented Mar 4, 2026

Status: PASS

The spec-relevant parts of this PR are clean. The core OOXML concern here is the [Content_Types].xml mapping for .tif files — ECMA-376 §15.2.14 (Image Part) explicitly lists image/tiff as a recognized content type, and the PR correctly maps the .tif extension to ContentType="image/tiff" rather than the incorrect image/tif. That's exactly what the spec requires.

The rest of the changes are runtime behavior — TIFF→PNG conversion in the browser, refactoring base64ToArrayBuffer into a shared dataUriToArrayBuffer helper, and updating test mocks to use importOriginal() so newly-exported helpers aren't silently swallowed. None of that touches OOXML element or attribute semantics.

A few non-spec observations worth mentioning:

  • The [Content_Types].xml Default element for .tif will have Extension="tif" with ContentType="image/tiff". This is correct — the extension and the MIME subtype are intentionally different, and the PR handles that distinction properly.
  • The pixel-count guard (MAX_PIXEL_COUNT = 100_000_000) in tiff-converter.js is a good defensive practice given that TIFF dimensions are declared in the IFD before pixel data is decoded.
  • The utif2 dependency decodes IFD tags as t256/t257 (decimal tag numbers) — those correspond to TIFF tags 0x0100 (ImageWidth) and 0x0101 (ImageLength), which are correct.
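
The extension/MIME distinction in the first observation can be illustrated with a small sketch. The names `MIME_TYPE_FOR_EXT` and `mimeTypeForExt` appear in the commit messages, but the contents of the mapping, the fallback rule, and `defaultElementFor` are assumptions made for illustration only.

```javascript
// Assumed contents: only extensions whose MIME subtype differs from the
// extension itself need an explicit entry.
const MIME_TYPE_FOR_EXT = new Map([
  ['tif', 'image/tiff'],
  ['jpg', 'image/jpeg'],
]);

// Resolve the MIME type for an image extension, falling back to image/<ext>.
function mimeTypeForExt(ext) {
  const key = ext.toLowerCase();
  return MIME_TYPE_FOR_EXT.get(key) ?? `image/${key}`;
}

// Hypothetical helper: build the <Default> element for [Content_Types].xml.
function defaultElementFor(ext) {
  return `<Default Extension="${ext}" ContentType="${mimeTypeForExt(ext)}"/>`;
}

console.log(defaultElementFor('tif'));
// → <Default Extension="tif" ContentType="image/tiff"/>
```

Without the explicit entry, the naive `image/${ext}` fallback would emit `image/tif`, which is the invalid type the PR avoids.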

gpardhivvarma and others added 12 commits March 4, 2026 10:56
TIFF images in DOCX files rendered as broken icons because browsers
cannot natively display image/tiff. Convert TIFF to PNG at import time
using utif2, following the existing EMF/WMF → SVG conversion pattern.

Closes #2064
Reject TIFF images exceeding 100M pixels before allocating RGBA buffers
or canvas, preventing a malicious TIFF with extreme dimensions from
freezing or crashing the tab during import.
Move MAX_PIXEL_COUNT check before UTIF.decodeImage/toRGBA8 so oversized
TIFFs are rejected before allocating the RGBA buffer.

Map .tif extension to image/tiff in Content_Types.xml generation to
avoid emitting the invalid MIME type image/tif.
UTIF.decode populates raw tag entries (t256/t257) but .width/.height
are only set after decodeImage. Read from raw tags so the pixel limit
guard works before the expensive decode step without rejecting valid
files.
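
The pre-decode guard described above might look roughly like this. It is a sketch under stated assumptions: `MAX_PIXEL_COUNT` and the `t256`/`t257` tag keys come from the PR, the sketch assumes the decoder stores each raw tag value as an array-like under those keys, and `readTagNumber` and `isWithinPixelLimit` are hypothetical names.

```javascript
const MAX_PIXEL_COUNT = 100_000_000;

// Read a numeric tag from a decoded IFD. Assumes raw tag values are stored as
// array-likes (e.g. ifd.t256 = [width]); a bare number is also accepted.
function readTagNumber(ifd, key) {
  const raw = ifd?.[key];
  const value = raw !== null && typeof raw === 'object' ? raw[0] : raw;
  return Number.isFinite(value) ? value : null;
}

// Tag 256 is ImageWidth and tag 257 is ImageLength. Missing or non-positive
// dimensions are rejected, as are images above the pixel limit.
function isWithinPixelLimit(ifd) {
  const width = readTagNumber(ifd, 't256');
  const height = readTagNumber(ifd, 't257');
  if (width === null || height === null || width <= 0 || height <= 0) return false;
  return width * height <= MAX_PIXEL_COUNT;
}

// Usage: check the IFD before calling the expensive decode step.
console.log(isWithinPixelLimit({ t256: [100000], t257: [100000] })); // → false
```

Because the check only touches the tag entries, it runs before any RGBA buffer or canvas is allocated.
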
- Use mimeTypeForExt mapping for .tif data URIs (image/tiff not image/tif)
- Remove unused size arg from convertTiffToPng call
- Add happy-path test asserting valid TIFF produces PNG data URI
- Remove unused Uint8Array/ArrayBufferView branches (only strings are passed)
- Add handleImageNode test verifying convertTiffToPng is called for .tif files
Adds a Playwright test that loads a minimal DOCX containing a TIFF image
and verifies the full pipeline: DocxZipper → convertTiffToPng → rendered PNG.
… constants

Replace unmaintained utif2 with actively maintained image-js/tiff for
TIFF decoding. Extract duplicated IMAGE_EXTS and MIME_TYPE_FOR_EXT
mappings in DocxZipper.js to module-level constants.
Use decode(buffer, { ignoreImageData: true }) to check dimensions
before allocating pixel data, preventing DoS from small compressed
TIFFs with huge dimensions. Normalize Uint16Array and Float32Array
pixel data to 8-bit for canvas compatibility.
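
The 8-bit normalization mentioned above could be sketched as follows. The scaling rules are assumptions, not taken from the PR: 16-bit samples keep their high byte, and float samples are assumed to lie in the 0 to 1 range. `normalizeTo8Bit` is a hypothetical name.

```javascript
// Convert decoded sample data to 8-bit values suitable for canvas ImageData.
function normalizeTo8Bit(data) {
  if (data instanceof Uint8Array || data instanceof Uint8ClampedArray) {
    return new Uint8ClampedArray(data); // already 8-bit, just copy
  }
  const out = new Uint8ClampedArray(data.length);
  if (data instanceof Uint16Array) {
    // Assumption: keep the most significant byte of each 16-bit sample.
    for (let i = 0; i < data.length; i++) out[i] = data[i] >> 8;
  } else if (data instanceof Float32Array) {
    // Assumption: floats are in [0, 1]. Uint8ClampedArray clamps out-of-range
    // values to 0..255 on assignment.
    for (let i = 0; i < data.length; i++) out[i] = data[i] * 255;
  } else {
    throw new TypeError('Unsupported pixel data type');
  }
  return out;
}

console.log(Array.from(normalizeTo8Bit(new Uint16Array([0, 32768, 65535]))));
// → [0, 128, 255]
```
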
…red dataUriToArrayBuffer helper

Address remaining PR review feedback: add tests for .tif → image/tiff MIME mapping
(import data URI and export Content_Types), TIFF conversion failure fallback alt text,
greyscale/grey+alpha/Uint16/Float32 toRGBA branches, and extract duplicate data-URI-stripping
logic from metafile-converter and tiff-converter into shared dataUriToArrayBuffer in helpers.js.
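
A shared helper of the kind described above might look like this. Only the name `dataUriToArrayBuffer` comes from the PR; the accepted inputs (a data URI or a bare base64 string) and the error conditions are assumptions for this sketch. It relies on the global `atob`, which is available in browsers and in recent Node.js versions.

```javascript
// Decode a base64 data URI (or a bare base64 string) into an ArrayBuffer.
function dataUriToArrayBuffer(input) {
  if (typeof input !== 'string' || input.length === 0) {
    throw new TypeError('Expected a non-empty base64 string or data URI');
  }
  let base64 = input;
  if (input.startsWith('data:')) {
    const comma = input.indexOf(',');
    if (comma === -1) throw new Error('Malformed data URI: missing comma');
    base64 = input.slice(comma + 1); // drop the "data:<mime>;base64," prefix
  }
  const binary = atob(base64);
  const bytes = new Uint8Array(binary.length);
  for (let i = 0; i < binary.length; i++) bytes[i] = binary.charCodeAt(i);
  return bytes.buffer;
}

// Usage: the first four bytes of a little-endian TIFF header ("II*\0").
const header = new Uint8Array(dataUriToArrayBuffer('data:image/tiff;base64,SUkqAA=='));
console.log(Array.from(header)); // → [73, 73, 42, 0]
```
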
image-js/tiff lacks support for PackBits, JPEG, and CCITT compression
formats commonly found in Word documents. utif2 handles all TIFF
compression types via its toRGBA8 pipeline. Updated tests to match
utif2 API (decode → decodeImage → toRGBA8).
- createCanvas() now checks domEnvironment first, fixing silent failures
  in JSDOM environments where global document lacks canvas support
- Add dataUriToArrayBuffer unit tests covering all input branches and
  both throw paths
- Add explanatory comment for query-string module re-imports in tests
@caio-pizzol caio-pizzol force-pushed the feat/tiff-image-support branch from 419d3bf to 1f0dcc5 Compare March 4, 2026 13:56
@caio-pizzol caio-pizzol merged commit 6436d86 into main Mar 4, 2026
10 checks passed
@caio-pizzol caio-pizzol deleted the feat/tiff-image-support branch March 4, 2026 14:31
@superdoc-bot

superdoc-bot bot commented Mar 4, 2026

🎉 This PR is included in superdoc v1.18.0-next.7

The release is available on GitHub release

@superdoc-bot

superdoc-bot bot commented Mar 4, 2026

🎉 This PR is included in superdoc-cli v0.2.0-next.78

The release is available on GitHub release


Development

Successfully merging this pull request may close these issues.

Feature: Support TIFF images in DOCX rendering