Skip to content

feat: support TIFF images in DOCX rendering#2193

Closed
gpardhivvarma wants to merge 11 commits intosuperdoc-dev:mainfrom
gpardhivvarma:feat/tiff-image-support
Closed

feat: support TIFF images in DOCX rendering#2193
gpardhivvarma wants to merge 11 commits intosuperdoc-dev:mainfrom
gpardhivvarma:feat/tiff-image-support

Conversation

@gpardhivvarma
Copy link
Contributor

Summary

  • Converts TIFF images to PNG at import time using utif2, following the existing EMF/WMF → SVG conversion pattern
  • Adds tif to DocxZipper image extension sets so .tif files are properly extracted as data URIs
  • Gracefully falls back to "Unable to render image" alt text if conversion fails (no canvas, invalid data, etc.)
  • Round-trip export preserves original TIFF binary via originalSrc/originalExtension attributes

Closes #2064

Test plan

  • Unit tests for isTiffExtension (tiff/tif/TIFF/other/null/undefined)
  • Unit tests for convertTiffToPng (invalid data, null, empty, non-TIFF base64)
  • All 689 existing test files pass (6647 tests)
  • Manual test: open a DOCX with TIFF images in dev server — footer TIFF should render as PNG
  • Export round-trip: open DOCX, export, re-open — image renders and original TIFF preserved in zip

Copy link

@chatgpt-codex-connector chatgpt-codex-connector bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

💡 Codex Review

Here are some automated review suggestions for this pull request.

Reviewed commit: b6335817b6

ℹ️ About Codex in GitHub

Codex has been enabled to automatically review pull requests in this repo. Reviews are triggered when you

  • Open a pull request for review
  • Mark a draft as ready
  • Comment "@codex review".

If Codex has suggestions, it will comment; otherwise it will react with 👍.

When you sign up for Codex through ChatGPT, Codex can also answer questions or update the PR, like "@codex address that feedback".

@gpardhivvarma
Copy link
Contributor Author

@codex review

Copy link

@chatgpt-codex-connector chatgpt-codex-connector bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

💡 Codex Review

Here are some automated review suggestions for this pull request.

Reviewed commit: 5f0832d14d

ℹ️ About Codex in GitHub

Codex has been enabled to automatically review pull requests in this repo. Reviews are triggered when you

  • Open a pull request for review
  • Mark a draft as ready
  • Comment "@codex review".

If Codex has suggestions, it will comment; otherwise it will react with 👍.

When you sign up for Codex through ChatGPT, Codex can also answer questions or update the PR, like "@codex address that feedback".

@gpardhivvarma
Copy link
Contributor Author

@codex review

Copy link

@chatgpt-codex-connector chatgpt-codex-connector bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

💡 Codex Review

Here are some automated review suggestions for this pull request.

Reviewed commit: 1c061d636b

ℹ️ About Codex in GitHub

Codex has been enabled to automatically review pull requests in this repo. Reviews are triggered when you

  • Open a pull request for review
  • Mark a draft as ready
  • Comment "@codex review".

If Codex has suggestions, it will comment; otherwise it will react with 👍.

When you sign up for Codex through ChatGPT, Codex can also answer questions or update the PR, like "@codex address that feedback".

@gpardhivvarma
Copy link
Contributor Author

@codex review

@chatgpt-codex-connector
Copy link

Codex Review: Didn't find any major issues. Keep them coming!

ℹ️ About Codex in GitHub

Codex has been enabled to automatically review pull requests in this repo. Reviews are triggered when you

  • Open a pull request for review
  • Mark a draft as ready
  • Comment "@codex review".

If Codex has suggestions, it will comment; otherwise it will react with 👍.

When you sign up for Codex through ChatGPT, Codex can also answer questions or update the PR, like "@codex address that feedback".

caio-pizzol

This comment was marked as duplicate.

@caio-pizzol caio-pizzol self-requested a review February 27, 2026 11:07
Copy link
Contributor

@caio-pizzol caio-pizzol left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

nice work - clean structure, follows the metafile converter pattern well, DoS guard is in the right place.

added a few comments

Copy link
Contributor

@caio-pizzol caio-pizzol left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@gpardhivvarma all three comments from last round are resolved, and the overall approach looks good. left two small inline suggestions — nothing blocking, just minor cleanup.

@gpardhivvarma
Copy link
Contributor Author

@caio-pizzol made the requested additions please take a look at them

Copy link
Contributor

@caio-pizzol caio-pizzol left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@gpardhivvarma all the feedback from the previous rounds is addressed — thanks for working through it. the code looks clean, left one inline suggestion about an integration test you could add following an existing pattern. not blocking. we'll handle uploading a TIFF doc for visual regression testing on our end since that requires R2 credentials.


Integration test suggestion

there's an existing pattern in tests/behavior/tests/importing/load-doc-with-pict.spec.ts that loads a .docx, waits for stable, and asserts the editor is functional — a similar test for a TIFF-containing .docx would verify the full pipeline end-to-end (DocxZipper → converter → convertTiffToPng → rendered img). would you be able to create a simple .docx with a TIFF image and follow the same pattern?

@gpardhivvarma
Copy link
Contributor Author

@gpardhivvarma all the feedback from the previous rounds is addressed — thanks for working through it. the code looks clean, left one inline suggestion about an integration test you could add following an existing pattern. not blocking. we'll handle uploading a TIFF doc for visual regression testing on our end since that requires R2 credentials.

Integration test suggestion

there's an existing pattern in tests/behavior/tests/importing/load-doc-with-pict.spec.ts that loads a .docx, waits for stable, and asserts the editor is functional — a similar test for a TIFF-containing .docx would verify the full pipeline end-to-end (DocxZipper → converter → convertTiffToPng → rendered img). would you be able to create a simple .docx with a TIFF image and follow the same pattern?

Sure would be happy to help, will update once done

@gpardhivvarma
Copy link
Contributor Author

@caio-pizzol please take a look

Copy link
Contributor

@caio-pizzol caio-pizzol left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@gpardhivvarma solid work on this — the converter follows the existing metafile-converter pattern nicely, the pixel limit guard is well-placed, and the new behavior test with a real fixture is a good addition.

one minor suggestion: the imageTypes/imageExts and mimeTypeForExt mappings in DocxZipper.js are duplicated across two methods — could be extracted to module-level constants. not blocking, just a cleanup for later.

lgtm, approving.

@caio-pizzol caio-pizzol requested a review from harbournick March 1, 2026 10:34
@caio-pizzol
Copy link
Contributor

@gpardhivvarma

I noticed utif2 is unmaintained (last publish May 2023, open bugs for blank output and visual artifacts on some files). works fine for typical DOCX TIFFs since it covers LZW/JPEG/CCITT, but if we hit rendering bugs down the road, https://github.com/image-js/tiff is the actively maintained alternative — only gap is no CCITT support.

maybe worth refactoring?

@caio-pizzol caio-pizzol self-requested a review March 1, 2026 10:46
@gpardhivvarma
Copy link
Contributor Author

@gpardhivvarma

I noticed utif2 is unmaintained (last publish May 2023, open bugs for blank output and visual artifacts on some files). works fine for typical DOCX TIFFs since it covers LZW/JPEG/CCITT, but if we hit rendering bugs down the road, https://github.com/image-js/tiff is the actively maintained alternative — only gap is no CCITT support.

maybe worth refactoring?

Yeah sounds good will work on the refactor and update the PR

Copy link
Contributor

@caio-pizzol caio-pizzol left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@gpardhivvarma the utif2tiff swap is clean and all prior feedback is addressed.

two new things from this round: decode() loads all pixel data up front, so the size guard runs too late — left a suggestion using { ignoreImageData: true } to check dimensions first. also, 16-bit TIFFs would get wrong colors because of a type mismatch — left a snippet for that too.

on tests: three small gaps — TIFF failure fallback in handleImageNode, the greyscale/alpha branches in toRGBA, and .tif MIME mapping in DocxZipper. not blocking, just nice to have.

@gpardhivvarma gpardhivvarma force-pushed the feat/tiff-image-support branch from 643836a to ee33535 Compare March 2, 2026 05:53
TIFF images in DOCX files rendered as broken icons because browsers
cannot natively display image/tiff. Convert TIFF to PNG at import time
using utif2, following the existing EMF/WMF → SVG conversion pattern.

Closes superdoc-dev#2064
Reject TIFF images exceeding 100M pixels before allocating RGBA buffers
or canvas, preventing a malicious TIFF with extreme dimensions from
freezing or crashing the tab during import.
Move MAX_PIXEL_COUNT check before UTIF.decodeImage/toRGBA8 so oversized
TIFFs are rejected before allocating the RGBA buffer.

Map .tif extension to image/tiff in Content_Types.xml generation to
avoid emitting the invalid MIME type image/tif.
UTIF.decode populates raw tag entries (t256/t257) but .width/.height
are only set after decodeImage. Read from raw tags so the pixel limit
guard works before the expensive decode step without rejecting valid
files.
- Use mimeTypeForExt mapping for .tif data URIs (image/tiff not image/tif)
- Remove unused size arg from convertTiffToPng call
- Add happy-path test asserting valid TIFF produces PNG data URI
- Remove unused Uint8Array/ArrayBufferView branches (only strings are passed)
- Add handleImageNode test verifying convertTiffToPng is called for .tif files
Adds a Playwright test that loads a minimal DOCX containing a TIFF image
and verifies the full pipeline: DocxZipper → convertTiffToPng → rendered PNG.
… constants

Replace unmaintained utif2 with actively maintained image-js/tiff for
TIFF decoding. Extract duplicated IMAGE_EXTS and MIME_TYPE_FOR_EXT
mappings in DocxZipper.js to module-level constants.
Use decode(buffer, { ignoreImageData: true }) to check dimensions
before allocating pixel data, preventing DoS from small compressed
TIFFs with huge dimensions. Normalize Uint16Array and Float32Array
pixel data to 8-bit for canvas compatibility.
@gpardhivvarma gpardhivvarma force-pushed the feat/tiff-image-support branch from ee33535 to 10d064e Compare March 2, 2026 06:33
@gpardhivvarma
Copy link
Contributor Author

@caio-pizzol

Copy link
Contributor

@caio-pizzol caio-pizzol left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@gpardhivvarma the ignoreImageData guard and bit-depth normalization both look correct — nice work.

main thing: the tiftiff MIME fix has no test yet, and it's the one place a regression could silently break export. left a few inline comments, nothing blocking.

…red dataUriToArrayBuffer helper

Address remaining PR review feedback: add tests for .tif → image/tiff MIME mapping
(import data URI and export Content_Types), TIFF conversion failure fallback alt text,
greyscale/grey+alpha/Uint16/Float32 toRGBA branches, and extract duplicate data-URI-stripping
logic from metafile-converter and tiff-converter into shared dataUriToArrayBuffer in helpers.js.
@gpardhivvarma
Copy link
Contributor Author

@caio-pizzol @harbournick Please mind taking a look.

@caio-pizzol
Copy link
Contributor

Closing in favor of a new PR from main repo branch — all original commits preserved. Thank you @gpardhivvarma for the implementation!

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Projects

None yet

Development

Successfully merging this pull request may close these issues.

Feature: Support TIFF images in DOCX rendering

2 participants