Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Enhancement: take advantage of libvips 8.15.0 keep metadata feature #3824

Closed
3 tasks done
uhthomas opened this issue Oct 11, 2023 · 10 comments
Closed
3 tasks done

Enhancement: take advantage of libvips 8.15.0 keep metadata feature #3824

uhthomas opened this issue Oct 11, 2023 · 10 comments

Comments

@uhthomas
Copy link
Contributor

Possible bug

Is this a possible bug in a feature of sharp, unrelated to installation?

  • Running npm install sharp completes without error.
  • Running node -e "require('sharp')" completes without error.

If you cannot confirm both of these, please open an installation issue instead.

Are you using the latest version of sharp?

  • I am using the latest version of sharp as reported by npm view sharp dist-tags.latest.

If you cannot confirm this, please upgrade to the latest version and try again before opening an issue.

If you are using another package which depends on a version of sharp that is not the latest, please open an issue against that package instead.

What is the output of running npx envinfo --binaries --system --npmPackages=sharp --npmGlobalPackages=sharp?

What are the steps to reproduce?

We're generating thumbnails without metadata for security, but we want to apply an ICC profile to avoid dull colours. It does not seem possible to do this without first creating a thumbnail without any metadata, and then applying the ICC profile.

What is the expected behaviour?

Apply an ICC profile and strip all other metadata in one operation.

Please provide a minimal, standalone code sample, without other dependencies, that demonstrates this problem

immich-app/immich#4438

Please provide sample image(s) that help explain this problem

@lovell
Copy link
Owner

lovell commented Oct 11, 2023

The forthcoming libvips 8.15.x will invert the remove vs keep metadata logic via libvips/libvips#3585 and, as a result, I'm planning to revisit and rework the current withMetadata() API in sharp as part of integrating it.

I've not had the chance to create an issue for this yet, so if it's OK with you, let's use this to track it.

Note to self: see also #2910

@lovell lovell added this to the v0.33.0 milestone Oct 11, 2023
@etnoy
Copy link

etnoy commented Oct 19, 2023

Seems like libvips 8.15 is getting closer, rc1 was just released

@adriaanmeuris
Copy link
Contributor

I'm sharing my use case as it may be relevant to upcoming changes in withMetadata:

  1. we convert 2 jpg files to CMYK colorspace with sharp, using withMetadata in combination with icc: 'U.S. Web Coated (SWOP) v2'. The resulting files each contain the ICC profile in the metadata, adding about 500kb to each file:
    await sharp('input.jpg')
      .toColourspace('cmyk')
      .withMetadata({ icc: 'U.S. Web Coated (SWOP) v2.icc' })
      .toFile('output.jpg');
  2. we generate a PDF file containing the U.S. Web Coated (SWOP) v2 profile as a separate ICCBased ColorSpace stream
  3. we embed the JPEG images as XObjects with a DCTDecode filter, and link them to the separate ICCBased color space stream created in step 2.

This works fine, but the U.S. Web Coated (SWOP) v2 is embedded 3 times in the resulting PDF document, due to the profile being part of each image buffer in step 3.

As the PDF spec allows to separate ICC profiles from actual image data, we'd like to strip the metadata from the resulting JPG files in step 1 as it would greatly reduce the resulting file size of the PDF.

We currently do some postprocessing using exiftool, which does not re-encode the image data and simply strips out the profile we already have available as a separate stream in the PDF document:
exiftool -all= -icc_profile:all= -overwrite_original output.jpg

This greatly reduces our resulting PDF filesize without impacting image quality.

Based on the feedback provided in #3572, we can also achieve this with Sharp:

await sharp('output.jpg', { ignoreIcc: true })
  .pipelineColourspace('cmyk')
  .toColourspace('cmyk')
  .toFile('output-stripped.jpg');

However, compared to our implementation with exiftool, this requires a second decode/encode roundtrip.

If we're able to run step 1 without preserving the ICC profile in the resulting file's metadata, we can skip postprocessing these files as we have the profile already available as a separate stream in the PDF document.

@lovell
Copy link
Owner

lovell commented Nov 13, 2023

@adriaanmeuris I don't think the changes being discussed here are going to help with your PDF-specific and much less common scenario.

@lovell
Copy link
Owner

lovell commented Nov 13, 2023

I've given some thought to the possible API for this this and I believe it makes sense to separate the new "keep" logic into multiple functions, hopefully making the partial/complete replacement of metadata more explicit.

// API available from 0.33.0

keepExif()
withExif({ ... })
withExifMerge({ ... })

// API available from 0.33.0

keepIccProfile()
withIccProfile(filename: string | "srgb" | "p3")

// API available from 0.33.0

keepMetadata()

// API available from 0.33.0

The existing withMetadata() would then have its exif and icc properties deprecated, and their use would result in a call to the underlying withExifMerge() and withIccProfile() respectively.

This means the non-deprecated use of withMetadata() would reduce in scope to:

withMetadata({
  density: ... ,
  orientation: ...
})

Other things:

  • Calling withMetadata() without any parameters would be backwards compatible and the equivalent to keeping everything apart from icc, which would be set to "srgb".
  • The default behaviour, when none of these functions are used, remains the same and is the equivalent to keep=none.

@lovell lovell changed the title Cannot apply ICC profile without preserving all metadata Enhancement: take advantage of libvips 8.15.0 keep metadata feature Nov 13, 2023
@adriaanmeuris
Copy link
Contributor

@adriaanmeuris I don't think the changes being discussed here are going to help with your PDF-specific and much less common scenario.

Agree it's a much less common scenario, but stripping the profile is the default behaviour when converting without withMetadata:

The default behaviour, when withMetadata is not used, is to convert to the device-independent sRGB colour space and strip all metadata, including the removal of any ICC profile.

Which is perfectly fine, since web browsers assume that images without an ICC profile are in sRGB colourspace anyway - so stripping it saves several kb per image.

Being able to not embed the ICC profile in the resulting file when converting to a custom CMYK output profile would allow us to achieve the same thing, even though we don't target web browsers. With the keep flag available in all savers as of libvips 8.15, this is possible:

vips icc_transform input_cmyk_fogra39_with_icc.jpg "output_cmyk_swop_stripped.jpg[keep=none]" "U.S. Web Coated (SWOP) v2.icc"

Looking at your API proposal, would the following code match the icc_transform above?

await sharp('input_cmyk_fogra39_with_icc.jpg')
  .toColourspace('cmyk')
  .withIccProfile('U.S. Web Coated (SWOP) v2.icc')
  .keepMetadata(false)
  .toFile('output_cmyk_swop_stripped.jpg');

@lovell
Copy link
Owner

lovell commented Nov 15, 2023

@adriaanmeuris The proposed keepMetadata() function will be the equivalent to libvips keep=all and will not accept a boolean.

lovell added a commit that referenced this issue Nov 15, 2023
Add new withX and keepX functions that take advantage of
libvips 8.15.0 new 'keep' metadata feature.
lovell added a commit that referenced this issue Nov 16, 2023
Add new withX and keepX functions that take advantage of
libvips 8.15.0 new 'keep' metadata feature.
lovell added a commit that referenced this issue Nov 16, 2023
Add new withX and keepX functions that take advantage of
libvips 8.15.0 new 'keep' metadata feature.
@lovell
Copy link
Owner

lovell commented Nov 16, 2023

The change was a little bigger than I expected so I've opened a PR for this for anyone interested to take a look at #3856 - I'll probably merge this in the next day or two if there are no problems discovered.

@adriaanmeuris
Copy link
Contributor

@adriaanmeuris The proposed keepMetadata() function will be the equivalent to libvips keep=all and will not accept a boolean.

Apologies, and thank you for pointing this out.

I'll take a look at the PR #3856 and get back with any thoughts on it.

lovell added a commit that referenced this issue Nov 19, 2023
Add new withX and keepX functions that take advantage of
libvips 8.15.0 new 'keep' metadata feature.
lovell added a commit that referenced this issue Nov 21, 2023
Add new withX and keepX functions that take advantage of
libvips 8.15.0 new 'keep' metadata feature.
lovell added a commit that referenced this issue Nov 21, 2023
Add new withX and keepX functions that take advantage of
libvips 8.15.0 new 'keep' metadata feature.
@lovell
Copy link
Owner

lovell commented Nov 29, 2023

v0.33.0 is now available with the API outlined in #3824 (comment)

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

No branches or pull requests

4 participants