[Bug]: Upgrading to latest version of puppeteer causing extra blank pages. #12442

Open
1 of 2 tasks
Mominadar opened this issue May 15, 2024 · 2 comments
Labels
bug disable-analyzer Disables the automatic workflow that tries to reproduce bug reports invalid needs-feedback not-reproducible

Mominadar commented May 15, 2024

Minimal, reproducible example

const chromium = require("@sparticuz/chromium");
const PDF_PARSE = require("pdf-parse");
const imgToPDF = require('image-to-pdf');
const puppeteer = require("puppeteer-core");
const pdf2pic = require("pdf2pic");
const AWS = require("aws-sdk");
const AXIOS = require("axios");

module.exports.handleResponse = async (event, context) => {
  let browser = null;
  try {
    // TODO: pull out browser initialization from loop
    browser = await puppeteer.launch({
      headless: true,
      ignoreHTTPSErrors: true,
      timeout: 9000,
      args: chromium.args,
      defaultViewport: chromium.defaultViewport,
      executablePath: await chromium.executablePath(), 
    });
    
    // event can have multiple records
    for (let record of event.Records) {
     
      const body = JSON.parse(record.body);
      // NOTE: html, language, and config are used below but are not defined in this snippet.
      try {
        let message = null;
        let page = null;
        
        page = await browser.newPage();
        
        await page.setContent(html, { waitUntil: 'networkidle0' });
       
        let pdf = await page.pdf(config.getConfig(language));
        
        console.log("Language: " + language);
        const pdfInfo = await PDF_PARSE(pdf); //numpages
        if (pdfInfo.numpages > 4) { // each page is followed by a blank page, so to allow 2 pages check for 4
          message = "generated pdf has more than 2 pages";
        } else {
          console.log("flattening image");
          try {
            
            const convert = pdf2pic.fromBuffer(pdf, {
              saveFilename: "untitled",
              savePath: "/tmp",
              width: 596 * 4, // number in px
              height: 842 * 4, // number in px
              density: 100,
            });
            const pngPages1 = await convert(1, { responseType: "base64" });
            const pngPages3 = await convert(3, { responseType: "base64" });
           
            const pages = [
                "data:image/png;base64,"+pngPages1.base64,
                "data:image/png;base64,"+pngPages3.base64
            ]
 
            pdf = imgToPDF(pages, imgToPDF.sizes.A4);
          }
          catch (err) {
            console.log(err);
            message = "failure in flattening pdf";
          }
        }
        if (page !== null) {
          await page.close();
        }
      } catch (error) {
        console.log("error processing record: ", record, error);
      }
    } // for loop ends
  } catch (error) {
    console.log("error in function body: ", error);
    return context.fail(error);
  } finally {
    if (browser !== null) {
      await browser.close();
      console.log("browser closed");
    }
  }
};

Error string

No error, but instead of 2 pages I get 4.

Bug behavior

  • Flaky
  • PDF

Background

I am using a Lambda function to generate a PDF and convert it to an image. The function previously ran on Node 16 and still works fine with that version, but I needed to upgrade to Node 20 because AWS is removing support for 16, and I upgraded the dependencies accordingly. Everything still runs, but the PDF is now parsed as 4 pages long when it really has 2: each page is followed by a blank page. I worked around this by using only pages 1 and 3, but the generated PDF still shows the page numbers as 1/4 and 3/4.
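
The workaround described above (keep pages 1 and 3, skip the blank ones) can be sketched as a small helper. This is only an illustration: `contentPageNumbers` is a hypothetical name, and it assumes that every content page is followed by exactly one blank page, which is what this report observes but may not hold in general.

```javascript
// Hypothetical helper (not part of puppeteer, pdf-parse, or pdf2pic).
// Assumption: every content page is followed by exactly one blank page,
// so the content sits on the odd, 1-based page numbers.
function contentPageNumbers(totalPages) {
  const pages = [];
  for (let n = 1; n <= totalPages; n += 2) {
    pages.push(n);
  }
  return pages;
}

// With the 4-page PDF from this report, the pages to keep are 1 and 3.
console.log(contentPageNumbers(4));
```

In the handler, the result could drive the `convert(n, { responseType: "base64" })` calls instead of hard-coding pages 1 and 3. It does not change page labels such as 1/4 and 3/4 that are already rendered into the PDF.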

The pdf-parse package also reports 4 pages when checking, so the issue is upstream of it, in the PDF generation.

Working fine with puppeteer version 17.1.3, chromium 106.0.2, and Node 16.
Not working with puppeteer 22.6.0, chromium 123.0.1, and Node 20.

Expectation

Generate 2 page pdf without any blank pages.

Reality

I get 4 pages: the 2 actual pages, each followed by a blank page.

Puppeteer configuration file (if used)

No response

Puppeteer version

22.6.0

Node version

20.13.1

Package manager

npm

Package manager version

8

Operating system

Windows


github-actions bot commented May 15, 2024

This issue has an outdated Puppeteer version: 22.6.0. Please verify your issue on the latest 22.8.2 version. Then update the form accordingly.



@github-actions github-actions bot added invalid and removed invalid labels May 15, 2024
Collaborator

OrKoN commented May 15, 2024

Please try to reproduce the issue without @sparticuz/chromium using the bundled browser and please provide a standalone reproducible script without Lambda including the test page.
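
A standalone script of the kind requested here might look roughly like the sketch below. It is a starting point under stated assumptions, not a confirmed reproduction: `renderPdf` and `main` are hypothetical names, the HTML string is a placeholder for the real test page, and the `{ format: "A4" }` options stand in for the `config.getConfig(language)` value that the report does not show. It uses the `puppeteer` package (bundled browser) instead of `puppeteer-core` with `@sparticuz/chromium`, and no Lambda handler.

```javascript
// Hypothetical helper: renders HTML to a PDF using an already launched browser.
// The pdfOptions default is an assumption; the report's real options are not shown.
async function renderPdf(browser, html, pdfOptions = { format: "A4" }) {
  const page = await browser.newPage();
  try {
    await page.setContent(html, { waitUntil: "networkidle0" });
    return await page.pdf(pdfOptions);
  } finally {
    // Close the page even if rendering throws.
    await page.close();
  }
}

async function main() {
  // Required here so renderPdf can be exercised without these packages installed.
  const puppeteer = require("puppeteer"); // bundled browser, not puppeteer-core
  const pdfParse = require("pdf-parse");

  // Placeholder: replace with the HTML that produces the extra blank pages.
  const html = "<html><body><h1>Test page</h1></body></html>";

  const browser = await puppeteer.launch();
  try {
    const pdf = await renderPdf(browser, html);
    const info = await pdfParse(Buffer.from(pdf));
    console.log("pages:", info.numpages);
  } finally {
    await browser.close();
  }
}

// main().catch(console.error); // uncomment to run the repro
```

Running the same script against both puppeteer versions named in the report, with the real HTML and PDF options, would show whether the page count changes without `@sparticuz/chromium` or Lambda involved.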

@OrKoN OrKoN added needs-feedback not-reproducible disable-analyzer Disables the automatic workflow that tries to reproduce bug reports labels May 15, 2024