[ BUG ] Sitemap generation is very slow if I have a large list of URLs (problem in streamToPromise) #307

Closed
fr1sk opened this issue May 26, 2020 · 3 comments · Fixed by #308

fr1sk commented May 26, 2020

Describe the bug
I have around 20,000 URLs that should go into the sitemap. I am using an example from your readme, just without createGzip. Here is what it looks like:

  async generateSitemapXML(data: SitemapUrlObject[]): Promise<Buffer> {
    return new Promise((resolve, reject) => {
      const smStream = new SitemapStream({
        hostname: process.env.FRONTEND_HOST
      });
      // Queue every URL entry on the sitemap stream.
      data.forEach(urlObject => {
        smStream.write(urlObject);
      });

      smStream.end();
      streamToPromise(smStream)
        .then(resolve)
        .catch(reject);
    });
  }

With around 2,000 to 3,000 URLs it worked normally, but once I added more it became unacceptably slow. I investigated which part was causing the issue and found that the problem is in the streamToPromise function. I then replaced your streamToPromise with the stream-to-promise package, and everything was much faster.

Here is an example. Compare the response times: same data, only the streamToPromise implementation differs:

[image: response time using integrated streamToPromise]

[image: response time using third-party streamToPromise]

If you think this is the problem, I would be glad to submit a PR and replace the existing streamToPromise :)
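
For readers wondering how a stream-collecting helper can become a bottleneck at all, here is a minimal sketch using only Node's built-in `stream` module. The thread does not state the root cause, so this is an assumption about one common way such helpers slow down, not a description of the library's code or of the eventual fix. `collectStream` and `collectStreamQuadratic` are hypothetical names. The first keeps chunks in an array and concatenates once at the end; the second re-concatenates on every chunk, which copies the accumulated data again each time.

```typescript
import { Readable } from "node:stream";

// Hypothetical helper (not the sitemap library's implementation):
// stores each chunk in an array and concatenates exactly once at the end.
function collectStream(stream: Readable): Promise<Buffer> {
  return new Promise<Buffer>((resolve, reject) => {
    const chunks: Buffer[] = [];
    stream.on("data", (chunk: Buffer | string) => {
      chunks.push(typeof chunk === "string" ? Buffer.from(chunk) : chunk);
    });
    stream.on("error", reject);
    stream.on("end", () => resolve(Buffer.concat(chunks)));
  });
}

// Slower variant for comparison (also hypothetical): every chunk triggers a
// fresh allocation and a copy of everything collected so far, so the total
// amount of copying grows roughly quadratically with the number of chunks.
function collectStreamQuadratic(stream: Readable): Promise<Buffer> {
  return new Promise<Buffer>((resolve, reject) => {
    let acc = Buffer.alloc(0);
    stream.on("data", (chunk: Buffer | string) => {
      const next = typeof chunk === "string" ? Buffer.from(chunk) : chunk;
      acc = Buffer.concat([acc, next]);
    });
    stream.on("error", reject);
    stream.on("end", () => resolve(acc));
  });
}
```

Both helpers resolve to the same bytes; only the amount of copying differs. With a few thousand small chunks the difference may be hard to notice, which would be consistent with the slowdown only showing up on larger URL lists.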

Expected behavior
This should not happen; streamToPromise is a bottleneck for some reason.

Context:

  • Library Version 6.1.4
  • TypeScript Version 3.7.5
  • Node Version 12.13.0

Additional context
I am using the Nest framework.

@derduher
Collaborator

@fr1sk That's odd. I'll try to reproduce it using the performance tests in the repo. I'd be happy if you opened a PR. I'd also love to know what's causing the slowdown.

@derduher
Collaborator

Oof, yeah, this is 1000x as slow on my machine.

@derduher
Collaborator

OK, I think I've got it fixed:

========= streamToPromise =============
median: 943.0±46.4ms
99th percentile: 1056.5ms
median: 307.0±42.5mb
99th percentile: 307.0mb

========= stream =============
median: 789.0±43.6ms
99th percentile: 998.5ms
median: 79.0±2.0mb
99th percentile: 79.0mb
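
For anyone who wants to produce comparable timing figures for their own code, here is a rough sketch of a timing harness. It is not the repository's performance test: `median`, `percentile`, and `timeRuns` are hypothetical helpers, the percentile uses the simple nearest-rank method (the numbers above may have been computed differently), and memory is not measured. All three assume a non-empty list of samples.

```typescript
import { performance } from "node:perf_hooks";

// Median of a non-empty list of samples.
function median(samples: number[]): number {
  const sorted = [...samples].sort((a, b) => a - b);
  const mid = Math.floor(sorted.length / 2);
  return sorted.length % 2 === 0
    ? (sorted[mid - 1] + sorted[mid]) / 2
    : sorted[mid];
}

// Nearest-rank percentile (p between 0 and 100) of a non-empty list.
function percentile(samples: number[], p: number): number {
  const sorted = [...samples].sort((a, b) => a - b);
  const rank = Math.ceil((p * sorted.length) / 100);
  return sorted[Math.max(0, rank - 1)];
}

// Runs an async task `runs` times in sequence and returns each duration in ms.
async function timeRuns(
  task: () => Promise<unknown>,
  runs: number
): Promise<number[]> {
  const timings: number[] = [];
  for (let i = 0; i < runs; i++) {
    const start = performance.now();
    await task();
    timings.push(performance.now() - start);
  }
  return timings;
}
```

A typical use would be to pass a function that builds the sitemap from a fixed list of URLs to `timeRuns`, then feed the returned timings to `median` and `percentile(timings, 99)`.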
