Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

synchronous terser.minify - execSync workaround #801

Closed
milahu opened this issue Aug 23, 2020 · 13 comments
Closed

synchronous terser.minify - execSync workaround #801

milahu opened this issue Aug 23, 2020 · 13 comments
Labels
good first issue This issue is good for contributors who are getting started

Comments

@milahu
Copy link

milahu commented Aug 23, 2020

im writing an AST transformer
and i need a synchronous version of minify
to minify small parts of the code

uglify-js is a sync function, but fails to parse ES6 code
so i made this "brutal but effective" workaround
to call terser.minify as a synchronous function

i put the async minify function in a separate node script
and call that script with child_process.execSync

// minify-es6-sync.js
// synchronous minify of es6 javascript
// license CC-0 + no warranty

if (false) { // sample use

  const child_process = require('child_process');
  const inn = '(()=>1)();';
  let out = '';
  try {
    out = child_process.execSync(
      "node minify-es6-sync.js", {
      input: inn,
      encoding: 'utf-8',
      //timeout: Infinity,
      maxBuffer: Infinity,
      windowsHide: true, // windows os
    }); 
  }
  catch (error) { // timeout, exit(1), exit(2)
    console.log('error in minify:\n'+error.stdout);
    throw new Error("minify failed");
  }
  // success
  console.log(out);
}

const terser = require('terser');

const terser_config = {
  sourceMap: false,
  ecma: 2020,
  compress: false,
  mangle: false,
  output: {
    beautify: false, // uglify
  },
};

const cl = 100; // context length for error printing

var inn = '';

process.stdin.on('readable', () => {
  let chunk;
  while (null !== (chunk = process.stdin.read())) {
    inn += chunk;
  }
});

process.stdin.on('end', async () => {
  try {
    // async terser.minify
    const minify_res = await terser.minify(inn, terser_config);

    if (!minify_res.error) {
      process.stdout.write(minify_res.code); // no newline
      process.stdout.on('drain', ()=>{
        // write is done
        process.exit(0); // success
      });
    }

    console.log('error:'); console.dir(minify_res.error);
    const ep = minify_res.error.pos;
    console.log('parse error context +-'+cl+': '+inn.substring(ep-cl, ep+cl));
    console.log('parse error raised at: '+inn.substring(ep, ep+cl));
    process.stdout.on('drain', ()=>{
      // write is done
      process.exit(1); // error 1
    });
  }
  catch (error) {
    console.log('error:'); console.dir(error);
    process.stdout.on('drain', ()=>{
      // write is done
      process.exit(2); // error 2
    });
  }
});

the need for a sync minify is probably too rare
to motivate "someone" to write a terser.minifySync function
so i leave this workaround here for public amusement : D

edit: bugfix for data over 250 KByte (write is async, we must wait for drain event)

milahu added a commit to milahu/random that referenced this issue Aug 23, 2020
@fabiosantoscode
Copy link
Collaborator

This reminds me of unholy magic I did to perform HTTP requests synchronously back in the day :)

Today this package was added as a note to the readme. It's also a hack but it's implemented at the binary level.

@milahu
Copy link
Author

milahu commented Sep 3, 2020

nice!

my subprocess solution is also used by the sync-rpc library

// sync-rpc/lib/index.js
const spawnSync = require('child_process').spawnSync;
// ....
    return spawnSync(process.execPath, [], {
      input: src,
      windowsHide: true,
      maxBuffer: Infinity,
    });

.... but the sync-rpc interface is prettier

// terser-minify-sync-rpc-worker.js
const terser = require('terser');

module.exports = function init(connection) {
  //return async function minifySync(files, options) {
  return async function minifySync([files, options]) {
    try {
      return await terser.minify(files, options);
    }
    catch (error) {
      return {error};
    }
  };
}
// app.js
// npm i sync-rpc
const syncRpc = require('sync-rpc');
const minifySync = syncRpc(
  __dirname+'/terser-minify-sync-rpc-worker.js');

const input = 'console.log("hello");';
const terser_config = {
  sourceMap: false, ecma: 2020,
  compress: false, mangle: false,
  output: { beautify: false }, // uglify
};
let res = {};

try {
  //res = minifySync(input, terser_config);
  res = minifySync([input, terser_config]);
}
catch (error) {
  console.log('error'); console.dir(error);
}

console.log('success'); console.dir(res);

edit: there is a bug in sync-rpc:
only the first argument is passed to the function
so terser_config is ignored
workaround: pass array

@milahu
Copy link
Author

milahu commented Sep 3, 2020

Today this package was added as a note to the readme. It's also a hack but it's implemented at the binary level.

nice try, but deasync does not work with terser.minify

deasync has no support for await-ing promises
author says:

I am ok for the ts but reluctant to add promise and await because they conceptually conflict with deasync.

when i do this, like in abbr/deasync/pull/99 ....

// not working

let keepWaiting = true;
let minifyRes = null;

terser.minify(minifyInput, minifyConfig)
.then(
  (value) => {
    // this code is never reached
    keepWaiting = false;
    minifyRes = value;
    console.log('resolved: '+value.code);
  },
  (reason) => {
    // this code is never reached
    keepWaiting = false;
    console.log('rejected: '+reason);
  },
);
console.log('promise started');

// wait for promise
// this will loop forever
deasync.loopWhile(() => {
  return keepWaiting;
});

// this code is never reached
console.log('promise done');
console.log('minifyRes:');
console.dir(minifyRes);

.... then my promise is never finished = stays pending forever

working solutions:

@milahu milahu mentioned this issue Sep 5, 2020
@louh
Copy link

louh commented Sep 6, 2020

Just wanted to leave one use case of synchronous minify.

In the 11ty static site generator if you had a synchronous minify you could write a filter in any template language to minify JS.

The only template language that supports asynchronous filters is Nunjucks and that's exactly what they did in this tutorial.

However, Nunjucks has a limitation where you cannot do anything asynchronous inside a macro. So being able to reuse code will be broken.

@fabiosantoscode
Copy link
Collaborator

@milahu those are interesting alternatives. I'm going to change the readme before more people bump into this issue.

Also I'm aware that terser being async breaks many use cases but it was a necessity because of source-map.

@milahu
Copy link
Author

milahu commented Nov 10, 2023

it was a necessity because of source-map.

not really...
with @jridgewell/source-map everything is sync

why the need to support mozilla/source-map?
because "its faster with WASM"?
at least the sourcemap-encoder would be about 2x faster

@contrast/synchronous-source-maps

The Mozilla official source-map library moved to being asynchronous after its 0.60 release in order to support WASM source maps.

mozilla/source-map#331 (comment)

Fetching and compiling WebAssembly is inherently async.

We could add a sync API to create a SourceMapConsumer given that you've already fetched and compiled the wasm.

mozilla/source-map#331 (comment)

A better API is for the SourceMapConsumer constructor to be synchronous, and for it to throw an error if the user has not first initialized the wasm module in environments that require async initialization.

so if people really want to use terser.minify with a WASM-sourcemap-encoder
then they should use a better implementation than mozilla/source-map
with an async init function to load the WASM module
and with a sync SourceMapConsumer constructor

import { minify } from "terser-with-wasm-sourcemap-codec";
import { initWasmSourcemapCodec } from "wasm-sourcemap-codec";

async function main() {
  // only this is async
  await initWasmSourcemapCodec();
  // everything else is sync
  var code = "function add(first, second) { return first + second; }";
  var result = minify(code, { sourceMap: true });
  console.log(result.code);  // minified output: function add(n,d){return n+d}
  console.log(result.map);  // source map
}

main();
diff: synchronous function minify
commit 3c2bf3381d5683b5c00d8e7b97f6b222c9846462
Author: Milan Hauth <milahu@gmail.com>
Date:   Fri Nov 10 12:43:47 2023 +0100

    synchronous function minify

diff --git a/lib/minify.js b/lib/minify.js
index 6b40569..fa64f9d 100644
--- a/lib/minify.js
+++ b/lib/minify.js
@@ -103,7 +103,7 @@ function log_input(files, options, fs, debug_folder) {
     fs.writeFileSync(log_path, "Options: \n" + options_str + "\n\nInput files:\n\n" + files_str(files) + "\n");
 }
 
-async function minify(files, options, _fs_module) {
+function minify(files, options, _fs_module) {
     if (
         _fs_module
         && typeof process === "object"
@@ -314,7 +314,7 @@ async function minify(files, options, _fs_module) {
             if (options.sourceMap.includeSources && files instanceof AST_Toplevel) {
                 throw new Error("original source content unavailable");
             }
-            format_options.source_map = await SourceMap({
+            format_options.source_map = SourceMap({
                 file: options.sourceMap.filename,
                 orig: options.sourceMap.content,
                 root: options.sourceMap.root,
diff --git a/lib/sourcemap.js b/lib/sourcemap.js
index f376ccc..f486eb4 100644
--- a/lib/sourcemap.js
+++ b/lib/sourcemap.js
@@ -47,7 +47,7 @@ import {SourceMapConsumer, SourceMapGenerator} from "@jridgewell/source-map";
 import {defaults, HOP} from "./utils/index.js";
 
 // a small wrapper around source-map and @jridgewell/source-map
-async function SourceMap(options) {
+function SourceMap(options) {
     options = defaults(options, {
         file : null,
         root : null,
@@ -67,10 +67,16 @@ async function SourceMap(options) {
         sourcesContent[name] = files[name];
     }
     if (options.orig) {
-        // We support both @jridgewell/source-map (which has a sync
-        // SourceMapConsumer) and source-map (which has an async
-        // SourceMapConsumer).
-        orig_map = await new SourceMapConsumer(options.orig);
+        // We support only @jridgewell/source-map
+        // (which has a sync SourceMapConsumer).
+        // We do not support source-map
+        // (which has an async SourceMapConsumer).
+        // If you want to use a WASM-based SourceMapConsumer
+        // then please find an implementation
+        // with an async init function to load the WASM module
+        // and a sync SourceMapConsumer constructor.
+        // https://github.com/mozilla/source-map/issues/331#issuecomment-883044938
+        orig_map = new SourceMapConsumer(options.orig);
         if (orig_map.sourcesContent) {
             orig_map.sources.forEach(function(source, i) {
                 var content = orig_map.sourcesContent[i];
source-map-support should be in devDependencies

package.json

  "dependencies": {
    "@jridgewell/source-map": "^0.3.3",
    "acorn": "^8.8.2",
    "commander": "^2.20.0",
    "source-map-support": "~0.5.20"
  },

source-map-support is used only in test/compress.js
so it should be in devDependencies

@fabiosantoscode
Copy link
Collaborator

Indeed after migrating to @jridgewell/source-map the situation changed.

But after making all the direct and indirect dependencies of Terser change to async, I think it would be pretty bad to make them change back yet again.

But since that migration has been made, the case for a non-breaking addition of minify_sync (that would refuse to use an async sourcemap library, and any future async features) is really strong. I'm reopening.

@fabiosantoscode fabiosantoscode added the good first issue This issue is good for contributors who are getting started label Nov 10, 2023
@fabiosantoscode
Copy link
Collaborator

after making all the direct and indirect dependencies of Terser change to async

For what it's worth, I'm sorry about this. At the time there was a flaw with the Mozilla sourcemap library that required this breaking upgrade, and I couldn't know the future, but in the end it was proven to be a waste of time for a lot of people, for which I feel responsible.

@milahu
Copy link
Author

milahu commented Nov 10, 2023

after making all the direct and indirect dependencies of Terser change to async, I think it would be pretty bad to make them change back yet again.

migrating from async to sync is trivial, because await is a noop on sync functions

> (function f() { return 1; })();
1

> await (function f() { return 1; })();
1

after making all the direct and indirect dependencies of Terser change to async

no one is forced to use terser
everyone has the freedom to fork terser and remove the async code
everyone has the freedom to wrap the async code (sync-rpc, etc)
(similar situation with mozilla/source-map)

At the time there was a flaw with the Mozilla sourcemap library that required this breaking upgrade

maybe mozilla/source-map#308 and mozilla/source-map#370

a non-breaking addition of minify_sync (that would refuse to use an async sourcemap library, and any future async features)

sounds like much complexity for little gain

what "future async features" would that be?
async makes sense for tasks with low CPU load, like network transfers
in the case of mozilla/source-map,
the async part is really just the initialization of the WASM module ("await import")

i would "break" the API, increment the major version, and call it a day ; )

@fabiosantoscode
Copy link
Collaborator

You can't call .then on the return of a sync function!

So depending on the integration, it can break.

Increasing the major could be a good option too. I'll think about it.

@milahu
Copy link
Author

milahu commented Dec 28, 2023

thanks : )

interesting solution to mix sync and async code

        if (yielded && typeof yielded.then === "function") {
            throw new Error("minify_sync cannot be used with the legacy source-map module");
        }

@fabiosantoscode
Copy link
Collaborator

I can't decide if it's cursed or elegant

@milahu
Copy link
Author

milahu commented Dec 28, 2023

your solution is more generic than replacing async/await with Promise/then

in this case it makes no difference, because there is only one "maybe_promise" in terser

const child_process = require("child_process");

const sleep = ms => new Promise(r => setTimeout(r, ms));

async function inner_async() {
    console.log('wait... ' + new Date);
    await sleep(1000);
    console.log('ok... ' + new Date);
    return 42;
}

function inner_sync() {
    console.log('wait... ' + new Date);
    child_process.execSync("sleep 1");
    console.log('ok... ' + new Date);
    return 42;
}

const is_promise = x => (typeof x?.then == "function");

function main_inner(result) {
    return result + 1;
}

// inner can be sync or async function
function main_sync_async(inner) {
    const maybe_promise = inner();
    if (!is_promise(maybe_promise)) {
        // sync: return value
        return main_inner(maybe_promise);
    }
    // async: return promise of value
    return new Promise((resolve, reject) => {
        maybe_promise.then(
            result => resolve(main_inner(result)),
            error => reject(error),
        );
    });
}

async function main() {
    console.log("main_sync_async(inner_async) ...")
    var result = await main_sync_async(inner_async);
    console.log("main_sync_async(inner_async) done:", result)

    console.log("main_sync_async(inner_sync) ...")
    var result = main_sync_async(inner_sync);
    console.log("main_sync_async(inner_sync) done:", result)
}
main();

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
good first issue This issue is good for contributors who are getting started
Projects
None yet
Development

No branches or pull requests

3 participants