Skip to content

Commit

Permalink
fix: should use new runtime to reconsider skipped connections
Browse files Browse the repository at this point in the history
  • Loading branch information
ahabhgk committed Feb 9, 2024
1 parent 51f0f0a commit e094bf3
Showing 1 changed file with 49 additions and 16 deletions.
65 changes: 49 additions & 16 deletions lib/buildChunkGraph.js
Original file line number Diff line number Diff line change
Expand Up @@ -44,7 +44,7 @@ const { getEntryRuntime, mergeRuntime } = require("./util/runtime");
* @property {boolean | undefined} minAvailableModulesOwned true, if minAvailableModules is owned and can be modified
* @property {ModuleSetPlus[]} availableModulesToBeMerged enqueued updates to the minimal set of available modules
* @property {Set<Module>=} skippedItems modules that were skipped because module is already available in parent chunks (need to reconsider when minAvailableModules is shrinking)
* @property {Set<[Module, ConnectionState]>=} skippedModuleConnections referenced modules that where skipped because they were not active in this runtime
* @property {Set<[Module, ModuleGraphConnection[]]>=} skippedModuleConnections referenced modules that where skipped because they were not active in this runtime
* @property {ModuleSetPlus | undefined} resultingAvailableModules set of modules available including modules from this chunk group
* @property {Set<ChunkGroupInfo> | undefined} children set of children chunk groups, that will be revisited when availableModules shrink
* @property {Set<ChunkGroupInfo> | undefined} availableSources set of chunk groups that are the source for minAvailableModules
Expand Down Expand Up @@ -73,6 +73,25 @@ const bySetSize = (a, b) => {
return b.size + b.plus.size - a.size - a.plus.size;
};

/**
* @param {ModuleGraphConnection[]} connections list of connections
* @param {RuntimeSpec} runtime for which runtime
* @returns {ConnectionState} connection state
*/
const getActiveStateOfConnections = (connections, runtime) => {
let merged = connections[0].getActiveState(runtime);
if (merged === true) return true;
for (let i = 1; i < connections.length; i++) {
const c = connections[i];
merged = ModuleGraphConnection.addConnectionStates(
merged,
c.getActiveState(runtime)
);
if (merged === true) return true;
}
return merged;
};

const extractBlockModules = (module, moduleGraph, runtime, blockModulesMap) => {
let blockCache;
let modules;
Expand All @@ -99,9 +118,6 @@ const extractBlockModules = (module, moduleGraph, runtime, blockModulesMap) => {
if (!m) continue;
// We skip weak connections
if (connection.weak) continue;
const state = connection.getActiveState(runtime);
// We skip inactive connections
if (state === false) continue;

const block = moduleGraph.getParentBlock(d);
let index = moduleGraph.getParentBlockIndex(d);
Expand All @@ -115,48 +131,55 @@ const extractBlockModules = (module, moduleGraph, runtime, blockModulesMap) => {
modules = blockModulesMap.get((blockCache = block));
}

const i = index << 2;
const i = index * 3;
modules[i] = m;
modules[i + 1] = state;
modules[i + 1] = connection.getActiveState(runtime);
modules[i + 2] = connection;
}

for (const modules of arrays) {
if (modules.length === 0) continue;
let indexMap;
let length = 0;
outer: for (let j = 0; j < modules.length; j += 2) {
outer: for (let j = 0; j < modules.length; j += 3) {
const m = modules[j];
if (m === undefined) continue;
const state = modules[j + 1];
const connection = modules[j + 2];
if (indexMap === undefined) {
let i = 0;
for (; i < length; i += 2) {
for (; i < length; i += 3) {
if (modules[i] === m) {
const merged = modules[i + 1];
modules[i + 2].push(connection);
if (merged === true) continue outer;
modules[i + 1] = ModuleGraphConnection.addConnectionStates(
merged,
state
);
continue outer;

This comment has been minimized.

Copy link
@jdehaan

jdehaan Feb 28, 2024

Why this change of control flow?

Is this possibly causing (#18119 (comment)) some chunks to be omitted from the federation entry file? Unfortunately I have no clue what exactly is happening here, but formally this is a control flow change not aligned with the other changes which are purely injecting an additional 3rd info on the stack.

This comment has been minimized.

Copy link
@alexander-akait

alexander-akait Feb 28, 2024

Member

@jdehaan Can you provide a reproducible test repo?

This comment has been minimized.

Copy link
@jdehaan

jdehaan Feb 28, 2024

Unfortunately not really timely... We have a very huge federation setup with also cycles (that's why we used the game changer feature in the first place). I would need to go through hours of try and retest to see if the problem still occurs. I cannot afford that time at the moment. I went through the versions of webpack. 5.90.1 works ok but starting with 5.90.2 it started to fail but in a corner situation (> 95% of the application just worked well)

This comment has been minimized.

Copy link
@alexander-akait

alexander-akait Feb 28, 2024

Member

If you remove this line locally everything works fine?

This comment has been minimized.

Copy link
@jdehaan

jdehaan Feb 28, 2024

Thanks for that hint. I tried out to remove that line ona webpack 5.90.3 state and guess what. Nothing changed, the same transpiled code was generated byte to byte identical (with and without the line). So this is definitively a wrong place. Still definitively there is something that broke partly our app between 5.90.1 and 5.90.2. Do you have any idea which other spots I could try to revert alike this one to track down the issue?

This comment has been minimized.

Copy link
@jdehaan

jdehaan Feb 28, 2024

v5.90.1...v5.90.2#diff-090914f1f98fe81b6aea1dbd55d78274087c8a7a2c5ae6d8cd9788c227a91e88

I went through the most interesting diffs (non test) and what makes my app work fully again is to revert the changes to lib/util/numberHash.js back to the 5.90.1 state (only that file)

PS: it is enough to revert fnv1a32 only to fix my issue

Am I the victim of a hash collision?

This comment has been minimized.

Copy link
@alexander-akait

alexander-akait Feb 28, 2024

Member

Intresting... Very intresting... Sounds like yes, can you log numberHash before and after?

/cc @dmichon-msft

This comment has been minimized.

Copy link
@jdehaan

jdehaan Feb 28, 2024

I'm working on it... It's a bit unclear to me what is important to log, I found a place with salt ident and the resullting id in lib/ids/IdHelpers.js

This comment has been minimized.

Copy link
@jdehaan

jdehaan Feb 28, 2024

broken.md.gz
OK.md.gz

I instrumented lib/ids(IdHelper.js with

	for (const item of items) {
		const ident = getName(item);
		let id;
    let s = salt
		let i = salt;
		do {
      s = i
			id = numberHash(ident + i++, range);
      console.log(`numberHash: ${i},${range},${id},"${ident}"`);
		} while (!assignId(item, id));
	}

To get the information with grep/sort/remove duplicates I got a little bit more condensed lists of ids in the good and broken case. I tried to identify problematic points but honestly speaking I cannot really see clearly what is the issue.

image

At the problematic not resolved chunk i is and e=681:
image

The index 681 is not set inside that array which leads to the crash. Probably the wrong table is being looked at due to a collision, but I am totally lost on what to look at to find the root cause. I hope the output helps you in identifying a possible cause!

This comment has been minimized.

Copy link
@dmichon-msft

dmichon-msft Feb 28, 2024

Contributor

Everything that uses numberHash is expected to expressly handle hash collisions, so this would be a bug being surfaced by changing where collisions happen.

This comment has been minimized.

Copy link
@dmichon-msft

dmichon-msft Feb 28, 2024

Contributor

Are hash collisions in module ids handled uniformly on both sides of the module federation compilation boundary?

This comment has been minimized.

Copy link
@jdehaan

jdehaan Feb 29, 2024

Oh dear... Now I understand why there was a while loop with trying to assign the id to the item being tackled. I should probably repeat the logging but inside the assignId function on success...

	for (const item of items) {
    console.log('item:', item);
		const ident = getName(item);
		let id;
    let s = salt
		let i = salt;
		let b = false;
    do {
      s = i
			id = numberHash(ident + i++, range);
      b = assignId(item, id);
      console.log(`numberHash: ${s},${range},${id},${b},"${ident}"`);
		} while (!b);
	}

I instrumented the code like this now yielding a very big file (gz compressed 5MB) and dumped the whole console output of the full build. I get the item, each attempt to assign the id to the item along with the salt, ident, range and resulting id and if the assignId succeeded or not. I tried to follow what is going on and where it went wrong but I fear I'm lost. If you do feel this can help you to find the cause? then I can transmit you the file, I would refrain to publish it here because of its size and that it's for everybody else more like garbage...

This comment has been minimized.

Copy link
@jdehaan

jdehaan Feb 29, 2024

Are hash collisions in module ids handled uniformly on both sides of the module federation compilation boundary?

I do not know how to exactly check this... I currently do not understand the underlying techniques well enough I fear.

This comment has been minimized.

Copy link
@dmichon-msft

dmichon-msft Feb 29, 2024

Contributor

What you would be looking for in that data dump would be consecutive lines where the value of ident is the same; those will be the hash collisions. Thing is, due to the size of the space, encountering some hash collisions is expected when running at scale.

Since you are missing module 681, what I'd be inclined to do is try to identify what source module 681 corresponds to, then look for that module in the chunks emitted from each participant in the module federation to find out if it is being assigned a different ID in some cases. For ease of module identification, turning off minification should at least make the output readable.

Another way to test if this is hash collision based would be to change the value of the salt in the webpack configuration and see if it still repros, and especially if it affects the same module.

}
}
modules[length] = m;
length++;
modules[length] = state;
length++;
modules[length] = [connection];
length++;
if (length > 30) {
// To avoid worse case performance, we will use an index map for
// linear cost access, which allows to maintain O(n) complexity
// while keeping allocations down to a minimum
indexMap = new Map();
for (let i = 0; i < length; i += 2) {
for (let i = 0; i < length; i += 3) {
indexMap.set(modules[i], i + 1);
}
}
} else {
const idx = indexMap.get(m);
if (idx !== undefined) {
const merged = modules[idx];
modules[idx + 1].push(connection);
if (merged === true) continue outer;
modules[idx] = ModuleGraphConnection.addConnectionStates(
merged,
Expand All @@ -168,6 +191,8 @@ const extractBlockModules = (module, moduleGraph, runtime, blockModulesMap) => {
modules[length] = state;
indexMap.set(m, length);
length++;
modules[length] = [connection];
length++;
}
}
}
Expand Down Expand Up @@ -207,7 +232,7 @@ const visitModules = (
*
* @param {DependenciesBlock} block block
* @param {RuntimeSpec} runtime runtime
* @returns {(Module | ConnectionState)[]} block modules in flatten tuples
* @returns {(Module | ConnectionState | ModuleGraphConnection[])[]} block modules in flatten tuples
*/
const getBlockModules = (block, runtime) => {
if (blockModulesMapRuntime !== runtime) {
Expand Down Expand Up @@ -382,7 +407,7 @@ const visitModules = (
/** @type {QueueItem[]} */
let queueDelayed = [];

/** @type {[Module, ConnectionState][]} */
/** @type {[Module, ModuleGraphConnection[]][]} */
const skipConnectionBuffer = [];
/** @type {Module[]} */
const skipBuffer = [];
Expand Down Expand Up @@ -582,7 +607,7 @@ const visitModules = (
const { minAvailableModules } = chunkGroupInfo;
// Buffer items because order need to be reversed to get indices correct
// Traverse all referenced modules
for (let i = 0; i < blockModules.length; i += 2) {
for (let i = 0; i < blockModules.length; i += 3) {
const refModule = /** @type {Module} */ (blockModules[i]);
if (chunkGraph.isModuleInChunk(refModule, chunk)) {
// skip early if already connected
Expand All @@ -592,7 +617,11 @@ const visitModules = (
blockModules[i + 1]
);
if (activeState !== true) {
skipConnectionBuffer.push([refModule, activeState]);
const connections = /** @type {ModuleGraphConnection[]} */ (
blockModules[i + 2]
);
skipConnectionBuffer.push([refModule, connections]);
// We skip inactive connections
if (activeState === false) continue;
}
if (
Expand Down Expand Up @@ -666,7 +695,7 @@ const visitModules = (

if (blockModules !== undefined) {
// Traverse all referenced modules
for (let i = 0; i < blockModules.length; i += 2) {
for (let i = 0; i < blockModules.length; i += 3) {
const refModule = /** @type {Module} */ (blockModules[i]);
const activeState = /** @type {ConnectionState} */ (
blockModules[i + 1]
Expand Down Expand Up @@ -1172,7 +1201,11 @@ const visitModules = (
/** @type {ModuleSetPlus} */
(info.minAvailableModules);
for (const entry of info.skippedModuleConnections) {
const [module, activeState] = entry;
const [module, connections] = entry;
const activeState = getActiveStateOfConnections(
connections,
info.runtime
);
if (activeState === false) continue;
if (activeState === true) {
info.skippedModuleConnections.delete(entry);
Expand Down Expand Up @@ -1286,7 +1319,7 @@ const visitModules = (
return;
}

for (let i = 0; i < blockModules.length; i += 2) {
for (let i = 0; i < blockModules.length; i += 3) {
const refModule = /** @type {Module} */ (blockModules[i]);
const activeState = /** @type {ConnectionState} */ (
blockModules[i + 1]
Expand Down

0 comments on commit e094bf3

Please sign in to comment.