Have a SBOM for Node.js? #1115

marco-ippolito · 2023-09-20T13:20:30Z

I think it would be great to have a SBOM for the project now that we are working on the dependency build audit.
We should probably investigate how to achieve this, since we have different types of dependencies, and which format to use.

RafaelGSS · 2023-09-21T13:06:03Z

IIRC, besides the SLSA + SigStore work, I think @BethGriggs has also looked into SBOMs, right? Would you mind sharing your point of view?

BethGriggs · 2023-09-21T13:54:31Z

I had some initial thoughts, but didn't get too far. Some of them:

- GitHub produces a SBOM downloadable from Insights on the repository. This is incomplete and I believe only shows the more-easily discoverable npm and actions dependencies.
  - I know in security the stance is often that small mitigations are better than no mitigations. But in this case, I feel an incomplete SBOM is probably worse than no SBOM.
- Typical SBOM tools tend to assume you're building an application using a specific runtime/language. For example, they'll just traverse the node_modules and generate the SBOM from that. Our mixture of runtimes/languages used complicates things.
- Information that would be really useful to ship in an SBOM alongside our binaries is information of dependencies we build directly against, and where their source came from. This is so users can easily know the version of node they're using depends directly on dependency x from this source, and can feed it into tools that monitor their SBOMs, etc.
- maintaining-dependencies.md is a good start, but I came to the conclusion the SBOM should really be generated at build time. This is because some of our dependencies can be externalised or swapped out during the build step (for example, building against a system OpenSSL). What is in the deps directory in our sources may not match what is actually built.
- Another approach I thought about was just using the values that get built into process.versions. It could be a reasonable interim step. It felt a bit odd (even risky?) to rely on executing the software to determine what's being used in it. I feel doing it at the build stage would allow us to gather more detail (which source was used) and verification rather than just reporting versions.
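
For illustration only, the interim process.versions idea could look roughly like the sketch below. It assumes the version map has already been captured (for example with `node -p "JSON.stringify(process.versions)"`); the sample values are placeholders, the function name `versions_to_bom` is hypothetical, and the output uses only a handful of CycloneDX JSON fields. As noted above, this reports versions only and says nothing about which source was actually built.

```python
import json

# Placeholder sample of a process.versions-style map.
# The version numbers are illustrative, not real release data.
SAMPLE_VERSIONS = {
    "node": "20.0.0",
    "v8": "11.3.244.4",
    "zlib": "1.2.13",
    "openssl": "3.0.8",
}


def versions_to_bom(versions):
    """Build a minimal CycloneDX-style document from a name -> version map."""
    components = [
        {"type": "library", "name": name, "version": version}
        for name, version in sorted(versions.items())
        if name != "node"  # node is the subject of the BOM, not a dependency
    ]
    return {
        "bomFormat": "CycloneDX",
        "specVersion": "1.5",
        "version": 1,
        "metadata": {
            "component": {
                "type": "application",
                "name": "node",
                "version": versions.get("node", "unknown"),
            }
        },
        "components": components,
    }


bom = versions_to_bom(SAMPLE_VERSIONS)
print(json.dumps(bom, indent=2))
```

A real process.versions map also contains entries such as `modules` and `napi` that are ABI numbers rather than dependencies, so a filter list would be needed before this is useful.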

marco-ippolito · 2023-10-03T12:56:39Z

I see that CycloneDX is quite popular, should we give it a try? What kind of tool should we use?

mhdawson · 2023-10-03T18:57:04Z

+1 for CycloneDX

marco-ippolito · 2023-10-04T09:33:22Z

So I gave it a try on my machine and unfortunately my MacBook ran out of memory and crashed.
Since Node is a fairly large project, it's an expensive operation that falls into the case described by the documentation: https://github.com/CycloneDX/cdxgen/blob/master/ADVANCED.md#use-atom-in-java-mode

I was wondering if it would be possible to get access to a machine with 32 or 64 GB of RAM to run it.

mhdawson · 2023-10-04T14:30:29Z

I think the only machines we have that have that much memory might be:

- test-nearform_intel-ubuntu2204-x64-1
- test-nearform_intel-ubuntu2204-x64-2

I'd suggest you open an issue in the build repo to request access to one of those.

marco-ippolito · 2023-10-17T08:14:36Z

The ideal goal is to ship a SBOM for every executable we release, since every platform might have slightly different settings, tools, and dependencies (? I'm not sure this is true). I guess it should eventually be included at build time as a release step, @RafaelGSS.

It is also possible to generate the SBOM manually, starting from a CSV file, which might be easier and less computationally expensive but hard to maintain. I'm not a big fan of this idea.

We should also define an end goal for the project in terms of SBOM quality (see https://scvs.owasp.org/scvs/v2-software-bill-of-materials/). I'm assuming we start from the basics.

My idea is to start quickly with cdxgen (https://github.com/CycloneDX/cdxgen), which is a "generalistic" tool, and then refine and improve quality with further development and more specific tools.

richardlau · 2023-10-17T12:17:27Z

> So I gave it a try on my machine and unfortunately my MacBook ran out of memory and crashed. Since Node is a fairly large project, it's an expensive operation that falls into the case described by the documentation: https://github.com/CycloneDX/cdxgen/blob/master/ADVANCED.md#use-atom-in-java-mode
>
> I was wondering if it would be possible to get access to a machine with 32 or 64 GB of RAM to run it.

> The ideal goal is to ship a SBOM for every executable we release, since every platform might have slightly different settings, tools, and dependencies (? I'm not sure this is true). I guess it should eventually be included at build time as a release step,

Dependencies are the same for the platforms we currently release. Tooling (compilers, Python, etc.) does differ.

If CycloneDX requires that amount of RAM to run for Node.js it's not going to be realistic to run on every platform we release on. Most of the release machines have 4GB RAM (some have 2GB+swap and a small number have 8GB).

pombredanne · 2023-10-17T13:45:47Z

@marco-ippolito repasting my post from the CycloneDX chat:

cdxgen is a good start! For a large codebase like node.js, here are my extra 2 cents:

1. IMHO your problem is not so much npms or pypi that are easy to inventory because they have package manifests, but the rest of the C/C++ code and its deps that are vendored or not but have no manifest, like zlib, cares, and similar and their nested and bundled deps all the way down (like in V8).
2. You may document their origin and licenses in the codebase. I use small YAML files for this; you could use a small CycloneDX SBOM to the same effect. Conceptually something like this https://github.com/nodejs/node/blob/main/deps/zlib/README.chromium#L1 but improved to have proper Package URLs/purls. This will get you an explicit list that you can then have scanners collect in addition to the simpler npms or Python packages.
3. Or you might want to match against a reference index of C/C++ packages for these too, in which case you need a code matching tool and a reference DB. Or do a combo of 2. and 3., which is best IMHO. Then eventually you will need to craft and run a custom pipeline to assemble data from a few different tools and origins to get something that is tailored to node.js.
4. You may want to consider also analyzing the deployed (debug) binaries rather than the source code to craft an SBOM that is based on the subset of the sources effectively used. This is effectively what users and security teams will care for, not the (many) other development-only packages that are not deployed.
5. You really want to get proper Package URLs/purls in your CycloneDX output for this to be useful for downstream users when querying for vulnerabilities in modern databases. If you have a few CPEs, that will not hurt either!
6. This is a process. Do not expect to get any open source or commercial tool to get you the correct results out of the box. This will require tuning and a custom pipeline to automate all this. And the output of running this pipeline will require regular review for accuracy.

I have some experience in the domain and I may be able to help modestly.
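
To make the Package URL point concrete, here is a minimal sketch of turning small per-dependency metadata records into purls and CycloneDX-style components. The records, the upstream coordinates, and the helper name `make_purl` are illustrative assumptions, not the actual origins of the code vendored in Node.js; the formatting follows the general `pkg:type/namespace/name@version` shape and skips qualifiers and subpaths.

```python
from urllib.parse import quote


def make_purl(pkg_type, name, version, namespace=None):
    """Format a basic Package URL: pkg:type/namespace/name@version."""
    parts = [quote(pkg_type.lower(), safe="")]
    if namespace:
        # A namespace may have several segments; encode each one separately.
        parts.extend(quote(seg, safe="") for seg in namespace.split("/"))
    parts.append(quote(name, safe=""))
    return "pkg:" + "/".join(parts) + "@" + quote(version, safe="")


# Hypothetical hand-maintained records, one per vendored dependency.
VENDORED = [
    {"type": "github", "namespace": "madler", "name": "zlib", "version": "1.3"},
    {"type": "generic", "namespace": None, "name": "openssl", "version": "3.0.8"},
]

components = [
    {
        "type": "library",
        "name": rec["name"],
        "version": rec["version"],
        "purl": make_purl(rec["type"], rec["name"], rec["version"], rec["namespace"]),
    }
    for rec in VENDORED
]

for component in components:
    print(component["purl"])
```

Records like these could live next to each vendored folder and be collected by a scanner, which is the kind of explicit list point 2 describes.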

pombredanne · 2023-10-17T23:27:15Z

@BethGriggs re: #1115 (comment)

> I came to the conclusion the SBOM should really be generated at build time. This is because some of our dependencies can be externalised or swapped out during the build step (for example, building against a system OpenSSL). What is in the deps directory in our sources may not match what is actually built.

💯 ... if you can instrument your build to collect the subset of third-party code that you effectively include (and possibly external deps that may be expected at runtime), then this is IMHO the best possible case and something that I would always recommend.

marco-ippolito · 2023-11-01T06:50:59Z

@pombredanne so my idea to get started is:

1. run cdxgen for each package in the `/deps` folder for npm packages,
2. run cdxgen for tools and GitHub Actions,
3. document the origin and licenses of V8, OpenSSL, and the other C++ dependencies.

Could you suggest some tools or references for your points 3 and 4?
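
A rough sketch of what step 1 could look like as a script: it walks a deps-style directory, picks the subfolders that contain a package.json, and prepares one cdxgen invocation per folder. The helper name `plan_cdxgen_runs`, the output file naming, and the demo tree are assumptions for illustration; only cdxgen's `-t` (project type) and `-o` (output file) options are used, and the commands are printed rather than executed.

```python
import tempfile
from pathlib import Path


def plan_cdxgen_runs(deps_root):
    """Return (working_dir, argv) pairs for deps subfolders with a package.json."""
    runs = []
    for folder in sorted(Path(deps_root).iterdir()):
        if (folder / "package.json").is_file():
            argv = ["cdxgen", "-t", "js", "-o", f"{folder.name}.cdx.json"]
            runs.append((folder, argv))
    return runs


# Demo on a throwaway tree so the sketch runs without a Node.js checkout.
with tempfile.TemporaryDirectory() as tmp:
    deps = Path(tmp) / "deps"
    for name, is_npm in [("acorn", True), ("undici", True), ("zlib", False)]:
        (deps / name).mkdir(parents=True)
        if is_npm:
            (deps / name / "package.json").write_text("{}", encoding="utf-8")
    planned = [(cwd.name, argv) for cwd, argv in plan_cdxgen_runs(deps)]

# Print the commands that would be run from inside each folder.
for cwd_name, argv in planned:
    print(f"(cd deps/{cwd_name} && {' '.join(argv)})")
```

Folders without a package.json (the C/C++ dependencies) are skipped here on purpose, since they fall under step 3.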

prabhu · 2023-11-01T12:32:24Z

I will work on improving the performance of cdxgen/atom for C/C++ codebases. It has to be done regardless of whether node.js becomes a user or not. My initial focus will be on run time, reducing it to less than an hour for V8. Reducing the memory footprint to make it run in a CI agent for such large codebases is impossible, so it is not going to be my priority this year.

@marco-ippolito I like your proposal to generate individual SBOMs per folder in deps. CycloneDX supports linking SBOMs using BOM-Link under external references.

I have created this ticket to automate this process a bit. Once all the performance tickets are done, I am happy to share an example workflow with the right arguments needed to generate these.
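
As a sketch of the linking idea (not the promised workflow), the snippet below builds stub per-folder BOMs and a top-level BOM whose components point at them through external references of type `bom`, using the BOM-Link form `urn:cdx:<serial>/<version>`. The helper names and folder names are hypothetical, and the BOMs carry only the fields needed to show the link.

```python
import json
import uuid


def bom_link(serial_number, bom_version=1):
    """Turn a CycloneDX serialNumber (urn:uuid:...) into a BOM-Link URN."""
    prefix = "urn:uuid:"
    if not serial_number.startswith(prefix):
        raise ValueError("expected a urn:uuid: serial number")
    return f"urn:cdx:{serial_number[len(prefix):]}/{bom_version}"


def folder_bom(name):
    """Create a stub per-folder BOM with a fresh serial number."""
    return {
        "bomFormat": "CycloneDX",
        "specVersion": "1.5",
        "serialNumber": uuid.uuid4().urn,
        "version": 1,
        "metadata": {"component": {"type": "library", "name": name}},
    }


def link_boms(children):
    """Build a top-level BOM that references each child BOM via BOM-Link."""
    components = []
    for child in children:
        link = bom_link(child["serialNumber"], child["version"])
        components.append({
            "type": "library",
            "name": child["metadata"]["component"]["name"],
            "externalReferences": [{"type": "bom", "url": link}],
        })
    return {
        "bomFormat": "CycloneDX",
        "specVersion": "1.5",
        "serialNumber": uuid.uuid4().urn,
        "version": 1,
        "components": components,
    }


children = [folder_bom("acorn"), folder_bom("undici")]
top_level = link_boms(children)
print(json.dumps(top_level, indent=2))
```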

pombredanne · 2023-11-06T19:45:56Z

@marco-ippolito you wrote:

> so my idea to get started is :
>
> 1. run cdxgen for each package in `/deps` folder for npm packages,
> 2. run cdxgen for tools and github actions
> 3. document their origin and licenses for V8 and OpenSSL and other c++ dependencies
>
> Would you suggest some tools for your point 3 and 4? or some reference

I suggest you get something started first with your plan.

For 3 and 4 I have some bits that are work in progress at https://github.com/nexb/elf-inspector and https://github.com/nexB/purldb/ ... scancode-toolkit also has some code to collect metadata from the README.chromium files used to document the metadata.

BTW, are there debug builds with debug symbols available? (with DWARFs for Linux and macOS and a PDB for Windows)

prabhu · 2023-11-06T20:13:03Z

cdxgen 9.9.2 was released with the required improvements. I will share an example workflow that does both 1 and 2 (aiming for a single invocation). For 3, cdxgen currently supports the vcpkg.json format for sharing additional metadata. You can create this file within the various folders, and the information will be used in the generated SBOM. I will share some examples of this as well.
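
Until those examples are shared, here is a minimal sketch of generating such a vcpkg.json for one vendored folder. The field names (`name`, `version`, `homepage`, `license`) come from the vcpkg manifest format; which of them cdxgen actually reads is not stated in this thread, so treat that as an assumption. The helper name and the zlib values are placeholders, and the file is written to a temporary directory rather than into deps.

```python
import json
import tempfile
from pathlib import Path


def write_vcpkg_manifest(folder, name, version, homepage, license_id):
    """Write a small vcpkg.json describing one vendored dependency."""
    manifest = {
        "name": name,
        "version": version,
        "homepage": homepage,
        "license": license_id,  # SPDX license expression
    }
    path = Path(folder) / "vcpkg.json"
    path.write_text(json.dumps(manifest, indent=2) + "\n", encoding="utf-8")
    return path


# Demo: write the manifest into a throwaway folder and read it back.
with tempfile.TemporaryDirectory() as tmp:
    manifest_path = write_vcpkg_manifest(
        tmp, "zlib", "1.3", "https://zlib.net/", "Zlib"
    )
    written = json.loads(manifest_path.read_text(encoding="utf-8"))

print(json.dumps(written, indent=2))
```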

marco-ippolito · 2023-11-21T13:22:04Z

I'm wondering which installation method we should use on our machine (link to guide).

prabhu · 2023-11-21T14:09:15Z

> I'm wondering which installation method we should use on our machine (link to guide).

npm install with Java 21 must work. For CI, we can have a workflow that sets up the prereqs.

UlisesGascon · 2023-12-26T09:44:21Z

I was reading about the possibilities for using SBOMs with Docker images, and it seems this is possible using `docker sbom` or `docker buildx build --sbom=true -t <myorg>/<myimage> --push .`. This might be a good option for the Docker Official images. What do you think?

mrutkows · 2024-01-08T22:34:32Z

IMO, CycloneDX is the way to go (as it becomes an Ecma and hopefully an ISO standard with v1.6 due in Feb.) and will need to eventually have their specified ability to declare (quantum) crypto information and actual attestations as consumers are able to produce them.


