Brainstorm ideas on how to improve "Principles for Package Repository Security" from CISA OSS Summit #40

Open
david-a-wheeler opened this issue Mar 5, 2024 · 2 comments


@david-a-wheeler

Thank you SO MUCH for your work on the "Principles for Package Repository Security".

Today several of us brainstormed about ways to possibly improve it, as part of the CISA OSS Summit. Below are notes on our ideas. These are not gospel; they are notes from a brief discussion. Still, I hope that they'll be useful.

--- David A. Wheeler

=======

We answer this as consumers of package repos, as the individuals in this group aren’t maintainers of one.

Many repos, e.g., CRAN, would probably like to comply with many of these, but they have no resources to do so. Need to find a way to resource them to implement these changes, as well as funding to improve sustainment.
A related issue: All of the repo groups are struggling to prioritize this work against the other work they need to do.

Is there staff to support this? Most repos don’t have 10 people on staff, which is often the number needed to support major infrastructure services, especially to maintain MFA and keep things operational.

What about namespacing? That seems to be missing here. Attacks on namespaces are a big issue. The biggest problems are typosquatting & dependency confusion, which need to be countered.
A big problem is visibility. Repos don’t know how many people are affected by certain attacks.
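As a rough illustration of one typosquatting countermeasure, a repo could compare a newly requested package name against its popular names and flag near matches for review. This is a minimal sketch, not how any particular repo does it: the normalization rule, the `max_distance` threshold, and the function names are all assumptions made for the example.

```python
def edit_distance(a: str, b: str) -> int:
    """Levenshtein distance (insertions, deletions, substitutions)."""
    prev = list(range(len(b) + 1))
    for i, ca in enumerate(a, start=1):
        curr = [i]
        for j, cb in enumerate(b, start=1):
            cost = 0 if ca == cb else 1
            curr.append(min(prev[j] + 1,          # delete ca
                            curr[j - 1] + 1,      # insert cb
                            prev[j - 1] + cost))  # substitute ca -> cb
        prev = curr
    return prev[-1]


def normalize(name: str) -> str:
    # Assumed rule: case-insensitive, and '-', '_', '.' are interchangeable.
    return name.lower().replace("_", "-").replace(".", "-")


def lookalikes(candidate: str, popular: list[str], max_distance: int = 1) -> list[str]:
    """Return the popular names a new candidate name could be confused with."""
    c = normalize(candidate)
    hits = []
    for name in popular:
        n = normalize(name)
        if c == n and candidate != name:
            hits.append(name)  # differs only by case or separator
        elif c != n and edit_distance(c, n) <= max_distance:
            hits.append(name)  # within a small number of typos
    return hits
```

For example, `lookalikes("requsts", ["requests", "numpy"])` flags `requests`. Plain Levenshtein distance counts a swapped pair of letters as two edits, so a real check would likely need a larger threshold or a transposition-aware distance.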

These requirements are too specific. It would be better if it first identified the general goal (what are you trying to accomplish). Talk about outcomes first, then use these as examples of how to achieve the outcome. The SSDF is an example of this more general approach. Maybe even investigate making this a “projection” of the SSDF.

Need to find a way to triage across the space.

EFF did a great job on the TLS “Wall of Shame” to measure things. It’s important to measure how well package repos implement various requirements. Create a “Scorecard” for a common understanding of how repos are doing. We really don’t want to shame people; the goal is to help people understand the situation.
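One way to picture such a scorecard is as a checklist grouped by topic, reported per group instead of as a single level. The sketch below is only illustrative: the group names and checklist items are made up for the example and are not taken from the principles document.

```python
from collections import Counter

# Hypothetical checklist: (group, item) -> implemented?
checklist = {
    ("authentication", "MFA available to maintainers"): True,
    ("authentication", "MFA required for critical packages"): False,
    ("namespace", "typosquatting detection"): True,
    ("resiliency", "documented CDN fallback"): False,
}


def summarize(items: dict[tuple[str, str], bool]) -> dict[str, str]:
    """Report 'done/total' per group rather than one overall number."""
    total = Counter(group for group, _ in items)
    done = Counter(group for (group, _), ok in items.items() if ok)
    return {group: f"{done[group]}/{total[group]}" for group in total}
```

Calling `summarize(checklist)` gives a per-group view that a repo could publish to show both current functionality and what is still open.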

Organizations trust CLI tools to do things, e.g., bring in software without being vulnerable to dependency confusion or typosquatting. Needs to be easy to say “don’t download THIS”.
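A sketch of what “don’t download THIS” could look like as a client-side policy: names under an organization's internal prefix may only come from the internal index (which counters dependency confusion), and an explicit blocklist refuses specific packages. The class, the `"internal"`/`"public"` source labels, and the prefix convention are assumptions for illustration, not features of any existing CLI.

```python
from dataclasses import dataclass, field


@dataclass
class ResolverPolicy:
    # Hypothetical policy a CLI could load from organization config.
    internal_prefixes: tuple[str, ...] = ()
    blocked: set[str] = field(default_factory=set)

    def allowed_source(self, package: str, source: str) -> bool:
        """Decide whether `package` may be fetched from `source`."""
        if package in self.blocked:
            return False  # explicitly refused, whatever the source
        if package.startswith(self.internal_prefixes):
            # Internal names must never resolve from the public repo.
            return source == "internal"
        return True
```

With `ResolverPolicy(internal_prefixes=("acme-",), blocked={"evil-pkg"})`, a request for `acme-billing` from the public repo is refused even if someone has published a package with that name there.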

In many package repos, you can’t confirm that the pre-built packages are related to the claimed source code, making it easy to insert malicious code. Mechanisms like “building separately” (Go, Homebrew) or verifying reproducible builds could eliminate one vulnerability.
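The reproducible-build check can be sketched as a digest comparison: rebuild the package independently from the claimed source and compare it, bit for bit, with what the repo serves. This assumes the build is deterministic; the function names are illustrative only.

```python
import hashlib


def sha256_hex(data: bytes) -> str:
    """Hex-encoded SHA-256 digest of an artifact's bytes."""
    return hashlib.sha256(data).hexdigest()


def matches_rebuild(published: bytes, rebuilt: bytes) -> bool:
    """True when an independent rebuild is bit-for-bit identical
    to the artifact the repository actually serves."""
    return sha256_hex(published) == sha256_hex(rebuilt)
```

A mismatch does not prove malice (timestamps and build paths often leak into artifacts), but a match is strong evidence that the pre-built package corresponds to the source.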

Maven & npm are commercial. PHP is muddled. Everyone else is non-commercial & has little in the way of resources.

Need Resources:
Money
Skilled people
Community.

Connecting with other people who are working on similar problems is very helpful.

Publishers of individual projects and consumers of projects will push back. There’s a community of users who don’t want changes. Having a third party that publishes “what you should do” is really helpful; we need to get to a point where these principles are refined further & widely agreed on.

It needs to be EASY for publishers & consumers. This means the repos will need to do work.

Needs to be added: operational resiliency.
CDN risks aren’t identified here! Many depend on Fastly & one-year contracts; if that contract goes away, the service disappears. (Fastly is moving to 5-year contracts, thankfully!)
What do you do with private keys & how are they shared?

There’s the repo itself. But there’s also the security of the build process to generate the CLI & repo components.

There should be a secure-by-default manner for packages. Make it easy to do the safe thing. E.g., you’d have to work to NOT pin, you’d have to work to have dependency confusion, etc.
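To make “you’d have to work to NOT pin” concrete, a tool could flag any dependency line that is not pinned to an exact version. This is a deliberately simplified sketch that assumes pip-style requirement lines and treats anything without `==` as unpinned; it does not handle hashes, URLs, or environment markers.

```python
def unpinned(requirement_lines: list[str]) -> list[str]:
    """Return requirement lines that do not pin an exact version."""
    flagged = []
    for raw in requirement_lines:
        line = raw.split("#", 1)[0].strip()  # drop comments and whitespace
        if not line:
            continue  # blank or comment-only line
        if "==" not in line:
            flagged.append(line)  # a range or a bare name: not pinned
    return flagged
```

For instance, `unpinned(["requests==2.31.0", "flask>=2.0", "numpy"])` reports `flask>=2.0` and `numpy`. A secure-by-default tool would fail on these unless the user explicitly opts out.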

Need to secure the build & operational infrastructure of the package repo. None of these points matter if the build or operational infrastructure is insecure or not resilient. The repo is only as secure as its least secure component. The current principles list focuses on security functionality.

Perhaps report “average libyears” in CLI, track at repo.
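For reference, a libyear is commonly computed as the time between the release date of the version in use and the release date of the latest version. A minimal sketch of the average a CLI might report follows, assuming release dates are available from repo metadata; the function names and input shape are made up for the example.

```python
from datetime import date


def libyears(used_release: date, latest_release: date) -> float:
    """Years between the release in use and the latest release."""
    days = (latest_release - used_release).days
    return max(days, 0) / 365.25  # never negative if already on latest


def average_libyears(deps: dict[str, tuple[date, date]]) -> float:
    """Average libyears over {name: (used_release, latest_release)}."""
    if not deps:
        return 0.0
    return sum(libyears(used, latest) for used, latest in deps.values()) / len(deps)
```

A repo could aggregate the same figure across its packages to track how far behind the ecosystem is running.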

We’d encourage every ecosystem to learn how their community feels about these changes before rolling them out. A Platonic ideal isn’t enough, and priorities might vary.

@di
Member

di commented Mar 7, 2024

Here are my notes from the summit:

  • The current principles doc is missing a technical spec for why/how to implement something: what does a certain thing get you, how should it be done?
  • Similarly, there is some vagueness around things like typosquatting: what qualifies as a typosquatting mitigation? What shouldn't be done to mitigate this?
  • There was some interest in a less level-based document and more of a checkboxes/scorecard for how a repository aligns with the recommendations
  • Absent from the doc is a discussion of metadata: specifically the verification and protection (e.g., verifying the upstream source repository, verifying the homepage of a project, verifying the email address publicly associated with the project)
  • Absent from the doc is a focus on usability over time: immutability of artifacts on the index, and the availability of lock files to help ensure something installed today doesn't break or change 1 year later
  • Regarding RBAC: there are no suggestions or guidance on organizational control (rather than individual user accounts)
  • There's no discussion on identifying/mitigating single points of failure: For example: Fastly as a CDN for many repositories, GitHub as an authz/authn provider for crates.io, overall resiliency and availability
  • At first glance, the levels seem progressive, but they can be mixed and matched, and actually correspond to resources/impact for a given ecosystem
  • Nothing about the presence of roadmaps: the highest priority items might exist but might not be immediately obvious externally (e.g. to funders)
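The point above about immutability and lock files can be sketched as two cooperating pieces: an index that refuses to overwrite a published version, and a lock file that records each artifact's digest so a later install can detect any change. The class and function names here are hypothetical, and the index is an in-memory stand-in.

```python
import hashlib


class ImmutableIndex:
    """In-memory stand-in for an index that never replaces an artifact."""

    def __init__(self) -> None:
        self._digests: dict[tuple[str, str], str] = {}

    def publish(self, name: str, version: str, content: bytes) -> str:
        key = (name, version)
        if key in self._digests:
            raise ValueError(f"{name} {version} already published")
        digest = hashlib.sha256(content).hexdigest()
        self._digests[key] = digest
        return digest


def verify_against_lock(lock: dict[tuple[str, str], str],
                        name: str, version: str, content: bytes) -> bool:
    """Check a downloaded artifact against the digest in the lock file."""
    expected = lock.get((name, version))
    return expected is not None and hashlib.sha256(content).hexdigest() == expected
```

If the lock file stores the digest returned by `publish`, an install a year later either gets the same bytes or fails loudly.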

@carols10cents

Here are some notes I took that overlap with some topics already mentioned:

  • When some of us were trying to score our own registries on the levels in v0.1, there often wasn't a really clear "level 1" or "level 2" -- it was often "mostly level 1 but one item from level 3" or "level 2 but missing one item from level 1". Having the levels be numbers gives the impression that the levels strictly build on each other, and also makes it seem like it should be possible to get one number out at the end but reality is a bit messier :)
  • Perhaps rather than numbered levels, the groupings of "lower maturity" -> "higher maturity" items could be colors or something, and each bullet point within each level could be its own checkbox. Or at least have a more visual, scannable "scorecard" type display of the information that each package registry could use to communicate current functionality and future priorities. In other words, it would be great to have a simpler view of the items in these recommendations.
  • Similarly, there were often details that seemed like too much information for this view, but would still be really useful for knowledge sharing and implementation, such as justifications, technical requirements, and example implementations
  • If the package repository security principles are a self-assessment, how can third parties verify the assessment?
  • A feature that might be useful in package repositories is, for some library A with an older vulnerable version and a newer, patched version, have a way to view which other libraries' latest versions are still depending on the old version of library A
  • When someone releases experimental code on a package repository, is there a way they can signal they DON'T think people should depend on it? Especially from a trusted party like a large company where usually their libraries are very trustworthy, how can they communicate when a library ISN'T ready for production use?
  • When there are mirrors of package repositories, how do the mirrors find out in a timely manner that the upstream package repository has taken some action on a particular package version, such as taking it down because it contained malware?
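The “who still depends on the vulnerable version of library A” view mentioned above could be sketched as a query over the dependencies of each package's latest release. To keep the example short it assumes dependencies are recorded as exact versions; real repos would need to evaluate version ranges. The data shape and function name are assumptions.

```python
def still_depending_on(latest_deps: dict[str, dict[str, str]],
                       library: str, vulnerable: set[str]) -> list[str]:
    """List packages whose latest release depends on a vulnerable
    version of `library`.

    latest_deps maps package name -> {dependency name: exact version}
    for that package's latest release only.
    """
    return sorted(
        pkg for pkg, deps in latest_deps.items()
        if deps.get(library) in vulnerable  # None (no dependency) never matches
    )
```

For example, with `{"web": {"liba": "1.0"}, "cli": {"liba": "1.2"}}` and vulnerable versions `{"1.0", "1.1"}`, only `web` is reported.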


3 participants