Public Key Infrastructure for Rust Project #3579

Status: Open. Wants to merge 4 commits into base: master
35 changes: 22 additions & 13 deletions text/3579-pki.md
@@ -52,18 +52,20 @@ Utilizing our PKI and certificate hierarchy, end users (manually or utilizing `r
# Guide-level explanation
[guide-level-explanation]: #guide-level-explanation

We propose leveraging [AWS CloudHSM Quorum Authentication](https://docs.aws.amazon.com/cloudhsm/latest/userguide/quorum-authentication.html) for the management of the `Root Key`. See [Alternatives#Quorum] for out-of-band quorum models which were considered.
We propose leveraging [AWS CloudHSM Quorum Authentication](https://docs.aws.amazon.com/cloudhsm/latest/userguide/quorum-authentication.html) for the management of the `Root Key`. See [Alternative Quorum Tools][alternative-quorum-tools] for other quorum models which were considered.

All operations using the root key or subkeys will be logged transparently, and will make use of Certificate Transparency / Binary Transparency infrastructure to ensure that no surreptitious signing operations can take place.

## Keys and Trust

**Key Algorithm:** ECDSA secp384r1
**Key Algorithm:** The root key will use secp384r1, since this is supported by CloudHSM and a variety of other HSMs. Delegate keys may use other algorithms (e.g. ed25519) as appropriate and specified in the proposal for a delegate key. We should re-evaluate the algorithm strength each time we rotate the root key.

**Storage:** PKCS #11 Certificates Stored in CloudHSM
**Storage:** PKCS #11 Certificates Stored in CloudHSM. These are backed up in-cloud every 24 hours. In the event complete loss of key material occurs, it shall be treated as a full out-of-cycle rotation.

**Expiration:** Root and Top-Level Delegate keys shall follow a `7 year expiration` schedule, except for the first keys of this proposal, which shall have an expiration date of the expected release date of the Rust 2030 edition plus 1 year.

Expiration allows us to re-evaluate algorithm choice and algorithm strength, archive transparency logs, and have a well-tested path for root key replacement.
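
The expiration schedule above can be sketched as simple date arithmetic. This is a minimal illustration, not part of the RFC: the edition release date below is a hypothetical placeholder (the RFC only refers to the "expected release date"), and the function names are invented for this example. It also assumes each replacement key is issued on the day its predecessor expires.

```python
from datetime import date

ROTATION_YEARS = 7  # from the RFC text: a 7 year expiration schedule


def add_years(d: date, years: int) -> date:
    """Return the same calendar day `years` later, mapping Feb 29 to Feb 28."""
    try:
        return d.replace(year=d.year + years)
    except ValueError:
        # Feb 29 does not exist in a non-leap target year.
        return d.replace(year=d.year + years, day=28)


def first_key_expiration(edition_release: date) -> date:
    """First keys expire one year after the expected Rust 2030 edition release."""
    return add_years(edition_release, 1)


def next_expiration(issued: date) -> date:
    """Later root and top-level delegate keys expire 7 years after issuance."""
    return add_years(issued, ROTATION_YEARS)


# Hypothetical placeholder: the RFC does not fix the 2030 edition release date.
assumed_edition_release = date(2030, 10, 1)
first_expiry = first_key_expiration(assumed_edition_release)
# Assumption: the replacement key is issued when the first key expires.
second_expiry = next_expiration(first_expiry)
```
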
Review comment (Member):

All of these can be equally achieved by adding new roots and removing old ones on some cadence. In other words, embedding the time into the root isn't actually required for them IMO: we (to some extent) control clients and regularly ship new clients.


### First-Issue Expiration Plan

We shall schedule expiration on the key to match the `Rust Edition Cycle`, so any key expiration will also fall within the time frame of a new Rust edition. This cadence is chosen to ease the rotation, release, and dissemination of a new `Root Key` by aligning it with an edition and thus with new releases of `cargo` and associated tooling.
bjorn3 marked this conversation as resolved.
@@ -114,17 +116,18 @@ These individuals should be available in the event quorum is needed for root key

### Delegate Keys
joshtriplett marked this conversation as resolved.

- Infra Key
Key for infrastructure team to utilize for server operations. The Infra team may create further delegate keys as necessary; we expect this key to be used for server-side and client-side certificates and other infrastructure-related uses.
Delegate keys shall be used for purposes within the realms of their designated responsibility and team(s). It shall be up to the individual implementors using these keys to make sure to verify their trust up to the Rust root or to the appropriate delegate key for their usage. For example, it shall be an implementation detail for the crates.io and cargo teams on whether to verify signatures up to the Rust root key, or to their specific delegate keys for other purposes. This is done to allow for the greatest flexibility of users of the keys.

Delegate keys will use appropriate mechanisms within the key format to specify the purposes the key can be used for, and verifiers relying on delegate keys should then verify those purposes.
Review comment (Member):

Can the mechanism be described, for people not deeply familiar with how it usually works in a X.509 CA?

Review comment:

Yeah, this section should specify what the EKU (Extended Key Usage) OIDs are for each of the usages.

Replace "appropriate mechanisms within the key format" with "an EKU ([Extended Key Usage][rfc5280])" and note that the correct EKU must be present in every intermediate and end-entity certificate but not necessarily the root.

Review comment:

As an operational note from someone who implemented X.509 validation: handling of EKUs and similar policy mechanisms (like policy constraints) is not super consistent between different implementations -- the "right" thing to do in many cases is to over-index and support only what the Web PKI requires, which may not be compatible with the extensions/profile this PKI defines.

(Many implementations also make it hard to define these kinds of profile requirements, e.g. "only leaf has EKU" and "everything has EKU" are both common but "everything except the trust anchor has EKU" will involve rejecting chains after they've already been validated.)
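
The profile discussed in this thread ("everything except the trust anchor has EKU") can be sketched as a post-validation check over a chain. This is an illustrative model only: the `Cert` type and `chain_allows` function are invented for the example, the OIDs are placeholders from the `2.999` example arc (the RFC does not assign EKU OIDs), and no signatures are verified here. A real verifier would first validate the chain cryptographically with an X.509 library and then apply a check like this one.

```python
from dataclasses import dataclass, field
from typing import FrozenSet, List

# Placeholder OIDs under the 2.999 example arc; the RFC does not assign real ones.
EKU_RELEASE_SIGNING = "2.999.1.1"
EKU_INDEX_SIGNING = "2.999.1.2"


@dataclass(frozen=True)
class Cert:
    """Toy stand-in for a certificate: names and EKUs only, no signature."""
    subject: str
    issuer: str
    ekus: FrozenSet[str] = field(default_factory=frozenset)


def chain_allows(chain: List[Cert], required_eku: str) -> bool:
    """Check an already-validated chain, ordered leaf first, trust anchor last.

    Every certificate except the trust anchor must carry `required_eku`.
    """
    if len(chain) < 2:
        return False
    # Each certificate must name the next one in the chain as its issuer.
    for child, parent in zip(chain, chain[1:]):
        if child.issuer != parent.subject:
            return False
    # The trust anchor (last element) is exempt from the EKU requirement.
    return all(required_eku in cert.ekus for cert in chain[:-1])


root = Cert("Rust Root", "Rust Root")
release = Cert("Release Key", "Rust Root", frozenset({EKU_RELEASE_SIGNING}))
leaf = Cert("nightly signer", "Release Key", frozenset({EKU_RELEASE_SIGNING}))
wrong_purpose = Cert("index signer", "Release Key", frozenset({EKU_INDEX_SIGNING}))

ok = chain_allows([leaf, release, root], EKU_RELEASE_SIGNING)
rejected = chain_allows([wrong_purpose, release, root], EKU_RELEASE_SIGNING)
```

As the comment above notes, a check of this shape runs after path validation, so chains are rejected only once they have already been built and verified.
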


- Release Key
Used to sign Rust nightly and stable releases. rustup can verify that downloaded releases are signed by a key with this role that chains up to the root, and verify that the key and the release signature are recorded via Certificate Transparency / Binary Transparency authenticated logging mechanisms. Another subkey of this may be used for code signing.

- crates.io Key
Top-level key for the crates.io ecosystem and project. This will be an authoritative key for signing indexes, and potentially signing crate packages directly. (For logistics reasons, crates.io may choose to issue subkeys under this key to perform signing.) Note that even if crates.io signs package files, it still needs to sign indexes for verification as well, because cargo gets some information from the index rather than the crate.

- BORS Key
Key for bors to sign all git commits it makes. This allows people to use mirrors of our git repositories and verify the authenticity of those mirrors. We will document a procedure to download and update Rust git repositories from a mirror, with verification, while working with the original for issues and pull requests.
- bors Key
Mark-Simulacrum marked this conversation as resolved.
bors will use per-repository subkeys of this key to sign all git commits it makes. This allows people to use mirrors of our git repositories and verify the authenticity of those mirrors. We will document a procedure to download and update Rust git repositories from a mirror, with verification, while working with the original for issues and pull requests. This delegate key shall be used to generate further subkeys to be used on a per-repository basis; the delegate key itself will not be used to sign git commits. See [using-github-bot-sign][using-github-bot-sign] for details on the threats posed by using built-in github bot signing.
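
To illustrate what per-repository subkeys buy a mirror user, here is a minimal sketch of the pinning check a verifier might apply. Everything in it is hypothetical: the fingerprints, the `SignedCommit` type, and the mapping are invented for the example, and signature verification and chaining to the bors delegate key are assumed to have happened already.

```python
from typing import Dict, NamedTuple


class SignedCommit(NamedTuple):
    repo: str             # repository the verifier believes it is mirroring
    key_fingerprint: str  # fingerprint of the subkey that signed the commit


# Hypothetical fingerprints; real values would come from certificates that
# chain to the bors delegate key.
REPO_SUBKEYS: Dict[str, str] = {
    "rust-lang/rust": "fp-rust",
    "rust-lang/rust-clippy": "fp-clippy",
}


def commit_key_matches_repo(commit: SignedCommit, subkeys: Dict[str, str]) -> bool:
    """Accept a commit only if it was signed by the subkey pinned for its repo."""
    expected = subkeys.get(commit.repo)
    return expected is not None and commit.key_fingerprint == expected


genuine = SignedCommit("rust-lang/rust", "fp-rust")
# A commit bors signed in another repository, presented as rust-lang/rust history.
replayed = SignedCommit("rust-lang/rust", "fp-clippy")
```

With a single shared signing key, `genuine` and `replayed` would be indistinguishable to this check; with per-repository subkeys, the second is rejected.
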

### Key Transparency (`pki.rust-lang.org`)

@@ -160,9 +163,10 @@ The Rust Infra Team will deploy a new SSH bastion host, `pki-bastion.rust-lang.o
The Rust Infra Team shall stand up a new subdomain for the exposure of transparency logs and operational logs of the PKI. We leave the implementation details of such a system up to the Infra team and project. This system shall store and publicly serve all data which relates to the activities of the Rust CA, its delegate keys and quorum activities.

This publicly accessible data shall be:
- A write-only storage medium in which modification of past entries is not possible.
- Written documentation of this RFC and additional documentation on the services, scripts and processes for the quorum model.
- Access logs to the [Signing Console][signing-console] containing the username and datetime of access. This shall be signed by the Infra key or a delegate of that key.
- Access logs to the CloudHSM instance, containing IAM role and datetime of access. This shall be signed by the Infra key or a delegate of that key.
- Access logs to the [Signing Console][signing-console] containing the username and datetime of access. This shall be signed by a verifiable delegate key which chains trust to the Rust Root.
- Access logs to the CloudHSM instance, containing IAM role and datetime of access. This shall be signed by a verifiable delegate key which chains trust to the Rust Root.
- Written records of all Quorum Events. These shall contain: Event Description, Quorum members present, date/time of beginning and ending the event, cryptographic fingerprints of keys modified, created or revoked. This written record shall be signed independently by each quorum member present for the event.
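
One simple way to make modification of past entries detectable is a hash chain, where each entry commits to the hash of the previous one. The sketch below is an assumption for illustration, not the mechanism the RFC mandates (it leaves implementation to the Infra team), and it is much simpler than the Merkle-tree logs used by Certificate Transparency. The field names and sample entries are invented.

```python
import hashlib
import json
from typing import Any, Dict, List

GENESIS = "0" * 64  # stand-in "previous hash" for the first entry


def entry_hash(prev_hash: str, payload: Dict[str, Any]) -> str:
    """Hash the previous entry's hash together with a canonical JSON payload."""
    body = json.dumps(payload, sort_keys=True, separators=(",", ":"))
    return hashlib.sha256((prev_hash + body).encode("utf-8")).hexdigest()


def append(log: List[Dict[str, Any]], payload: Dict[str, Any]) -> None:
    """Append an entry that commits to everything written before it."""
    prev = log[-1]["hash"] if log else GENESIS
    log.append({"prev": prev, "payload": payload, "hash": entry_hash(prev, payload)})


def verify(log: List[Dict[str, Any]]) -> bool:
    """Recompute every hash; any edit to a past entry breaks the chain."""
    prev = GENESIS
    for entry in log:
        if entry["prev"] != prev or entry["hash"] != entry_hash(prev, entry["payload"]):
            return False
        prev = entry["hash"]
    return True


log: List[Dict[str, Any]] = []
# Hypothetical sample entries.
append(log, {"kind": "console_access", "user": "alice", "time": "2030-01-01T00:00:00Z"})
append(log, {"kind": "quorum_event", "members": ["alice", "bob"], "fingerprints": ["fp-1"]})
intact = verify(log)

log[0]["payload"]["user"] = "mallory"  # simulate tampering with a past entry
tampered_detected = not verify(log)
```

A hash chain only makes tampering evident to someone who holds a trusted copy of the latest hash; signing that head with a delegate key, as the list above requires for the access logs, is what ties it back to the Rust Root.
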


@@ -205,15 +209,20 @@ We shall utilize delegate keys for actual day-to-day operations within the proje
Instead of a quorum authentication model for a legacy Certificate Authority set of Root Keys, we could move to a multi-signer scheme which will be validated for all the use cases we expect this to cover. The drawback of such an approach is that most standard scenarios in which we will want to utilize such a set of key instances do not support such a model; it would be an approach to only solve the issue of repository and crate signing. We have legitimate cases which can benefit from an SSL CA (Infra internal services, code signing, etc.) which do not support multi-signer keys. We hope this RFC will be leveraged for future uses, but consider that question out of the scope of this RFC.

### Alternative Quorum Tools
[alternative-quorum-tools]: #alternative-quorum-tools

Other solutions exist for performing Quorum-based authentication for key access, which support various storage backends. The main alternative for such a standalone solution would be utilizing a quorum authentication plugin on a cloud-agnostic storage system (such as HashiCorp Vault, or a cloud-agnostic authentication scheme atop a cloud key storage, such as AWS key store). These solutions would allow us to deploy and move independent of cloud providers; however, this comes with the added infrastructure overhead of the Infra team needing to implement and maintain such a solution. This choice was considered but thought prohibitive to implement and maintain for the current staffing available to the project. Finally, these solutions do not exist in hardware due to the need to remain cloud and hardware agnostic.
Review comment (Contributor):

For Vault-like tools, although they require some extra time and effort to implement and maintain, they can be used not only as a key management tool for Rust projects, but also as a secret management tool for the entire Rust project, managing other secrets such as tokens or AK/SK in a unified manner. So I think it is worth considering.


Given the majority of Rust Infrastructure currently resides on AWS and will for the near-to-medium term future, CloudHSM was chosen as the solution of choice for this given it meets all requirements. In the future, it is possible to extract our private keys and implement quorum authentication in another system. (Such private key extraction may have a higher quorum threshold.)
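
The idea of a higher quorum threshold for sensitive operations can be sketched as an M-of-N approval check. The operation names, member names, and threshold values below are hypothetical and are not taken from the RFC or from CloudHSM; they only illustrate the shape of the policy.

```python
from typing import Dict, Iterable, Set

# Hypothetical thresholds; key extraction requires more approvals than signing.
THRESHOLDS: Dict[str, int] = {
    "sign_delegate_key": 3,
    "extract_private_key": 5,
}


def quorum_met(
    operation: str,
    approvals: Iterable[str],
    members: Set[str],
    thresholds: Dict[str, int] = THRESHOLDS,
) -> bool:
    """Return True if enough distinct quorum members approved the operation."""
    # Count each member once and ignore approvals from non-members.
    valid = set(approvals) & members
    return len(valid) >= thresholds[operation]


members = {"m1", "m2", "m3", "m4", "m5", "m6"}
approvals = ["m1", "m2", "m3", "m3", "outsider"]

can_sign = quorum_met("sign_delegate_key", approvals, members)
can_extract = quorum_met("extract_private_key", approvals, members)
```
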

### Using the GitHub bot key to sign bors commits
[using-github-bot-sign]: #using-the-github-bot-key-to-sign-bors-commits

As an alternative to having a dedicated delegate key for bors that chains to the Rust Root, we could rely on GitHub's commit signatures for bot accounts. However, this would not allow users of git mirrors to be resilient against impersonation of rust-lang/rust. Any GitHub project that uses bors could push a branch that is a clone of rust-lang/rust and get bors to add and sign malicious commits. That history could then be pushed to a purported "mirror" and would appear the same as a legitimate mirror of rust-lang/rust, allowing for social engineering attacks. Using per-repository keys makes mirrors easier to verify and reduces attack surface area.
Review comment (Member):

I don't think per-repository keys managed by us help with this problem at all.

To preface, we are using an instance of bors specific to our organization, so regardless of which solution we choose all repositories will be in the rust-lang org, and attackers will need to be members of the team or compromise members of the team.

For this attack to work, you need to have a base branch with the content of rust-lang/rust, and a PR you instructed bors to merge with that branch. There are two candidates for such branch:

  1. The main/master branch, so it's either rust-lang/rust itself and the attacker compromised rustc proper (signatures wouldn't matter here), or it's a different repository with a separate history (let's say clippy) which would result in a failed merge due to endless conflicts.

  2. An attacker-controlled branch (let's call it mastr to sprinkle some typosquatting) containing malicious code, to which the attacker sends a benign PR tricking reviewers into thinking it's going to be merged into master. There is no difference from this happening in let's say clippy or rustc proper, so separate keys wouldn't help.

Case 1. is not relevant, and case 2. can be fixed by adding a check to bors preventing it from merging into non-protected branches, regardless of whether we use GitHub keys or our own.

Review comment (Member):

In the case of 1. the attacker itself could merge the default branch of the target repo with --keep-ours to resolve all conflicts and only then open the PR on the target repo. Merging the PR should then result in no conflicts at all.

Review comment (Member):

Wouldn't that still cause a huge noticeable PR changing hundreds of thousands of lines of code? If we let's say merge rustc into clippy.

Review comment (Member):

It does. Other repos have less oversight and get reviewers more easily afaik, so there is a higher risk of the attacker itself getting review permission for one of these repos and the attack not being detected than for the main rust repo. Whether this attack scenario matters enough to do this is another question though.

Review comment (Member):

If we're really worried about this attack, GitHub keys would still work; we would just need to create a separate GitHub App for rust-lang/rust. I don't think this is realistically going to happen, as smaller repos are being (very slowly) migrated away from bors to GitHub Merge Queues.


### Staying Provider Agnostic
### Preventing Vendor Lock-in

This RFC attempts to limit our exposure to singular points of failure or compromise by relying on entities for the totality of our security. A choice was made to utilize Amazon CloudHSM; however, we are capable of moving the entirety of this operation to another infrastructure if needed. Some cases here can be solved by various cloud providers, which our own CA would allow us to remain independent; such as internal mTLS, binary code signing and git commit signatures.
This RFC attempts to limit our exposure to singular points of failure or compromise by relying on entities for the totality of our security. A choice was made to utilize Amazon CloudHSM as a hardware solution for key storage, but we have chosen not to use internal cloud-specific CA mechanisms in this case to avoid being further bound to a single provider's ecosystem. The scheme described in this RFC can be moved to other HSM or storage backends with no other dependencies on a specific service.

Review comment (Contributor):

From my understanding, migration between different HSM services may run into incompatible key formats and encoding methods. Considering that keys would also need to be replaced, the workload of such a migration is almost equivalent to establishing a new key management mechanism.


# Prior art
@@ -247,15 +256,15 @@ Relevant past RFCs:

**Crate Signing / Mirroring**: A subsequent RFC will specify how we handle crates.io index signing, to enable mirroring and other requirements.

**Code Signing**: This RFC provides a chain of trust that can in the future be used for signing binaries for Apple/Microsoft binary authentication. However, this RFC does not specify a mechanism or process for such signing; establishing that is future work. A subsequent RFC will specify how we handle code signing.
**Code Signing**: This RFC provides a chain of trust that can in the future be used for signing binaries for Apple/Microsoft binary authentication. (This would require generating a CSR for a key chaining to the Rust Root, and then getting that CSR signed by the appropriate Microsoft/Apple process.) However, this RFC does not specify a mechanism or process for such signing; establishing that is future work. A subsequent RFC will specify how we handle code signing.
Review comment (Member):

What is the purpose of tying those certificates to the root?

Regardless of which signing scheme we end with for releases, we will need to sign more than just releases (source tarballs, reproducibility artifacts, Windows/Apple installers). So code signed things will be already signed by the release key like any other of our artifacts.

What advantage would tying the code signing key to our root provide? We would be code signing those things just to avoid the operating systems from complaining, and they will not care about our root.


**Git mirroring**: This RFC specifies a delegate key for bors to sign git commits, but does not specify any scheme for mirroring Rust git repositories. Future RFCs will specify how we handle git repository mirroring.

**OCSP/CRL/Transparency Log/Stapling**: Do we wish to set up an OCSP service, CRL, Transparency Logs, and/or perform stapling? Future implementations of these types of services can reside on [pki.rust-lang.org][pki-rust-lang-org], meeting its purpose of providing transparency to our PKI. We leave it to future implementation and RFCs to discuss the use cases of these systems.

**Internal Signing Service**: A microservice endpoint to provide authenticated signing of binary blobs with various keys for roles. This could be implemented in AWS via roles for allowing different teams to perform binary blob signing using their issued keys without disclosure of the keys.
Review comment (Member):

This (internal signing service) should be explicitly out of scope and we should not do this. This sentence is describing reimplementing AWS KMS from scratch.


**Infrastructure mTLS**: With a central CA, the infrastructure team will be able to perform certificate validation on clients and servers internally for various use cases. This could include additional security for crater runners, build servers and databases.
**Infrastructure mTLS**: Rust infrastructure that wants to use mTLS could potentially use keys chaining to the Rust Root. This could include additional security for crater runners, build servers and databases.

**The Update Framework**: How can a root CA increase or decrease the complexity of a future TUF implementation? For example, the CA can be used to sign root keys; and validation of root keys could be done via the CA and its OCSP revocation service; this could allow central and easier management of root keys used for the multi-signing of a TUF repository for crates.io.
