Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Feature - Provide docker image sha for failed pinned dependencies #967

Closed
naveensrinivasan opened this issue Sep 3, 2021 · 29 comments · Fixed by #2037
Closed

Feature - Provide docker image sha for failed pinned dependencies #967

naveensrinivasan opened this issue Sep 3, 2021 · 29 comments · Fixed by #2037
Assignees
Labels
good first issue Good for newcomers help wanted Community contributions welcome, maintainers supportive of idea but not a high priority kind/enhancement New feature or request
Projects

Comments

@naveensrinivasan
Copy link
Member

Is your feature request related to a problem? Please describe.
Right now scorecard provides a warning message for Warn: unpinned dependency detected in Dockerfile: ' alpine:3.14 when the docker file does not have a pinned SHA.

Describe the solution you'd like
Instead of just providing the warning it would be helpful to provide the actual SHA that would have to be used to address this issue.
This can be achieved by using https://github.com/google/go-containerregistry/tree/main/cmd/crane and running something like crane digest alpine:3.14 which provides the SHA
sha256:e1c082e3d3c45cccac829840a25941e679c25d438cc8412c2fa221cf1a824e6a

@ristomcgehee
Copy link
Contributor

Seems like a good idea to me.

@azeemshaikh38 azeemshaikh38 added good first issue Good for newcomers help wanted Community contributions welcome, maintainers supportive of idea but not a high priority and removed needs discussion labels Sep 7, 2021
@laurentsimon
Copy link
Contributor

laurentsimon commented Sep 8, 2021

Good idea. Here are additional thoughts:

  1. We have different users that can use this information: humans and tools (e.g. allstar). Human can write a PR with the tip we give them; tools like allstar could create a PR automatically. One way we could handle both is to add some struct in https://github.com/ossf/scorecard/blob/main/checker/check_result.go#L64 like:
type remediation struct {
   Text string // The text message for a human.
   Diff string // a diff of the changes for a tool to apply.
}
  1. I think the fact that we can use crane, which is written in golang is down to luck. Over-time, we will likely want to suggest other tips for other checks. For example, @naveensrinivasan found out how painful it is to fix the Token-Permissions check manually Scorecard repo fails at Token-Permissions with score 0 #942 (comment). A tip or a diff would be useful there too. Unfortunately, many tools won't be written in golang (bringing other complications, e.g. sandboxing, etc). Sometimes, remediation tips may require more work and be stateful, which will be hard to handle in scorecard. We probably need a remediation-as-service, where we could implement more complicated logic server-side and provide a REST API for users (scorecard, allstar) to generate specific tips/diff.

  2. Note that for the specific case of dependency updates/pinning, I opened an issue on dependabot about this Pinning-by-hash upgrade dependabot/dependabot-core#3699. Basically, if we have an option to turn on hash-pinning in dependabot config, it would automatically the first hash in a new PR.

@github-actions
Copy link

Stale issue message

@spencerschrock
Copy link
Contributor

I'd like to pick this feature up and I have a good grasp on how to do this the easy way with crane.

@naveensrinivasan
Copy link
Member Author

@spencerschrock Thank you!

@naveensrinivasan
Copy link
Member Author

I'd like to pick this feature up and I have a good grasp on how to do this the easy way with crane.

The only concern I have with crane is that it brings a lot of dependencies AFAIK. We are already using Docker for parsing Dockerfiles. Could you please check if we can use that?

@spencerschrock
Copy link
Contributor

Sure. I had mentioned crane as it was in your initial comment, but I'll look into using Docker instead.

@naveensrinivasan
Copy link
Member Author

Sure. I had mentioned crane as it was in your initial comment, but I'll look into using Docker instead.

Cool! Thanks

@laurentsimon
Copy link
Contributor

There is also https://github.com/sethvargo/ratchet

@varunsh-coder is there an API you support that we could call?

@varunsh-coder
Copy link
Contributor

There is also https://github.com/sethvargo/ratchet

@varunsh-coder is there an API you support that we could call?

We don't as of now for Dockerfiles, but can easily add it. There is already code that gets SHA for a docker image and tag (for GitHub Action workflows). We can repurpose it for Dockerfiles.

Should this have a UI that you will point to, similar to what we have for workflows?
e.g. in addition to https://app.stepsecurity.io/secureworkflow/microsoft/vscode/ci.yml/main?enable=permissions,pin, we can have something like https://app.stepsecurity.io/securedockerfile/path-to-dockerfile?enable=pin

Or do you plan to call the API directly from scorecard?

@laurentsimon
Copy link
Contributor

Should this have a UI that you will point to, similar to what we have for workflows? e.g. in addition to https://app.stepsecurity.io/secureworkflow/microsoft/vscode/ci.yml/main?enable=permissions,pin, we can have something like https://app.stepsecurity.io/securedockerfile/path-to-dockerfile?enable=pin

Or do you plan to call the API directly from scorecard?

I suppose we could do both. @spencerschrock Wdut?

@spencerschrock
Copy link
Contributor

If I'm understanding the issue correctly, just having an API directly callable from scorecard would be enough to populate the remediation struct you mentioned above. Then scorecard/allstar could use the results to warn/automatically remediate.

Personally, I think the generated diffs would be small enough where they could be included in scorecard's existing results without flooding the output. The UI certainly looks better though. I'd be curious in other peoples' opinions.

@laurentsimon
Copy link
Contributor

laurentsimon commented Jul 7, 2022

If I'm understanding the issue correctly, just having an API directly callable from scorecard would be enough to populate the remediation struct you mentioned above. Then scorecard/allstar could use the results to warn/automatically remediate.

We don't always know if users will run scorecard in some more locked down environment (network disabled besides GitHub, etc), so a backup plan when connection fails would also be useful, I think: first try to get the sha, else provide a link to the StepSecurity website. Wdut?

Personally, I think the generated diffs would be small enough where they could be included in scorecard's existing results without flooding the output. The UI certainly looks better though. I'd be curious in other peoples' opinions.

+1 on remediation.

We have done it for some results in #1850.
This fits in the overall idea of improving the format of the results, see #1874 (comment) (WIP, so feedback welcome!)

@spencerschrock
Copy link
Contributor

I think: first try to get the sha

Exploring the Docker path? Or using the StepSecurity API if it were created?

else provide a link to the StepSecurity website. Wdut?

That seems like a reasonable backup to me. Or a link to the existing remediation markdown if no UI is created (essentially current behavior)

@laurentsimon
Copy link
Contributor

I think: first try to get the sha

Exploring the Docker path? Or using the StepSecurity API if it were created?

@varunsh-coder would you consider an API for us to get the result sha as well?

If this fails, we could fall back to providing the URL like we did for pinned permission check

@varunsh-coder
Copy link
Contributor

I think: first try to get the sha

Exploring the Docker path? Or using the StepSecurity API if it were created?

@varunsh-coder would you consider an API for us to get the result sha as well?

If this fails, we could fall back to providing the URL like we did for pinned permission check

Yes, sure. These are the tracking issues.
step-security/secure-repo#882
step-security/secure-repo#883

I will come back with a draft API definition and we can iterate over it before deciding on the final one.

@naveensrinivasan
Copy link
Member Author

I think: first try to get the sha

Exploring the Docker path? Or using the StepSecurity API if it were created?

@varunsh-coder would you consider an API for us to get the result sha as well?

If this fails, we could fall back to providing the URL like we did for pinned permission check

This could be a classic supply chain issue if StepSecurity API gets attacked then scorecard would be recommending the compromised SHA/ other things.

I don't see a reason to use an external API.

@laurentsimon
Copy link
Contributor

This could be a classic supply chain issue if StepSecurity API gets attacked then scorecard would be recommending the compromised SHA/ other things.

I don't see a reason to use an external API.

By API we meant REST API. We do recommend StepSecurity already, so this would not change the trust, correct?

@naveensrinivasan
Copy link
Member Author

naveensrinivasan commented Jul 8, 2022

Recommending is different from actually providing the solution( providing the SHA).

If a malicious user attacks StepSecurity and if we automate based on that one API it becomes almost like Stuxnet( Supply chain security issue)

Scorecard should be cautious.

What is the specific reason for using the API when docker/crane provides this functionality? I am trying to understand.

@laurentsimon
Copy link
Contributor

laurentsimon commented Jul 8, 2022

Recommending is different from actually providing the solution( providing the SHA).

If a malicious user attacks StepSecurity and if we automate based on that one API it becomes almost like Stuxnet( Supply chain security issue)

Stepping back for a moment. The risk here is a hash is changed. This allows an attacker to perform a rollback attack pointing to a 1) older genuine container or 2) an non-existent image, but not achieve more than that. So an attacker could use this to push a previously vulnerable image that they can exploit or perform a DoS, but not direct RCE. It would say it's low risk. But feel free to disagree.

Scorecard should be cautious.

+1

What is the specific reason for using the API when docker/crane provides this functionality? I am trying to understand.

To be clear, I'm not against using an API from a dependency we already use. The reason I asked about the remote API is because we already provide a link to their solution in the remediation section + it lets us outsource some remediation logic / maintenance and focus on other things Scorecard can do. In this case, remediation is simple and we can do it ourselves, but long-term we will need to spin up / rely on other services to do that for more complicated remediation logic.

I'm totally fine with using an API from a dependency we already use.

@spencerschrock
Copy link
Contributor

spencerschrock commented Jul 11, 2022

I've included a draft PR that shows my current changes. I still need to fix the test I broke, and introduce a new one, but wanted to get some visibility if I'm on the right path for the issue. I also thought remediation/remediations.go seemed like a place I could introduce changes, but I'm not as familiar with where that behavior shows up.

I did experiment with crane and didn't see any additional dependencies brought in, or at least the go.mod and go.sum files didn't change.

One issue I can see is identifying the correct hash for a Dockerfile. There are different hashes for linux/amd64 vs linux/arm64/v8 for example. I suspect crane supports distinguishing between these with their option argument, but I need to investigate if there's anyway to determine the proper platform from the Dockerfile.

@laurentsimon
Copy link
Contributor

laurentsimon commented Jul 11, 2022

I also thought remediation/remediations.go seemed like a place I could introduce changes, but I'm not as familiar with where that behavior shows up.

+1.

Customized remediation is a recent code addition. Currently it's just used sparsely in pinned and permission checks. To make it work:

  1. Add https://github.com/ossf/scorecard/blob/main/checks/pinned_dependencies.go#L44-L47 in the check
  2. Call it in the Warn function, see example https://github.com/ossf/scorecard/blob/main/checks/evaluation/pinned_dependencies.go#L86
  3. It will be automatically reflected in SARIF format, see https://github.com/ossf/scorecard/blob/main/pkg/sarif.go#L305

#1874 (comment) is proposing making the remediation results also available in cron / CLI output in the future

@varunsh-coder
Copy link
Contributor

I've included a draft PR that shows my current changes. I still need to fix the test I broke, and introduce a new one, but wanted to get some visibility if I'm on the right path for the issue. I also thought remediation/remediations.go seemed like a place I could introduce changes, but I'm not as familiar with where that behavior shows up.

I did experiment with crane and didn't see any additional dependencies brought in, or at least the go.mod and go.sum files didn't change.

One issue I can see is identifying the correct hash for a Dockerfile. There are different hashes for linux/amd64 vs linux/arm64/v8 for example. I suspect crane supports distinguishing between these with their option argument, but I need to investigate if there's anyway to determine the proper platform from the Dockerfile.

@spencerschrock this is an interesting issue. I had not thought about it until I looked at your PR.

I am trying to understand this based on a real scenario. If you see this Dockerfile as an example, it is used to build multi architecture images.

This is one of the workflow runs.
https://github.com/madnuttah/unbound-docker/runs/7279847617?check_suite_focus=true#step:8:3

For this, the base image madnuttah/unbound:libevent-buildenv-latest from Dockerfile is being fetched. I think for each different platform build, the corresponding platform image for madnuttah/unbound:libevent-buildenv-latest should be fetched, right?

Do you know what the 92d80bce25c9a2f0660fa7bb66404509ea3c50cdae60e6f21abd8772e2a8139c hash is? It seems to be used across builds for different platforms.
e.g. https://github.com/madnuttah/unbound-docker/runs/7279847617?check_suite_focus=true#step:8:228 and https://github.com/madnuttah/unbound-docker/runs/7279847617?check_suite_focus=true#step:8:337

Interestingly I do not see that hash on DockerHub for this image: https://hub.docker.com/layers/unbound/madnuttah/unbound/openssl-buildenv-latest/images/sha256-2e38fa5026c578fe0f44d90845895e5ec22382b600f99b8f7fddbb877f0ac664?context=explore

@spencerschrock
Copy link
Contributor

spencerschrock commented Jul 14, 2022

TL:DR: My understanding is the same as yours: there is some hash which is platform independent. I think we can just use universal hash crane provides by default as it seems to work on all platforms. I'm not sure where it originates from, as it's not in the manifest (I tried to grep it docker manifest inspect --verbose golang | grep 9349ed889adb906efa5ebc06485fe1b6a12fb265a01c9266a137bb1352565560).

I've been experimenting and observed the following behavior from crane
crane digest golang and crane digest golang --platform all both (currently) return sha256:9349ed889adb906efa5ebc06485fe1b6a12fb265a01c9266a137bb1352565560. This hash is not listed on Docker Hub here

I have the following Dockerfile for an example:

FROM golang AS base
CMD go version

When pinned to golang@sha256:9349ed889adb906efa5ebc06485fe1b6a12fb265a01c9266a137bb1352565560, it successfully builds for all platforms when I try to build forlinux/arm64/v8, linux/amd64 etc:

  • docker build --platform=linux/arm64/v8 -t scorecard-example .
  • docker build --platform=linux/amd64 -t scorecard-example .
  • ...

On the other hand, if I grab a platform specific hash (either from Docker Hub or crane with the appropriate --platform flag), that hash can only be used to pin when building for the proper platform.

For example if I try to use the linux/arm64/v8 digest to build a linux/amd64 image it complains:

image with reference golang@sha256:a083dca7c9d252411e94154be7aa88a2f97f34dd8fa1c05a15cdd39bf6a25523 was found but does not match the specified platform: wanted linux/amd64, actual: linux/arm64/v8

@varunsh-coder
Copy link
Contributor

TL:DR: My understanding is the same as yours: there is some hash which is platform independent. I think we can just use universal hash crane provides by default as it seems to work on all platforms. I'm not sure where it originates from, as it's not in the manifest (I tried to grep it docker manifest inspect --verbose golang | grep 9349ed889adb906efa5ebc06485fe1b6a12fb265a01c9266a137bb1352565560).

I've been experimenting and observed the following behavior from crane crane digest golang and crane digest golang --platform all both (currently) return sha256:9349ed889adb906efa5ebc06485fe1b6a12fb265a01c9266a137bb1352565560. This hash is not listed on Docker Hub here

I have the following Dockerfile for an example:

FROM golang AS base
CMD go version

When pinned to golang@sha256:9349ed889adb906efa5ebc06485fe1b6a12fb265a01c9266a137bb1352565560, it successfully builds for all platforms when I try to build forlinux/arm64/v8, linux/amd64 etc:

  • docker build --platform=linux/arm64/v8 -t scorecard-example .
  • docker build --platform=linux/amd64 -t scorecard-example .
  • ...

On the other hand, if I grab a platform specific hash (either from Docker Hub or crane with the appropriate --platform flag), that hash can only be used to pin when building for the proper platform.

For example if I try to use the linux/arm64/v8 digest to build a linux/amd64 image it complains:

image with reference golang@sha256:a083dca7c9d252411e94154be7aa88a2f97f34dd8fa1c05a15cdd39bf6a25523 was found but does not match the specified platform: wanted linux/amd64, actual: linux/arm64/v8

Thanks @spencerschrock for all the info and the experimentation!

I ran

crane digest madnuttah/unbound:libevent-buildenv-latest

and got the same hash as was in the build log - sha256:92d80bce25c9a2f0660fa7bb66404509ea3c50cdae60e6f21abd8772e2a8139c. So it looks like there is a platform independent hash and it works across platforms.

I am still curious about how this hash is generated. Because we might be missing some edge case and if developers update the Dockerfile based on this recommendation, and it fails, we will not have a good answer. So, I am wondering if we can find some documentation, or someone who can explain about this platform independent hash and how it is calculated, and so on...

@spencerschrock
Copy link
Contributor

spencerschrock commented Jul 14, 2022

So it's due to a Manifest List. It's optional, so I'd be curious to see what crane does for an image without one.
https://docs.docker.com/registry/spec/manifest-v2-2/#manifest-list

the golang image does have one though, here is some abbreviated output

docker buildx imagetools inspect golang
Name:      docker.io/library/golang:latest
MediaType: application/vnd.docker.distribution.manifest.list.v2+json
Digest:    sha256:9349ed889adb906efa5ebc06485fe1b6a12fb265a01c9266a137bb1352565560
           
Manifests: 
  Name:      docker.io/library/golang:latest@sha256:ea3d912d500b1ae0a691b2e53eb8a6345b579d42d7e6a64acca83d274b949740
  MediaType: application/vnd.docker.distribution.manifest.v2+json
  Platform:  linux/amd64
             
  Name:      docker.io/library/golang:latest@sha256:a083dca7c9d252411e94154be7aa88a2f97f34dd8fa1c05a15cdd39bf6a25523
  MediaType: application/vnd.docker.distribution.manifest.v2+json
  Platform:  linux/arm64/v8

The hash is actually passed as a header value, which is why I wasn't seeing it unless I opted for verbose output
crane manifest -v golang

2022/07/14 22:57:00 <-- 200 https://index.docker.io/v2/library/golang/manifests/latest (116.558879ms)
2022/07/14 22:57:00 HTTP/1.1 200 OK
Content-Length: 2355
Content-Type: application/vnd.docker.distribution.manifest.list.v2+json
Date: Thu, 14 Jul 2022 22:57:00 GMT
Docker-Content-Digest: sha256:9349ed889adb906efa5ebc06485fe1b6a12fb265a01c9266a137bb1352565560

Scorecard automation moved this from Backlog to Done Jul 19, 2022
@varunsh-coder
Copy link
Contributor

Wanted to update that the UI for this is available now on app.stepsecurity.io.

Here are couple of examples:
https://app.stepsecurity.io/secure-dockerfile/ossf/fuzz-introspector/Dockerfile/main
https://app.stepsecurity.io/secure-dockerfile/madnuttah/unbound-docker/unbound%2FDockerfile/main

The expected format is:
https://app.stepsecurity.io/secure-dockerfile/:owner/:repo/:pathToDockerfile/:branch

Source code is here:
https://github.com/step-security/secure-workflows/blob/main/securedockerfile.go

If you want, the remediation section in Scorecard can be updated to also include such a link. It can be put below the current remediation suggestion to replace the value with the hash value. The UI might make it easier and help in cases where there are multiple images to be pinned in a single Dockerfile.

@laurentsimon
Copy link
Contributor

cool, do you want to send a PR to update it? This is the file https://github.com/ossf/scorecard/blob/main/remediation/remediations.go

@varunsh-coder
Copy link
Contributor

cool, do you want to send a PR to update it? This is the file https://github.com/ossf/scorecard/blob/main/remediation/remediations.go

Will do.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
good first issue Good for newcomers help wanted Community contributions welcome, maintainers supportive of idea but not a high priority kind/enhancement New feature or request
Projects
Development

Successfully merging a pull request may close this issue.

6 participants