
feat: add cache prefix #358

Open
ernado opened this issue Mar 28, 2023 · 27 comments
@ernado

ernado commented Mar 28, 2023

Description:
Add a cache key prefix, like this:

- name: Install Go
  uses: actions/setup-go@v4
  with:
    go-version: '1.20'
    cache: true
    cache-prefix: 'test-${{ matrix.arch }}-${{ matrix.flags }}-'

Justification:
Caches from multiple workflows can get mixed.
For example, there can be a workflow that only downloads dependencies.
This produces a cache without .cache/go-build, which is then restored by
other workflows, effectively disabling the build cache.

This was the cause of #357.
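
To illustrate (job names are hypothetical and the exact key details depend on setup-go's internals), a dependency-only job and a build job can resolve to the same cache entry, because the key is derived from go.sum rather than from what the job actually produces:

jobs:
  deps-only:
    runs-on: ubuntu-latest
    steps:
      - uses: actions/checkout@v3
      - uses: actions/setup-go@v4
        with:
          go-version: '1.20'
          cache: true
      # populates the module cache only; .cache/go-build stays empty
      - run: go mod download

  build:
    runs-on: ubuntu-latest
    steps:
      - uses: actions/checkout@v3
      - uses: actions/setup-go@v4
        with:
          go-version: '1.20'
          cache: true
      # may restore the cache saved by deps-only, which contains no build cache
      - run: go build ./...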

Are you willing to submit a PR?

I don't know TypeScript well, but I'm learning.

Workaround

Use actions/cache@v3 manually, something like this:

name: manual

on:
  push:
    branches: [main]

jobs:
  build:
    runs-on: ubuntu-latest
    env:
      cache_name: manual

    steps:
      - uses: actions/checkout@v3

      - uses: actions/setup-go@v4
        with:
          go-version: "1.20.x"
          cache: false

      - name: Get Go environment
        run: |
          echo "cache=$(go env GOCACHE)" >> $GITHUB_ENV
          echo "modcache=$(go env GOMODCACHE)" >> $GITHUB_ENV

      - name: Set up cache
        uses: actions/cache@v3
        with:
          path: |
            ${{ env.cache }}
            ${{ env.modcache }}
          key: ${{ env.cache_name }}-${{ runner.os }}-go-${{ hashFiles('**/go.sum') }}-${{ hashFiles('**/go.mod') }}
          restore-keys: |
            ${{ env.cache_name }}-${{ runner.os }}-go-

      - run: go env GOCACHE
      - run: go build -v

Cons
Cache configuration options can possibly lead to feature bloat, so feel free to decline this feature proposal.

@ernado added the feature request and needs triage labels Mar 28, 2023
@marko-zivic-93
Contributor

Hello @ernado
Thank you for sharing this feature request with us. We will take a look at it, and come back to you as soon as we have some info 🙂

@erezrokah

Hi 👋 Are you open to a contribution for this issue? We're experiencing it as well and are using the suggested workaround (actions/cache).

@erezrokah

Opened #366, feedback appreciated

@lzap

lzap commented May 17, 2023

This could help work around a problem we have in our project.

@andig

andig commented May 30, 2023

I've recently noticed a fair bit of slow-down in our parallel builds, from 2:30 to 3:30; see https://github.com/evcc-io/evcc/actions/runs/5124962244 for an example. It shows that both the build and test steps, despite stable dependencies, download a fair number of packages that should already be there:

fatal: No names found, cannot describe anything.
Version: 8acee30 8acee30 2023-05-30_18:24:01
CGO_ENABLED=0 go build -v -tags=release -ldflags='-X github.com/evcc-io/evcc/server.Version=8acee30 -X github.com/evcc-io/evcc/server.Commit=8acee30 -s -w'
go: downloading github.com/Masterminds/sprig/v3 v3.2.3
go: downloading github.com/andig/gosunspec v0.0.0-20211108155140-af2e73b86e71
go: downloading github.com/bogosj/tesla v1.1.1-0.20230430222423-0d6e63ee90c9
...

It seems likely that this might be a conflict with the parallel clean job that just installs the tools and does some basic tests:

Run make install
fatal: No names found, cannot describe anything.
go install $(go list -f '{{join .Imports " "}}' tools.go)
go: downloading github.com/golang/mock v1.6.0
go: downloading github.com/dmarkham/enumer v1.5.8
go: downloading github.com/pascaldekloe/name v1.0.1
go: downloading golang.org/x/tools v0.9.1
go: downloading golang.org/x/mod v0.10.0
go: downloading golang.org/x/sys v0.8.0

It seems that the situation described in this issue might be quite common.

@andig

andig commented May 30, 2023

And here is the result of applying this issue's PR: https://github.com/evcc-io/evcc/actions/runs/5126503582. Build time dropped from 3 min to 1 min at the cost of 3 separate caches. Worth the effort.

Any chance to get this reviewed/merged?

@dmitry-shibanov
Contributor

Hello everyone. We would prefer not to implement this kind of logic, because the built-in caching logic for setup-go should cover the most common use cases. If you need more flexibility with caching, we recommend actions/cache. This is in line with the setup-go ADR.

For now I'm going to close the issue.

@dmitry-shibanov closed this as not planned Jun 20, 2023
@erezrokah

Hi @dmitry-shibanov and thanks for the comment. Are you open to adding docs for the cases the caching logic in setup-go doesn't cover? It does seem like a gotcha that should be documented (based on the backlinks to this issue), now that caching is enabled by default.

Something along the lines of:

Please note that if you use setup-go from multiple jobs (e.g. build and test) you'll need to disable caching and use actions/cache, otherwise caching will not work as expected

@ernado
Author

ernado commented Jun 20, 2023

Agree with @erezrokah: adding this to the docs (and probably providing a reference workaround and linking this issue) would be very helpful.

Also agree with @dmitry-shibanov that this feature should not be implemented, because it is hard to cover all use cases, so just let users handle caches manually (like we did in our projects).

@andig

andig commented Jun 20, 2023

Plus, this needs to cover which folders to cache, per Go best practices.

We would prefer not to implement this kind of logic, because the built-in caching logic for setup-go should cover the most common use cases.

This is really unfortunate. The additional logic is minimal and the benefit is large. It is quite widespread to have build, test, and lint jobs parallelized, and in all these cases you need separate cache keys.

because it is hard to cover all use cases, so just let users handle caches manually (like we did in our projects).

I disagree. Which cases are covered is in the user's hands through choosing the right key.

@erezrokah

Plus, this needs to cover which folders to cache, per Go best practices.

And it should also explain how to use the Go version from setup-go's outputs in the cache key.
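
For example, a hedged variation of the manual workaround above (assuming setup-go v4's go-version output; the key layout is only illustrative) that folds the resolved Go version into the key:

      - id: setup-go
        uses: actions/setup-go@v4
        with:
          go-version: "1.20.x"
          cache: false

      - name: Get Go environment
        run: |
          echo "cache=$(go env GOCACHE)" >> $GITHUB_ENV
          echo "modcache=$(go env GOMODCACHE)" >> $GITHUB_ENV

      - name: Set up cache
        uses: actions/cache@v3
        with:
          path: |
            ${{ env.cache }}
            ${{ env.modcache }}
          # including the resolved Go version means a toolchain bump invalidates the cache
          key: ${{ env.cache_name }}-${{ runner.os }}-go-${{ steps.setup-go.outputs.go-version }}-${{ hashFiles('**/go.sum') }}
          restore-keys: |
            ${{ env.cache_name }}-${{ runner.os }}-go-${{ steps.setup-go.outputs.go-version }}-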

@andig

andig commented Jun 20, 2023

@erezrokah would you potentially consider rebasing your PR and keeping your fork going forward?

@erezrokah

erezrokah commented Jun 20, 2023

@erezrokah would you potentially consider rebasing your PR and keeping your fork going forward?

I'm not sure that would be the best solution. If there are docs that users can copy-paste to get the same functionality, then my PR would be mostly a nice-to-have, I think, and wouldn't justify maintaining a fork.

@dsame added the investigation label Aug 3, 2023
@dsame reopened this Aug 3, 2023
@dsame removed the investigation label Aug 3, 2023
@andig

andig commented Aug 8, 2023

@dsame thank you for reopening this for investigation. Is there anything we could contribute?

@lucacome

I removed actions/cache from my workflows when the caching was first implemented and started noticing that it didn't seem to use the cache anymore #130 (comment)

We don't pursue the goal to provide wide customization of caching in scope of actions/setup-go action. The purpose of this integration is covering ~90% of basic use-cases.

I'm not sure how you arrived at that "~90% of basic use-cases", but it seems unlikely to me that 90% of projects (if that's what it means) don't have concurrent jobs running in a pipeline. And since caching is now enabled by default, it means it won't work by default. I don't think we're asking for fine-tuning of the caching mechanism here, just for a way to make caching work in more cases.

Whatever that 10% is, it still doesn't seem like an irrelevant number, especially when it can be fixed by a two-line PR.

@dsame
Contributor

dsame commented Aug 11, 2023

Hello @andig, @ernado

This issue relates to the feature request for optional build directory caching (https://github.com/actions/setup-actions-team/issues/26) and should be resolved with it.

The advantage of (optionally) skipping the build directory entirely seems preferable to having a key prefix, because it allows multiple build steps in one job instead of multiple jobs installing the same Go version just for the sake of getting different caches with almost the same content due to common dependencies.

Although there might be a case where caching the build directory brings a significant improvement over skipping it. At the moment none has been discovered, but it would be useful to know different opinions before making a final decision.
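
As a rough manual equivalent of that proposal (a sketch using actions/cache; the key layout is purely illustrative), one could cache only the module cache and leave GOCACHE out entirely:

      - name: Get Go environment
        run: echo "modcache=$(go env GOMODCACHE)" >> $GITHUB_ENV

      - name: Cache Go modules only (build cache deliberately skipped)
        uses: actions/cache@v3
        with:
          path: ${{ env.modcache }}
          key: ${{ runner.os }}-gomod-${{ hashFiles('**/go.sum') }}
          restore-keys: |
            ${{ runner.os }}-gomod-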

Adding the cache prefix brings the significant drawback of using a lot of storage for multiple caches and hitting the limit very soon, which can make caching impossible altogether.

Thanks in advance for sharing your thoughts.

@dsame self-assigned this Aug 11, 2023
@andig

andig commented Aug 11, 2023

It seems I can't follow the link. I'd like to read up on build directory caching to understand the proposal better.

@candiduslynx

@dsame

Although there might be a case where caching the build directory brings a significant improvement over skipping it. At the moment none has been discovered, but it would be useful to know different opinions before making a final decision.

We want to be able to cache test results, as in some cases a clean test run takes ~10 minutes, whilst the cached results are available in seconds.

@andig

andig commented Aug 12, 2023

I'm still failing to understand the proposal of "optional build directory caching". The link in #358 (comment) does not work.

We want to be able to cache test results, as in some cases a clean test run takes ~10 minutes

We're also caching:

  • downloaded Go modules
  • linter results (linting is otherwise very slow)

We also have different build steps, like generating assets, updating docs, and checking that the repo is in a "clean" state, that are all parallelised. Some of these steps use Go-based tools but would not necessarily download modules or fully populate the build cache. Using the cache produced by any of these steps for heavier tasks like build or test has ended up with hits on half-populated caches.

@erezrokah

erezrokah commented Aug 12, 2023

multiple build steps in one job instead of multiple jobs

Thanks for sharing @dsame ❤️ Can you maybe share a before/after workflow example with the optional build step so we can understand the proposal better?

Won't that mean that the different steps run serially, as opposed to parallel build jobs? A cache prefix tries to solve using setup-go in parallel build jobs with caching enabled. I'm not sure how putting all steps in a single job can allow running them in parallel.

Please note that at least in our case we also have completely different workflows (not only jobs) that run in parallel, and each workflow needs a distinct cache (for example, one used for testing invoked on pull request triggers, and another used for releasing artifacts invoked on tag push triggers).

Adding the cache prefix brings the significant drawback of using a lot of storage for multiple caches and hitting the limit very soon, which can make caching impossible altogether.

In our case we treat managing cache limits as a distinct problem to solve (especially as we have a monorepo with many components, each one with a different cache), unrelated to the requirement to have separate caches for jobs.

There are official guides (see https://docs.github.com/en/actions/using-workflows/caching-dependencies-to-speed-up-workflows#force-deleting-cache-entries) on how to manage cache limits, and it might be better to leave the decision on how to manage cache limits to users.

Regardless of cache limits and this issue, I would prefer a cache miss over a cache hit on a cache created by an unrelated job, as at least a cache miss produces consistent results.

@dsame
Contributor

dsame commented Aug 14, 2023

@andig complex cases like the one you have described should use actions/cache; see the samples as a starting point

@andig

andig commented Oct 14, 2023

@dsame after looking at the samples I still see this as an ill-advised tradeoff. We require people to think about cache keys, restore keys, paths, hashFiles, etc., instead of providing them the absolutely simple and straightforward capability of separating their caches by prefix, which is a single-line PR.

Moreover, with any change in actions/cache or in how Go dependencies should be cached, we would require everyone to update their manual cache settings once more. The fact that even documenting the advanced cases has been ongoing for six months now indicates that this is not straightforward to get right.

I can only suggest rethinking this decision. I do not object to adding docs for advanced use cases, but it would be super handy if usage for the semi-simple cases could be simplified.

Update

Let me add one example: a4d10f0 added the OS to the cache key. Everyone using actions/cache manually would need to follow this best practice themselves, and it should also go into the docs for the advanced use cases. Supporting the prefix would deliver this to everyone at zero cost.

@andig

andig commented Nov 3, 2023

Friendly ping @dsame, I would appreciate a comment on whether these concerns are valid.

@jsirianni

I think it would be great to have a little more control over the setup-go cache options. I was surprised to see that several of my branches share the same cache, for example.

@macie

macie commented Nov 27, 2023

I have a similar issue to @jsirianni's: different branches share the same cache. When my feature branch started using its first external module, a go.sum file was created and cache misses started to appear on other branches (because they don't have external dependencies or a go.sum).

From my perspective as a setup-go user, the cache integration covers only one case: trunk-based development with sequential (non-parallel) jobs. But that isn't clear from the start.

GitHub encourages working on feature branches. Even in this repo nearly all pull requests come from feature branches. So I share @lucacome's opinion that the current defaults may be unsuitable for far more than 10% of setup-go users.

@relaxdiego

relaxdiego commented Feb 17, 2024

For those that don't want to deal with the unnecessary complexity of actions/cache, you can use this as a workaround:

  job_a:
    steps:
      - uses: actions/setup-go@v5
        with:
          cache-dependency-path: |
            "${{ env.WORKING_DIR }}/go.sum"
            .github/.cache/buster-for-job-a

  job_b:
    steps:
      - uses: actions/setup-go@v5
        with:
          cache-dependency-path: |
            "${{ env.WORKING_DIR }}/go.sum"
            .github/.cache/buster-for-job-b

Notice the additional files in cache-dependency-path named .github/.cache/buster-for-job-a and .github/.cache/buster-for-job-b. Make sure those files have totally different contents. They can contain anything as long as they're different. The effect of this is that each job's cache key will have a unique hash suffix.

Another "feature" that this workaround provides is that you can force a rebuild of the job's cache by modifying its respective "cache buster" file.

@andig

andig commented Apr 5, 2024

Worth mentioning that #416 is still open almost 6 months later. Splitting caches by build remains a complex task.
