Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

fix(workflows): Disable Go cache for non-test jobs in AWS workflow #9656

Closed
wants to merge 3 commits into from

Conversation

candiduslynx
Copy link
Contributor

@candiduslynx candiduslynx commented Apr 4, 2023

Part of #9388

@candiduslynx candiduslynx self-assigned this Apr 4, 2023
@candiduslynx candiduslynx added the automerge Automatically merge once required checks pass label Apr 4, 2023
Copy link
Contributor

@erezrokah erezrokah left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks for the PR @candiduslynx, let's start by scoping this PR to a single change (the caching one).
Since CI changes can have a significant impact, we should make them as small as possible and test them one by one.

I'll try to test this change later on

.github/workflows/wait_for_required_workflows.yml Outdated Show resolved Hide resolved
.github/workflows/source_aws.yml Outdated Show resolved Hide resolved
.github/workflows/source_aws.yml Outdated Show resolved Hide resolved
Copy link
Contributor

@erezrokah erezrokah left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Added a few more comments

.github/workflows/source_aws.yml Outdated Show resolved Hide resolved
.github/workflows/source_aws.yml Outdated Show resolved Hide resolved
@candiduslynx candiduslynx force-pushed the feat/cache/aws branch 2 times, most recently from 48621af to b070e7e Compare April 4, 2023 13:13
Copy link
Contributor

@erezrokah erezrokah left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Added another comment as it seems setup-go should already cache build artifacts so we should probably research why it doesn't work before opting out the default logic

.github/workflows/source_aws.yml Outdated Show resolved Hide resolved
@candiduslynx candiduslynx force-pushed the feat/cache/aws branch 2 times, most recently from cd93e52 to e0ceb30 Compare April 4, 2023 17:09
@candiduslynx candiduslynx force-pushed the feat/cache/aws branch 3 times, most recently from cb8298e to 57b80ec Compare April 4, 2023 17:33
@candiduslynx
Copy link
Contributor Author

@erezrokah I think this might actually work for the AWS test cache, and then we can fix it in other workflows as well

@candiduslynx candiduslynx changed the title feat(workflows): Use caches for AWS source plugin fix(workflows): Disable Go cache for non-test jobs in AWS workflow Apr 4, 2023
Copy link
Contributor

@erezrokah erezrokah left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Added a few more comments. I don't think we should disable the cache for the test policies job.

We should do something like:

      - uses: actions/cache@v3
        with:
          path: |
            ~/.cache/go-build
            ~/go/pkg/mod
          key: ${{ runner.os }}-go-1.19.5-policies-cache-${{ hashFiles('plugins/source/aws/go.sum') }}
          restore-keys: |
            ${{ runner.os }}-go-1.19.5-policies-cache-plugins-source-aws

@@ -112,6 +112,7 @@ jobs:
uses: actions/setup-go@v3
with:
go-version-file: plugins/source/aws/go.mod
cache: false
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
cache: false

This already the default so not needed. We can do it once we update to v4

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think it's better to leave this explicitly turned off as it doesn't affect the result now

@@ -153,8 +154,7 @@ jobs:
uses: actions/setup-go@v3
with:
go-version-file: plugins/source/aws/go.mod
cache: true
cache-dependency-path: plugins/source/aws/go.sum
cache: false
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
cache: false

This already the default so not needed. We can do it once we update to v4

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think it's better to leave this explicitly turned off as it doesn't affect the result now

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

What's the added value of setting a configuration to the default value?

@erezrokah
Copy link
Contributor

Also related actions/setup-go#358

@candiduslynx
Copy link
Contributor Author

candiduslynx commented Apr 5, 2023

Added a few more comments. I don't think we should disable the cache for the test policies job.

I don't think so, too.
However, currently I'd like to skip this in test-policies as it only runs the build.

The proper solution IMO would be to disable setup-go caching and use manual logic for caching:

  • Save only from main after tests and do it regardless of the cache hit (this way we update the cache with the latest test data available)
  • Other jobs (including releases as well) should only perform cache restore

@erezrokah
Copy link
Contributor

The proper solution IMO would be to disable setup-go caching and use manual logic for caching:

  • Save only from main after tests and do it regardless of the cache hit (this way we update the cache with the latest test data available)
  • Other jobs (including releases as well) should only perform cache restore

I think this discussion is out of scope for this PR

However, currently I'd like to skip this in test-policies as it only runs the build.

What the reasoning for not caching test-policies? Won't that make the CI slower as each time we'll do a full build, impacting the performance improvement we're trying to achieve?

@erezrokah
Copy link
Contributor

Also I think once actions/setup-go#358 is resolved we'll revert back to using setup-go caching

@candiduslynx
Copy link
Contributor Author

What the reasoning for not caching test-policies? Won't that make the CI slower as each time we'll do a full build, impacting the performance improvement we're trying to achieve?

We are mostly hit by the testing part, not the build one. So, instead of introducing another cache I'd like to scope caching in setup-go only to the testing job

@erezrokah
Copy link
Contributor

erezrokah commented Apr 5, 2023

It takes ~6m to build AWS without cache:
image

And 8s with cache:
image

The results are from this PR.

Not sure we'd want to give that up

@erezrokah
Copy link
Contributor

We are mostly hit by the testing part, not the build one

I think this is because caching didn't work for tests, and worked for builds

@@ -112,6 +112,7 @@ jobs:
uses: actions/setup-go@v3
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
uses: actions/setup-go@v3
uses: erezrokah/setup-go@feat/add_cache_prefix
with:
go-version-file: plugins/source/aws/go.mod
cache-key-prefix: test-cache-

And for policies we can use cache-key-prefix: policies-cache-.

Waiting for actions/setup-go#366 to be reviewed

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think that we might be getting into another issue with this solution:

  • the cache limit is 10 GiB
  • each Go cache is about 100-300 MiB, let's use 150 MiB in the calculations to be on the conservative side (300 MiB likely comes from the huge plugins like AWS or Azure)
  • that means we can store something like 50 caches to be used effectively before they start being evicted
  • we already have 38 source plugins & 21 destination plugins (59 in total, add CLI to it to have 60)
  • If we don't store a single cache but, instead, use 3 different caches, we effectively will store 150 (test) + 100 + 100 (2 x build & mod) = 350 MiB for a plugin (= 21 GB for all plugins + CLI, so we will be evicting a lot of data)

Looking at this, I suggest storing a single cache for plugin, that will include:

  • mod cache
  • build & test cache

To make most of it, we'll have to store it after the testing (and, better yet, from the main branch for it to be properly reused), and to restore in other builds related to a plugin (or CLI).

Copy link
Contributor

@erezrokah erezrokah Apr 9, 2023

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I suggest we first fix the bug in our CI, and do any cache optimization later on. Once we fix the bug we should be able so see if we're hitting any cache limits issues, the should be able to do benchmarks and see if that can be improved

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

For the cache limits we can take a look here: https://github.com/cloudquery/cloudquery/actions/caches

I'll be updating the workflow files for the heavy sources (aws, azure, gcp, k8s) to work with go cache properly, we can discuss the different approach separately.

Copy link
Contributor

@erezrokah erezrokah Apr 9, 2023

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

we already have 38 source plugins & 21 destination plugins (59 in total, add CLI to it to have 60)

We only need this fix for 4 plugins, the ones that have policy tests

@candiduslynx
Copy link
Contributor Author

Drop this impl to start anew

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
automerge Automatically merge once required checks pass
Projects
Archived in project
Development

Successfully merging this pull request may close these issues.

None yet

2 participants