Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

update CI #4839

Merged
merged 2 commits into from
Feb 15, 2024
Merged

update CI #4839

merged 2 commits into from
Feb 15, 2024

Conversation

krissetto
Copy link
Contributor

@krissetto krissetto commented Jan 31, 2024

- What I did

  • Updated gha runners to ubuntu-22.04
  • Updated e2e test matrix to test against 23-dind, 24-dind and 25-dind versions of the engine
  • Small adjustments to some tests because of the engine upgrade (engine output text + cgroups checking)
  • Temporarily skipping a problematic test (until a fix on moby is released)

- How I did it
magic

- How to verify it
CI tests are green

- A picture of a cute animal (not mandatory but encouraged)
TBD

@krissetto krissetto force-pushed the upgrade-ci branch 3 times, most recently from c8c1e9a to 6c10053 Compare February 1, 2024 10:16
@codecov-commenter
Copy link

codecov-commenter commented Feb 1, 2024

Codecov Report

Merging #4839 (b262bba) into master (8bae662) will not change coverage.
Report is 7 commits behind head on master.
The diff coverage is n/a.

Additional details and impacted files
@@           Coverage Diff           @@
##           master    #4839   +/-   ##
=======================================
  Coverage   61.32%   61.32%           
=======================================
  Files         287      287           
  Lines       20058    20058           
=======================================
  Hits        12300    12300           
  Misses       6866     6866           
  Partials      892      892           

@krissetto
Copy link
Contributor Author

Commits will be squashed before merging

e2e/plugin/trust_test.go Outdated Show resolved Hide resolved
@krissetto krissetto force-pushed the upgrade-ci branch 2 times, most recently from 60f8559 to bd7cf60 Compare February 1, 2024 15:13
@laurazard
Copy link
Member

(very WIP) PR for the skipped plugin test: moby/moby#47299

@krissetto krissetto marked this pull request as ready for review February 1, 2024 17:18
@krissetto
Copy link
Contributor Author

if merged we should be able to close this as well: #4793

@krissetto krissetto requested a review from a team February 2, 2024 10:33
Copy link
Member

@thaJeztah thaJeztah left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks for working on this! Happy to see that we can make e2e work with updated versions ❤️ (looooong overdue).

Left some comments/suggestions (sorry for the wall of text, hope it helps!)

Also help me remind to create tracking tickets for some of them (at least our plans for integration-cli in moby/moby 😅

e2e/testdata/Dockerfile.connhelper-ssh Show resolved Hide resolved
e2e/testdata/Dockerfile.connhelper-ssh Outdated Show resolved Hide resolved
@@ -21,6 +21,10 @@ import (
const registryPrefix = "registry:5000"

func TestInstallWithContentTrust(t *testing.T) {
// FIXME(krissetto): apparently broken since containerd pull integration (~moby 20).
// remove this skip once the fix is implemented on moby/moby
t.Skip("skipping until a fix is implemented on moby")
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm sure we remember, but where possible, it's good to create a tracking ticket for these, and to add a link.

I've had quite some cases where we added a skip, fixed the issue, but forgot to un-skip the tests. Having a link would help (future self) finding the status of the issue related (oh! it's fixed; we can re-enable!).

In this case, it looks like a fix is (almost) merged;

That fix is not yet in a release, but from that link, it'd be easier to see if it was included in a release (and in which one), so when it's time to "un-skip" the test.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This fix was backported to 23.0, 24.0 and 25.0, so we can probably un-skip this

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@thaJeztah that one test still crashes the engine, do the dind images need to be manually recreated after these backports?

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Do you know what version(s) it crashed on? We stopped building packages for 23.0, so for that one, it's probably expected, but we did new releases for 24.0 and 25.0

edit: oh, looks like it's only in the 25.0 release. For 24.0, we merged the fix, but didn't ship (or not yet) moby/moby#47324

I guess we could un-skip for 25.0, but also fine doing in a follow-up if needed.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

If this PR gets merged first, i'll steal the skip test helper and use it to target moby >= 25 (so skipping APIs older than 1.44 iirc)

Comment on lines 51 to 53
// TODO(krissetto): ugly, remove when no longer testing against moby 24. see https://github.com/moby/moby/pull/46270
8: func(s string) error {
err := output.Contains("Removed intermediate container")(s) // moby >= v25
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks! I hate it 😂 but think it's a perfectly acceptable solution for now.

Some random thoughts here;

  • This test is for the classic builder. It's ugly, but yeah, we still have to support the classic builder (and it's the only builder supported by the Docker CLI itself so we must have at least some tests)
  • There's still quite some tests in moby/moby's integration-cli test-suite (docker_cli_build_unix_test.go, docker_cli_build_test.go). We need to evaluate those tests, and:
    • In moby/moby: split/move relevant tests to integration/ so that we test the API in moby/moby
    • BUT! There's good value in having an e2e test that includes the CLI (test build using an actual CLI, not just (theoretically) "this API works!")
    • ☝️ but those tests depend on CLI output (which can change), which makes it bad to have them in moby/moby (this is why we pinned the CLI version there to a fixed version)
    • ☝️ so, we should take relevant parts of those tests, and move them over to the CLI repository. In that case, we can adjust the tests if CLI output (etc) changes
    • ☝️ and (ideally) have a way to run those tests against (nightly) builds of Docker Engine
    • ☝️ and (ideally) have a test-matrix to run those tests for multiple versions of the CLI ("did the engine break things for existing (older) versions of the CLI?")

More specific to this test;

  • Technically the progress output should not be considered a stable "API" (so ideally, the test would not break on subtle changes like this), but I don't have a good answer for that one
  • Perhaps we could look if gotest.tools' golden package would work for this; this would allow us to (re)generate output for multiple versions of the engine (and CLI if it changes).
  • ☝️ we need to check though if the golden package is flexible enough (e.g. account for timestamps and such)
  • ☝️ I should include that gotest.tools was created by former colleagues (the primary use-case at the time: to be used in docker's tests). We have a good relation with the gotest.tools maintainers, and the project accepts contributions, and is open to suggestions and improvements (within scope of the project of course: one of the project's goals is to keep things "simple" / "minimal").

Comment on lines 31 to 32
- 24-dind
- 25-dind
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

See my other comment as well w.r.t. changing the TEST_ENGINE_VERSION / ENGINE_VERSION to be version only.

With those taken into account, I would prefer using 24.0 and 25.0 for these; they're easier to grep / search for when updating versions. If we do, we should probably add quotes (to prevent YAML doing magic? Not sure);

Suggested change
- 24-dind
- 25-dind
- "24.0"
- "25.0"

Given that the CLI currently (is meant to) support API versions older than 24.0 (and Mirantis currently maintaining 23.0 as an LTS version), perhaps we should add an old version to the matrix as well. We could either keep an "oldest version (never updates unless we remove API support)", "latest", and "latest - 1" (e.g. 19.03, 24.0, 25.0), or also include known "lts" versions in it; (e.g. 19.03, 23.0, 24.0, 25.0), which would change to 19.03, 25.0, 26.0 when we have 26.0 released.

Open to suggestions w.r.t. the versions to include though (depends a bit on how long tests take to run, but we could consider a separate (scheduled) job, or a job that only runs on master if we don't want it for every pull request.

Copy link
Contributor Author

@krissetto krissetto Feb 8, 2024

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Added 23-24-25 as engine versions to test against.

19 needs some extra cgroupv2 testing after updating to ubuntu 22.04 runners

@thaJeztah
Copy link
Member

Oh! Forgot; and GitHub still doesn't allow reviewing commit messages;

Screenshot 2024-02-03 at 13 12 01

Could you:

  • Update your sign-off (and probably git-config) to use your real name, not GitHub handle (see "sign your work") Signed-off-by: Christopher Petito <......>
  • Make the first line of the commit message to be < 72 chars (if possible); try to make it 50 - 72 characters.
  • In the commit body (empty line after first line, rest is considered "body", describe the other changes (could be a bullet-list), 72/80 chars line-length where possible.

Having the short title as first line makes sure some tooling/output doesn't break (GitHub itself also truncates after 72-chars, but other tooling may do similar).

And having a short summary in the body can sometimes help find back changes; for this PR possibly we could break up things into logical commits (but I haven't verified);

  • commit to fix the test for different versions of the engine
  • commit to update the engine versions
  • commit to update the Ubuntu version in GHA (I think this one depends on engine versions being updated for cgroups v2?)

@thaJeztah
Copy link
Member

I need to update the repository settings before CI goes green, because the version of docker changed, and that's what's included in the "required checks" config (SIGH!)

Screenshot 2024-02-03 at 18 12 47

e2e/compose-env.yaml Outdated Show resolved Hide resolved
Copy link
Member

@thaJeztah thaJeztah left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

overall LGTM, thanks!

There's some comments I left after looking again; they may not be strictly blockers, but let me know if you want to work on those;

  • The commit contains various changes, and not all of them are "equal"; would be nice to split (e.g.) the test-fixes to a separate commit. In this case it's probably "fine", but it's useful to have test-fixes separate (i.e., changed the test, and it still works before updating Ubuntu, and it continues to work after updating Ubuntu). In some of those cases, I also open a separate PR to verify those changes "in isolation", and stack the "update CI" changes on top (I usually rebase those once the first PR is merged).
  • I left a comment for one of the tests; if I understand the change correctly, it's to account for cgroups v1 vs v2. While "v2" is the default on current distros, v1 is not entirely gone (yet), so it would be nice if the test worked on both.
  • I left a comment on the t.Skip - if it still fails even with the latest 25.0.3 engine, something is wrong with the fix. Ideally we would "un-skip" it for 25.0 though (we could use API version for determining if it's v25.0 or older), so that we have the coverage again, but let me know if that's complicated.

docker.Makefile Outdated
@@ -20,7 +20,7 @@ ifeq ($(DOCKER_CLI_GO_BUILD_CACHE),y)
DOCKER_CLI_MOUNTS += -v "$(CACHE_VOLUME_NAME):/root/.cache/go-build"
endif
VERSION = $(shell cat VERSION)
ENVVARS = -e VERSION=$(VERSION) -e GITCOMMIT -e PLATFORM -e TESTFLAGS -e TESTDIRS -e GOOS -e GOARCH -e GOARM -e TEST_ENGINE_VERSION=$(E2E_ENGINE_VERSION)
ENVVARS = -e VERSION=$(VERSION) -e GITCOMMIT -e PLATFORM -e TESTFLAGS -e TESTDIRS -e GOOS -e GOARCH -e GOARM -e ENGINE_VERSION=$(E2E_ENGINE_VERSION)
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

For a follow-up we should consider renaming the E2E_ENGINE_VERSION. If we do, we don't have to assign the value, and instead just set -e ENGINE_VERSION everywhere (which means it's "not set" if no ENGINE_VERSION env-var is set, but if it is, it inherits its value. That's similar to what we did for the TESTFLAGS and other env-vars here.

include:
- target: non-experimental
engine-version: 19.03-dind
# - 19.03 needs a look, doesn't work ubuntu 22.04 (cgroupv2 errors)
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Ah, fun. Yeah, didn't think of that, so we can run "either" old engine (plus old ubuntu) or "new" engines.

Yeah, not a show-stopper for now; we could of course put both "ubuntu" and engine-version in the matrix, but let's keep that for a future exercise.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

iirc we cannot put runner versions in a matrix, it's not supported by gha, so we'd likely have to create a separate job altogether to have both the ubuntu versions tested

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Ah, yeah, let's leave that for a future exercise. Don't think we need to make things too complicated for now. We could add a (todo?) comment to outline why, but not a blocker otherwise.

Comment on lines 32 to 34
- 23.0 # mirantis lts
- 24.0
- 25.0
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Just a nit; perhaps we should add comments to these to outline intent, but that's fine for a follow-up; we should probably also include some of this in a README, so it's fine to do it then.

          - 23.0 # mirantis lts
          - 24.0 # latest - 1
          - 25.0 # latest

Comment on lines 6 to 10
engine:
image: 'docker:${TEST_ENGINE_VERSION:-stable-dind}'
image: 'docker:${ENGINE_VERSION:-25.0}-dind'
privileged: true
command: ['--insecure-registry=registry:5000']
environment:
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Also for a follow-up;

As we're already building some images here, we could consider adding a minimal Dockerfile for this one as well that sets up things (FROM docker:{$ENGINE_VERSION}-dind etc etc).

@@ -21,6 +21,10 @@ import (
const registryPrefix = "registry:5000"

func TestInstallWithContentTrust(t *testing.T) {
// FIXME(krissetto): apparently broken since containerd pull integration (~moby 20).
// remove this skip once the fix is implemented on moby/moby
t.Skip("skipping until a fix is implemented on moby")
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Do you know what version(s) it crashed on? We stopped building packages for 23.0, so for that one, it's probably expected, but we did new releases for 24.0 and 25.0

edit: oh, looks like it's only in the 25.0 release. For 24.0, we merged the fix, but didn't ship (or not yet) moby/moby#47324

I guess we could un-skip for 25.0, but also fine doing in a follow-up if needed.

Comment on lines 148 to 149
result := icmd.RunCommand("docker", "run", "--cgroupns=private", "--rm", fixtures.AlpineImage,
"/bin/grep", "-q", "':memory:/$'", "/proc/1/cgroup")
"cat", "/sys/fs/cgroup/cgroup.controllers")
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Looks like this change was to adapt the test for ubuntu 22.04, which uses cgroups v2. I wonder if we can make this conditional based on whether the daemon's host uses cgroups v2 or v1. If we can, it means we can run these tests on different situations (which can also include, e.g. CentOS or RHEL, which may be on older kernels and/or cgroups v1).

docker info should provide information about cgroups being used (either using the CLI, or through the API);

Server:
 ...
 Server Version: 25.0
 Storage Driver: overlayfs
  driver-type: io.containerd.snapshotter.v1
 Logging Driver: json-file
 Cgroup Driver: cgroupfs
 Cgroup Version: 2

Information is in the CgroupDriver and CgroupVersion fields in the response. Which can also be taken using the CLI with a --format;

docker info --format={{.CgroupDriver}}
cgroupfs

docker info --format={{.CgroupVersion}}
2

Copy link
Contributor Author

@krissetto krissetto Feb 9, 2024

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actually it doesn't seem to be specific to cgroupv2, but just a different way to verify cgroup functionality is available.

I tested in this draft pr (https://github.com/docker/cli/pull/4867/files) splitting out only the tests as you suggested in the other comments. Tests are green on the older ubuntu (20.04) as well (makes me wonder, is cgroup v1/v2 really the issue here?)

edit: wondering because searching on moby for similar usages of this check they all seem to refer to cgroups v2
https://github.com/search?q=repo%3Amoby%2Fmoby%20%2Fsys%2Ffs%2Fcgroup%2Fcgroup.controllers&type=code

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Hm... good one. Perhaps I assumed incorrectly that it was for cgroups v2 (I recalled there were some changes in this area from v1 -> v2, but perhaps I'm wrong and this wasn't one of them 😅) 🤔

@thaJeztah
Copy link
Member

Looks like you pulled in a merge-commit; can you try doing a rebase?

@krissetto
Copy link
Contributor Author

Yeah I noticed 😓, i accidentally pressed the sync branch button on github but i was doing the rebase locally

@thaJeztah
Copy link
Member

- gha runners updated to ubuntu 22.04
- e2e now runs against moby 23.0, 24.0 and 25.0
- temporarily skip broken test for moby < 25

Signed-off-by: Christopher Petito <chrisjpetito@gmail.com>
Signed-off-by: Christopher Petito <chrisjpetito@gmail.com>
@krissetto
Copy link
Contributor Author

@thaJeztah we'll have to remember to change these checks as well. Shall we use the lts+latest as the new e2e tests to check for?

I need to update the repository settings before CI goes green, because the version of docker changed, and that's what's included in the "required checks" config (SIGH!)

Screenshot 2024-02-03 at 18 12 47

Copy link
Member

@thaJeztah thaJeztah left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM, thanks!

(🙈 sorry, thought I already LGTM'd 😂)

@thaJeztah thaJeztah merged commit 21d96ff into docker:master Feb 15, 2024
88 checks passed
@thaJeztah
Copy link
Member

I removed the -stable and -19.03 from the "required" checks. We should re-add required checks again at some point, but it's likely there's PR's that still ran the previous matrix, and I don't think restarting CI re-reads the actions from master 😞

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

Successfully merging this pull request may close these issues.

None yet

6 participants