Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

JvmGcMetricsTest#sizeMetricsNotSetToZero is failing on CI with generational ZGC #4497

Closed
shakuzen opened this issue Dec 19, 2023 · 4 comments
Labels
bug A general bug
Milestone

Comments

@shakuzen
Copy link
Member

shakuzen commented Dec 19, 2023

See for example https://ge.micrometer.io/s/bcf466z2dbkke
When I tried to reproduce this locally, I could not: https://ge.micrometer.io/s/pzp56wcywsmrg/

Failing command: ./gradlew zgcGenerationalTest

> Task :micrometer-core:zgcGenerationalTest

JvmGcMetricsTest > sizeMetricsNotSetToZero() FAILED
    java.lang.AssertionError: 
    Expecting actual:
      0.0
    to be greater than:
      0.0
        at io.micrometer.core.instrument.binder.jvm.JvmGcMetricsTest.sizeMetricsNotSetToZero(JvmGcMetricsTest.java:173)

6 tests completed, 1 failed, 1 skipped
@shakuzen
Copy link
Member Author

@shakuzen
Copy link
Member Author

shakuzen commented Jan 15, 2024

This continues to be flaky, but I still do not understand the conditions in which it is happening.

Edit: latest failure https://app.circleci.com/pipelines/github/micrometer-metrics/micrometer/5583/workflows/b23e3578-1e22-4f59-8907-cfa7df952150/jobs/28483

@shakuzen shakuzen reopened this Jan 15, 2024
@jonatan-ivanov jonatan-ivanov added the bug A general bug label Jan 18, 2024
@jonatan-ivanov jonatan-ivanov added this to the 1.9.18 milestone Jan 18, 2024
@jonatan-ivanov
Copy link
Member

I was able to repro this locally by killing the Gradle daemons (see later) and enable gc logs for the test task.

I added some "syso debugging" and the interesting part is that the value we get from the notification, set to the state object and verify in the test is almost always zero.
There is one case when it is non-zero:

name: ZGC Major Cycles	
cause: System.gc()	
action: end of GC cycle	
before: 0	
after (set): 54525952	
after < before: false	
shouldUpdateDataSizeMetrics: true	

but then another notification comes after it that overwrites this value:

name: ZGC Major Cycles	
cause: Warmup	
action: end of GC cycle	
before: 0	
after (set): 0	
after < before: false	
shouldUpdateDataSizeMetrics: true

I'm not sure what "Warmup" exactly means for ZGC but I think if the live data after GC size in the GC notification is zero, we should ignore these notifications and not update the state object of our Gauge. See in 8fe8842.

Repro steps

Modify the zgcGenerationalTest task in gradle and enable GC logs and preserve stdout:

task zgcGenerationalTest(type: Test) {
    // set heap size for the test JVM(s)
    maxHeapSize = "1500m"

    useJUnitPlatform {
        includeTags 'gc'
    }

    jvmArgs '-XX:+UseZGC', '-XX:+ZGenerational', '-Xlog:gc'

    testLogging {
        outputs.upToDateWhen { false }
        showStandardStreams = true
    }
}

Then kill the Gradle daemons since it might happen that if the JVM is reused between test executions, ZGC is already "warm" so it will not emit those GC notifications that might cause problems.
After that run the single test:

./gradlew clean zgcGenerationalTest --tests '*JvmGcMetricsTest.sizeMetricsNotSetToZero'  --rerun-tasks --no-build-cache

@jonatan-ivanov
Copy link
Member

Reopening to narrow down the fix for only ZGC warmup notifications.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
bug A general bug
Projects
None yet
Development

No branches or pull requests

2 participants