Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Fix: ScrapeConfigs Selection Issue Across Different Namespaces #6390

Merged
merged 10 commits into from Mar 25, 2024

Conversation

mviswanathsai
Copy link
Contributor

@mviswanathsai mviswanathsai commented Mar 13, 2024

Description

The ScrapeConfigs added in a different namespace from the Prometheus CR that selects them are not being picked up. We suspect it might be because of how the enqueueForNamespace() function works. This PR aims to prove that this is indeed the issue and consequentially also fix it.

Closes #6363

Type of change

What type of changes does your code introduce to the Prometheus operator? Put an x in the box that apply.

  • CHANGE (fix or feature that would cause existing functionality to not work as expected)
  • FEATURE (non-breaking change which adds functionality)
  • BUGFIX (non-breaking change which fixes an issue)
  • ENHANCEMENT (non-breaking change which improves existing functionality)
  • NONE (if none of the other choices apply. Example, tooling, build system, CI, docs, etc.)

Verification

Tests added to main_test.go

Changelog entry

Please put a one-line changelog entry below. This will be copied to the changelog file during the release process.

Fix ScrapeConfig not picked up bug

@mviswanathsai
Copy link
Contributor Author

Requesting review @simonpasquier @slashpai @ArthurSens. The test has been added and it does indeed fail, though I am still not 100% sure of the test. Please do look into it and let me know!

Copy link
Contributor

@simonpasquier simonpasquier left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Good start!

test/e2e/scrapeconfig_test.go Show resolved Hide resolved
test/e2e/scrapeconfig_test.go Show resolved Hide resolved
@pull-request-size pull-request-size bot added size/L and removed size/M labels Mar 14, 2024
@mviswanathsai
Copy link
Contributor Author

mviswanathsai commented Mar 14, 2024

I am still not sure about the test. I ran

make image

for n in "prometheus-operator" "prometheus-config-reloader" 
"admission-webhook"; do kind load docker-image 
"quay.io/prometheus-operator/$n:$(git rev-parse --short HEAD)"; done;

as suggested by JP, but the e2e test still failed. I am not able to zero down on whether it is the fix is insufficient or if it might be something with where the governing svc is created or something else about the test itself. So I have made the push to the branch as suggested by Simon to see what happens of the tests.

_, err := framework.CreateOrUpdatePrometheusOperator(
context.Background(),
promns,
[]string{promns},
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I believe that the operator doesn't watch for the scrape config namespace.

Suggested change
[]string{promns},
[]string{scns},

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I suspected that too, but this too does not yield the expected result.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

However, I just changed a few things now (including this suggestion), and the test is passing.
Let me quickly remove/add a few parameters and see what made the tests fail earlier.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Findings:

  1. The tests failed after letting Operator check the scns because I was looking for targets in scns in my local code.
  2. The test failed when I set the label selector to nil value.
  3. Setting both the right label selectors for labels and namespace passed the test.

@mviswanathsai
Copy link
Contributor Author

mviswanathsai commented Mar 20, 2024

I am not sure why this test is failing (PromRulesExceedingConfigMapLimit). It seems to be passing locally.

        --- PASS: TestAllNS/y/PromRulesExceedingConfigMapLimit (92.24s)

@mviswanathsai
Copy link
Contributor Author

Similarly, the intended test also passes:

    --- PASS: TestPromInstanceNs/ScrapeConfigLifecycleInDifferentNs (493.00s)

@mviswanathsai mviswanathsai changed the title Fix: Fixing ScrapeConfigs Selection Issue Across Different Namespaces Fix: ScrapeConfigs Selection Issue Across Different Namespaces Mar 20, 2024
@mviswanathsai
Copy link
Contributor Author

@simonpasquier @slashpai, All the tests are passing locally. I cannot see why it is failing in the CI.

@slashpai
Copy link
Contributor

Likely flaky trying to re-run this, will see if it still fails

@mviswanathsai mviswanathsai marked this pull request as ready for review March 25, 2024 11:00
@mviswanathsai mviswanathsai requested a review from a team as a code owner March 25, 2024 11:00
Copy link
Contributor

@simonpasquier simonpasquier left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks!

@simonpasquier simonpasquier merged commit 58c91f7 into prometheus-operator:main Mar 25, 2024
17 checks passed
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
Projects
None yet
Development

Successfully merging this pull request may close these issues.

ScrapeConfig not applied when deployed via CRD
3 participants