Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Fix assignment of quota project IDs #2129

Merged
merged 2 commits into from
Oct 11, 2024

Conversation

mheon
Copy link
Member

@mheon mheon commented Oct 10, 2024

This PR reverts #1921 which completely removed the PROJINHERIT flag from our quota support. That did not actually fix the problem, it just papered over a few of the cracks. The actual problem is that configuring quota support on the system using our recommended commands sets the PROJINHERIT flag on the top-level directory in some (but not all, for unclear reasons?) cases. We set a project ID on the base directory to give us a base value for subsequent ID assignments, but if the base directory has PROJINHERIT that base ID prevents us from setting new IDs on any subdirectories. Strip the flag from the top-level directory if present to resolve.

There is a second problem, in that Podman is choosing the wrong directory to assign quotas to, but that must be fixed in Podman after this merges.

Fixes https://issues.redhat.com/browse/RHEL-18038

mheon added 2 commits October 10, 2024 09:32
This reverts commit f4c8d96.

We do actually require the PROJINHERIT flag for proper operation.
The trick is that we have to remove it on the top-level directory
(which requires having the flag defined). Revert this commit as
such.

Signed-off-by: Matt Heon <mheon@redhat.com>
Basically, PROJINHERIT causes all lower-level directories to get
the same project ID. This is a good thing for the directories
that are supposed to have quotas. It is not a nice thing for the
top-level directory. We set a project ID on that directory so we
know what the base ID is for our subdirectories to use, but we do
not want that ID propagated to subdirectories else everything
will end up using that single quota ID. Stripping the flag from
the top-level directory (if it is present) resolves this.

Partial fix for https://issues.redhat.com/browse/RHEL-18038

Full fix also requires Podman changes as we were setting quotas
on the incorrect directory.

Signed-off-by: Matt Heon <mheon@redhat.com>
@rhatdan
Copy link
Member

rhatdan commented Oct 10, 2024

/approved
LGTM
@giuseppe @nalind @mtrmac @saschagrunert @flouthoc PTAL

Copy link
Collaborator

@mtrmac mtrmac left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM, at least this root cause explanation is consistent with my earlier research in #1921 (comment) .

I’d prefer failing on unexpected setups instead of stripProjectInherit: we either have a bug somewhere else that should be detected and fixed, or we have Podman users with some setups/expectations we are not planning for, and we should find out what they are and discuss how to deal with them.

But this PR seems to be an improvement over the current state either way.

@mheon
Copy link
Member Author

mheon commented Oct 10, 2024

In theory this could be solved by dropping stripProjectInherit and improving setup documentation for quotas. I'm pretty sure that the flag is being set by the XFS quota commands that we recommend running to enable quotas for c/storage (though interestingly, it's not universal, they are sometimes not applied despite running the same setup commands). As such, we could add to the manpages that we recommend a manual chattr -P on the directories in question before using Podman to manually remove the flag and ensure correct operation. That doesn't solve us if something periodically re-adds the flag but I have thus far seen no evidence this is true.

I went with the automatic approach because we do own the directories this is being run on, and we can conclusively say that having project inheritance enabled completely breaks our ability to do quotas. Perhaps we could change this to an error instead of stripping the quota, recommend chattr -P in the error, and also alter the manpages to recommend that so we don't have people immediately get errors after setup?

@mtrmac
Copy link
Collaborator

mtrmac commented Oct 10, 2024

Yes, I was thinking to report an error, and force the user to investigate. But that has the obvious downsides — on balance I’m fine with what this PR is doing.

Copy link
Member

@giuseppe giuseppe left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM

Copy link
Contributor

openshift-ci bot commented Oct 11, 2024

[APPROVALNOTIFIER] This PR is APPROVED

This pull-request has been approved by: giuseppe, mheon

The full list of commands accepted by this bot can be found here.

The pull request process is described here

Needs approval from an approver in each of these files:

Approvers can indicate their approval by writing /approve in a comment
Approvers can cancel approval by writing /approve cancel in a comment

@mtrmac
Copy link
Collaborator

mtrmac commented Oct 11, 2024

/lgtm

for the tools

@openshift-ci openshift-ci bot added the lgtm label Oct 11, 2024
@openshift-merge-bot openshift-merge-bot bot merged commit 9f9b949 into containers:main Oct 11, 2024
19 checks passed
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

Successfully merging this pull request may close these issues.

None yet

4 participants