Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

[LibOS] VMA bookkeep calls Pal mprotect per VMA #1824

Open
wants to merge 1 commit into
base: master
Choose a base branch
from

Conversation

llly
Copy link
Contributor

@llly llly commented Mar 21, 2024

Description of the changes

VMA bookkeeping records prot of each VMA. However, mprotect syscall always calls Pal mprotect no matter prot changed or not, because prot recorded in VMA bookkeeping is not used during the syscall.

This PR updates VMA bookkeep mprotect to walk through VMAs and only call Pal mprotect when prot is not same as current prot in corresponding VMAs, and also fixes prot mismatch exceptions in libos_thread.c.

Partial fixes #1708. In future PR, Pal mprotect will take a new parameter old_prot to extend/retstrict protection accordingly.

How to test this PR?

existing tests since all changes are internal.


This change is Reviewable

VMA bookkeeping records prot of each VMA. However `mprotect` syscall always
calls Pal `mprotect` no matter prot changed or not.
This commit updates VMA bookkeep `mprotect` to walk through VMAs and only call
Pal `mprotect` when prot changed, and also fixes prot mismatch exceptions.

Signed-off-by: Li, Xun <xun.li@intel.com>

Signed-off-by: xiaonan-INTC <xun.li@intel.com>
Copy link
Member

@mkow mkow left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Reviewable status: 0 of 4 files reviewed, 1 unresolved discussion, not enough approvals from maintainers (3 more required), not enough approvals from different teams (2 more required, approved so far: ) (waiting on @llly)


libos/src/bookkeep/libos_vma.c line 1291 at r1 (raw file):

    int ret = PalVirtualMemoryProtect((void*)vma->begin, vma->end - vma->begin,
                                      LINUX_PROT_TO_PAL(ctx->prot, /*map_flags=*/0));

I think this breaks the whole design of the VMA bookkeeping module. All other bkeep_* functions do only bookkeeping and are supposed to be called before/after the caller did an appropriate PAL API call. Now this single function would work the opposite way.

Copy link
Contributor Author

@llly llly left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Reviewable status: 0 of 4 files reviewed, 1 unresolved discussion, not enough approvals from maintainers (3 more required), not enough approvals from different teams (2 more required, approved so far: ) (waiting on @mkow)


libos/src/bookkeep/libos_vma.c line 1291 at r1 (raw file):

Previously, mkow (Michał Kowalczyk) wrote…

I think this breaks the whole design of the VMA bookkeeping module. All other bkeep_* functions do only bookkeeping and are supposed to be called before/after the caller did an appropriate PAL API call. Now this single function would work the opposite way.

The libos_syscall_mprotect logic can be reworked like

while(bkeep_end < end)
    bkeep_mprotect_one_vma(&bkeep_end, &pal_begin, &pal_end, &old_prot)
    PalVirtualMemoryProtect(pal_begin, pal_end)

In other functions, we only mprotect one VMA and don't need the loop.

Copy link
Contributor

@dimakuv dimakuv left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Reviewable status: 0 of 4 files reviewed, 2 unresolved discussions, not enough approvals from maintainers (2 more required), not enough approvals from different teams (2 more required, approved so far: ) (waiting on @kailun-qin and @llly)

a discussion (no related file):

In future PR, Pal mprotect will take a new parameter old_prot to extend/retstrict protection accordingly.

Wouldn't it make sense to do smth like this:

  1. Introduce a new field in LibOS VMA data structure that reflects current-environment protections.
  2. Update this field accordingly in bkeep* functions (I guess only in bkeep_mprotect() this field will be different from prot field).
  3. Use a new Pal-to-LibOS callback introduced by @kailun-qin in another EDMM PR, to also provide current-environment permissions.
  4. PAL will call the callback on each VMA, and will see if it needs to apply protections or apply only extend/restrict part.
    • This also has the benefit that Linux PAL may not care about current-environment restrictions. Whereas Linux-SGX PAL will have this special new logic.


libos/src/bookkeep/libos_vma.c line 1291 at r1 (raw file):

Previously, llly (Li Xun) wrote…

The libos_syscall_mprotect logic can be reworked like

while(bkeep_end < end)
    bkeep_mprotect_one_vma(&bkeep_end, &pal_begin, &pal_end, &old_prot)
    PalVirtualMemoryProtect(pal_begin, pal_end)

In other functions, we only mprotect one VMA and don't need the loop.

+1 to @mkow.

The bkeep_* functions are for internal LibOS bookkeeping purposes, and Pal* functions are for external "take-effect" actions.

We do not mix these functions for the main reason: to allow better error handling. Basically, the current VMA subsystem strives to first do all the bookkeeping (first all bkeep_* calls). If some step in the bookkeeping breaks, then LibOS can easily unwind the changes (because they are all internal).

Then the VMA subsystem calls Pal* functions to reflect the internal state of LibOS memory bookkeeping in the Gramine environment (host Linux in case of gramine-direct, SGX enclave in case of gramine-sgx). At the point of first Pal* call, the subsystem basically says "Ok, after that point unwinding the changes will be super-hard or impossible".


I dislike the current rework of the VMA code. It's very different, and it's hard to review.

Why not keep the current implementation of bkeep_mprotect() as is? This will split (if needed) the VMAs, and then we can apply a visitor pattern in libos_syscall_mprotect(), and call corresponding PalVirtualMemoryProtect().

I guess the main problem with the above approach is that at the point of PalVirtualMemoryProtect(), we won't know the previous (= currently applied in the environment) protections? I think we had similar discussions with @kailun-qin during his EDMM work, and we had ideas like keeping the "currently applied" protections in some data structure (e.g. in PAL). Maybe @kailun-qin will suggest something.

Copy link
Contributor Author

@llly llly left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Reviewable status: 0 of 4 files reviewed, 2 unresolved discussions, not enough approvals from maintainers (2 more required), not enough approvals from different teams (2 more required, approved so far: ) (waiting on @kailun-qin)

a discussion (no related file):

Previously, dimakuv (Dmitrii Kuvaiskii) wrote…

In future PR, Pal mprotect will take a new parameter old_prot to extend/retstrict protection accordingly.

Wouldn't it make sense to do smth like this:

  1. Introduce a new field in LibOS VMA data structure that reflects current-environment protections.
  2. Update this field accordingly in bkeep* functions (I guess only in bkeep_mprotect() this field will be different from prot field).
  3. Use a new Pal-to-LibOS callback introduced by @kailun-qin in another EDMM PR, to also provide current-environment permissions.
  4. PAL will call the callback on each VMA, and will see if it needs to apply protections or apply only extend/restrict part.
    • This also has the benefit that Linux PAL may not care about current-environment restrictions. Whereas Linux-SGX PAL will have this special new logic.

I'll rework this PR to use callbacks from PAL to LibOS. Since we expect it base on Kailun's EDMM lazy alloc PR, I'll update after Kailun's PR merged.


Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
3 participants