stress-ng fstat test fails with BUG reached PID limit when allocating a new ID, not safe to proceed for RHEL-8 #1759

Open
anjalirai-intel opened this issue Feb 6, 2024 · 3 comments


@anjalirai-intel
Contributor

Description of the problem

stress-ng tests are designed to stress the hardware to detect/identify potential issues.

The stress-ng fstat test fails specifically on RHEL-8 under gramine-direct with "BUG reached PID limit when allocating a new ID, not safe to proceed". It works on all other distros supported by Gramine.

Steps to reproduce

  1. Install stress-ng
  2. Use the Makefile and manifest template from the attached zip (logs are included as well)
  3. gramine-direct stress-ng --fstat 0 --timeout 60s -v

stress-ng fstat.zip

Expected results

gramine-direct stress-ng --fstat 0 --timeout 60s -v

stress-ng: info:  [1] setting to a 60 second run per stressor
stress-ng: info:  [1] dispatching hogs: 144 fstat
stress-ng: info:  [1] successful run completed in 60.26s

Actual results

[P1:libos] error: reached PID limit when allocating a new ID, not safe to proceed
[P1:libos] error: BUG() ../libos/src/ipc/libos_ipc_pid.c:72
[P1:libos] error: Illegal instruction during Gramine internal execution at 0x7fffffd19690 (0x7fffffd19690, VMID = 1, TID = 0)
[P134:T782438:stress-ng] error: Could not allocate a tid!
[P128:T572416:stress-ng] error: Could not allocate a tid!
[P133:T782437:stress-ng] error: Could not allocate a tid!
[P126:T572414:stress-ng] error: Could not allocate a tid!
[P114:T572402:stress-ng] error: Could not allocate a tid!
[P86:T309814:stress-ng] error: Could not allocate a tid!
[P72:T309800:stress-ng] error: Could not allocate a tid!
[P71:T309799:stress-ng] error: Could not allocate a tid!
[P69:T309797:stress-ng] error: Could not allocate a tid!
[P68:T309796:stress-ng] error: Could not allocate a tid!
[P59:T76571:stress-ng] error: Could not allocate a tid!
[P53:T76565:stress-ng] error: Could not allocate a tid!
[P57:T76569:stress-ng] error: Could not allocate a tid!
[P45:T76557:stress-ng] error: Could not allocate a tid!
[P47:libos] error: Failed to send IPC msg to 1: Broken pipe (EPIPE)

Gramine commit hash

1cf1f46

@dimakuv
Contributor

dimakuv commented Feb 6, 2024

I'm not sure how this is possible, because the limit of possible PIDs is ~4 million:

#define PID_MAX_LIMIT 4194304 /* Linux limit 2^22, this value is *one greater* than max PID */
#define PID_MAX (PID_MAX_LIMIT - 1)

Apparently this sub-test of stress-ng allocates a huge number of processes, each with a huge number of threads...
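
For scale, here is a minimal sketch of the arithmetic behind these defines. It is not Gramine's allocator: `id_in_range` and `ids_remaining_after` are hypothetical helpers introduced only for illustration, and the only values taken from this issue are the two defines and the TIDs printed in the log.

```c
#include <assert.h>
#include <stdbool.h>

/* Values copied from the defines quoted in this comment. */
#define PID_MAX_LIMIT 4194304 /* 2^22, one greater than the max PID */
#define PID_MAX (PID_MAX_LIMIT - 1)

/* Compile-time check that the limit really is 2^22. */
static_assert(PID_MAX_LIMIT == (1 << 22), "PID_MAX_LIMIT should be 2^22");

/* Hypothetical helper (not a Gramine function):
 * is `id` inside the valid range 1..PID_MAX? */
static bool id_in_range(long id) {
    return id >= 1 && id <= PID_MAX;
}

/* Hypothetical helper: how many IDs lie strictly above `id`
 * and at or below PID_MAX. Returns 0 for out-of-range input. */
static long ids_remaining_after(long id) {
    return id_in_range(id) ? PID_MAX - id : 0;
}
```

The largest TID visible in the log, 782438, is well inside the range, so `ids_remaining_after(782438)` is 3411865. This only compares numbers against the limit; it says nothing about how IDs are actually handed out to processes.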

@jinengandhi-intel
Contributor

The main purpose of raising the bug was to understand why the test fails only on RHEL but passes on Ubuntu. Is there any distro-specific parameter that we need to check?

@dimakuv
Contributor

dimakuv commented Feb 6, 2024

Good question. This would need further debugging.
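
As a possible starting point for that debugging, here is a minimal sketch that prints two host kernel settings that can differ between distros, so the RHEL-8 and Ubuntu machines can be compared side by side. The choice of files is an assumption: nothing in this thread confirms that either setting is related to the failure, and Gramine manages its own PID space inside the LibOS. `read_long_from_file` and `print_host_limits` are hypothetical helper names.

```c
#include <stdio.h>

/* Read a single integer from `path`.
 * Returns 0 on success, -1 if the file cannot be opened or parsed. */
static int read_long_from_file(const char *path, long *out) {
    FILE *f = fopen(path, "r");
    if (!f)
        return -1;
    int ok = fscanf(f, "%ld", out) == 1;
    fclose(f);
    return ok ? 0 : -1;
}

/* Print candidate host settings (an assumed list, not a confirmed cause). */
static void print_host_limits(void) {
    const char *paths[] = {
        "/proc/sys/kernel/pid_max",
        "/proc/sys/kernel/threads-max",
    };
    for (size_t i = 0; i < sizeof(paths) / sizeof(paths[0]); i++) {
        long value;
        if (read_long_from_file(paths[i], &value) == 0)
            printf("%s = %ld\n", paths[i], value);
        else
            printf("%s: unavailable\n", paths[i]);
    }
}
```

Calling `print_host_limits()` from a small `main` on each host and diffing the output would show whether these values differ between the two distros.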
