
memguardian - backpressure #362

Merged
merged 12 commits into main from feat-memguardian on Mar 13, 2024

Conversation

Mzack9999 (Member) commented Mar 6, 2024

Adding the memguardian component. Its goal is to detect potential OOM-kill conditions and give the caller a chance to activate mitigation actions that slow down execution.
It can be used either as a standalone struct or activated and controlled via the following environment variables:

MEMGUARDIAN: Enable/disable. Set to 1 to enable.
MEMGUARDIAN_MAX_RAM_RATIO: Maximum RAM ratio, from 1 to 100.
MEMGUARDIAN_MAX_RAM: Maximum amount of RAM (with size unit, e.g. 10gb).
MEMGUARDIAN_INTERVAL: Detection interval (with time unit, e.g. 30s).

When the environment variables are used, the component can be accessed through memguardian.DefaultMemGuardian.
The warning state (activated when the used RAM exceeds the defined thresholds) offers two approaches to initiate the back-pressure mechanism:

  • Passive: by checking the bool MemGuardianInstance.Warning
  • Active: via a callback invoked at each interval while the warning state holds

The caller should insert reduction factors in hot paths or, for global settings, either use the callback mechanism or periodically check the state to enable/disable mitigation actions.
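
A minimal usage sketch of the passive approach (the import path is assumed to be github.com/projectdiscovery/utils/memguardian and Warning is read as a plain bool here; the final API may differ):

package main

import (
	"fmt"
	"time"

	// Assumed import path for the new component; adjust to the actual module path.
	"github.com/projectdiscovery/utils/memguardian"
)

// worker shows passive back-pressure: a reduction factor in the hot path while
// the warning state is set (whether Warning is a plain bool or an atomic flag
// depends on the final API).
func worker(items []string) {
	for _, item := range items {
		if memguardian.DefaultMemGuardian.Warning {
			time.Sleep(500 * time.Millisecond)
		}
		fmt.Println("processing", item)
	}
}

func main() {
	worker([]string{"target-1", "target-2"})
}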

Closes #361

@Mzack9999 Mzack9999 added the Type: Enhancement Most issues will probably ask for additions or changes. label Mar 6, 2024
@Mzack9999 Mzack9999 self-assigned this Mar 6, 2024
@Mzack9999 Mzack9999 marked this pull request as ready for review March 6, 2024 12:23

// Calculate the system absolute ratio of used RAM vs total available (as of now doesn't consider swap)
func UsedRamRatio() (float64, error) {
	vms, err := mem.VirtualMemory()

Reviewer:

Are you sure that virtual memory is the right metric? Virtual memory can be larger than the physically available memory. How about considering the real or physical memory instead?

Mzack9999 (Member, Author):

I thought about retrieving it directly via an OS system call (e.g. sysinfo), but gopsutil already seems to do a decent job of obtaining it with standard system tools/files (for example, on Linux by parsing /proc/meminfo, which according to the Linux kernel docs represents the effectively available RAM). Do you think this is correct, or is there a better set of metrics to monitor (for example, swap is currently not considered)?

Reviewer:

This will probably differ per OS.
On macOS, "virtual" means something else:
https://apple.stackexchange.com/a/107

Additionally, there is misleading info from gopsutil:
shirou/gopsutil#119

Windows is probably the same.

Mzack9999 (Member, Author):

Thanks for the links, I'll have a closer look and see if it makes sense to implement more OS-specific calls.
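
For reference, a rough sketch of the gopsutil-based ratio discussed in this thread (on Linux gopsutil derives Available from MemAvailable in /proc/meminfo; the PR's actual computation may differ, and swap is ignored):

package main

import (
	"fmt"

	"github.com/shirou/gopsutil/v3/mem"
)

// usedRAMRatio approximates the share of "effectively used" physical RAM as
// (Total - Available) / Total, using gopsutil's VirtualMemoryStat fields.
func usedRAMRatio() (float64, error) {
	vms, err := mem.VirtualMemory()
	if err != nil {
		return 0, err
	}
	return float64(vms.Total-vms.Available) / float64(vms.Total), nil
}

func main() {
	ratio, err := usedRAMRatio()
	if err != nil {
		panic(err)
	}
	fmt.Printf("used RAM ratio: %.2f\n", ratio)
}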

@stan-threatmate

I'd also leave this comment here: would it be better if, instead of a memguardian, there were a simple memory allocation rate limiter? For example, if a buffer can be 10MB, we could allow at most 50 allocations per second regardless of the concurrency settings. This way, templates that don't allocate memory can stay highly concurrent while memory-heavy templates will be less so. The rate limit could be implemented per template or made global.
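
A rough sketch of this suggestion (the limiter name, the 50/s budget, and the use of golang.org/x/time/rate are illustrative, not part of this PR):

package main

import (
	"context"

	"golang.org/x/time/rate"
)

// A single limiter shared by all templates: memory-light templates stay fully
// concurrent, while large-buffer allocations queue on a global budget.
var allocLimiter = rate.NewLimiter(rate.Limit(50), 1) // ~50 allocations/second

func newScratchBuffer(ctx context.Context, size int) ([]byte, error) {
	if err := allocLimiter.Wait(ctx); err != nil { // blocks until a slot frees up
		return nil, err
	}
	return make([]byte, size), nil
}

func main() {
	buf, _ := newScratchBuffer(context.Background(), 10<<20) // e.g. a 10MB buffer
	_ = buf
}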

@Mzack9999 (Member, Author)

During tests, imposing a rate limit on buffer allocations still led to OOM kills, just later in time, as the number of threads kept piling up anyway.
The backpressure mechanism turned out to be the most effective against this cumulative effect. This is part of a larger planned feature involving the implementation of an execution planner (see projectdiscovery/nuclei#4808), where memguardian would act on the planner in order to auto-tune the speed according to resource availability/constraints.
Anyway, that's a great suggestion. I think we can impose a further rate limit on allocations upon a memguardian trigger, so we ensure that memory-heavy templates are both thread- and allocation-limited.
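
A minimal sketch of that combination, where the allocation limit kicks in only while the warning state is active (the warning func stands in for memguardian's state; names and numbers are illustrative):

package main

import (
	"context"

	"golang.org/x/time/rate"
)

// Allocation budget applied only under memory pressure, so memory-heavy
// templates end up both thread- and allocation-limited once memguardian fires.
var heavyAllocs = rate.NewLimiter(rate.Limit(50), 1)

func allocate(ctx context.Context, size int, warning func() bool) ([]byte, error) {
	if warning() { // e.g. the memguardian warning state
		if err := heavyAllocs.Wait(ctx); err != nil {
			return nil, err
		}
	}
	return make([]byte, size), nil
}

func main() {
	underPressure := func() bool { return true } // pretend memguardian triggered
	_, _ = allocate(context.Background(), 10<<20, underPressure)
}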

@stan-threatmate

I can see how a per-template rate limiter would still lead to OOM kills if there are enough concurrent templates. A global rate limiter tuned to the amount of available memory at startup might be a good approach (a rough sketch follows after this comment). You're basically doing this already by controlling the concurrency with memguardian.

Another suggestion is to have a tag for bruteforce templates that we know can take a ton of RAM, or even a generic high-mem tag. This way we can exclude all templates that negatively impact memory. We could also keep memory-allocation stats per template execution so we can log them and quickly identify bad templates.
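
A rough sketch of the startup-tuned global limit mentioned above (the per-template memory estimate and the half-of-available safety margin are assumptions):

package main

import (
	"fmt"

	"github.com/shirou/gopsutil/v3/mem"
)

// maxHeavyTemplates derives a global concurrency cap from the RAM available at
// startup, assuming each memory-heavy template may hold roughly estBytes.
func maxHeavyTemplates(estBytes uint64) (int, error) {
	vms, err := mem.VirtualMemory()
	if err != nil {
		return 0, err
	}
	n := int(vms.Available / 2 / estBytes) // budget only half of what is free now
	if n < 1 {
		n = 1
	}
	return n, nil
}

func main() {
	n, err := maxHeavyTemplates(10 << 20) // e.g. ~10MB per bruteforce template
	if err != nil {
		panic(err)
	}
	fmt.Println("max concurrent memory-heavy templates:", n)
}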

@Mzack9999 Mzack9999 merged commit e3ec80f into main Mar 13, 2024
7 checks passed
@Mzack9999 Mzack9999 deleted the feat-memguardian branch March 13, 2024 18:46