Nomad 1.7.7 raw exec task using consul kv put no longer working #20566
Comments
Hi @sirbudd, thanks for reporting an issue! Could you give us more detail about the job you're submitting? Is your server configured with ACLs?
Hello! The job file contains a simple raw_exec task that runs the above-mentioned script, nothing special.
I did some more testing and concluded that all consul kv put/get commands no longer work when executed by the raw_exec driver.
Hi @sirbudd, I'm having a hard time reproducing your issue. The task definition you've pasted won't even get parsed by Nomad because there's invalid syntax inside your template block. I want to help and investigate the issue, but it's hard without more details on how to reproduce it. If you can't provide a full jobspec for some reason, could you provide more details about the error message, perhaps by running Nomad with debug-level logging?
Hello @pkazmierczak, this is a Nomad job file that I used for debugging the issue:
With the above job file everything worked without any issues. It turns out that the raw_exec task was getting killed because it did not have enough resources.
After the Nomad upgrade from 1.6.x to 1.7.7, the resources that had previously been sufficient were no longer enough. This is why the consul kv put command was getting killed.
Hey @sirbudd, thanks for providing all the detail. It's hard to say whether the resource exhaustion is due to the newer version of Nomad, to Consul or Docker taking more resources to execute, or to yet another factor. I'll close the issue for now, but please feel free to re-open it if you encounter more problems.
Thank you for your assistance!
Nomad version
Nomad 1.7.7
Operating system
Ubuntu 22.04 jammy
Issue
After upgrading from Nomad 1.6.9 to Nomad 1.7.7, a raw_exec task that runs a bash script calling consul kv put no longer works.
Reproduction steps
The following bash script is executed by the raw_exec task:
#!/usr/bin/env bash
set -x
while ! docker ps | egrep -i "${DOCKER_CONTAINER}" | egrep -vi "${NON_INCLUDE}"; do sleep 3; done
sleep 3;
DATE=$(date +%Y-%m-%d)
consul kv put "job_state/ANY_PATH/${NOMAD_ALLOC_ID}" "${DATE}";
while true; do sleep 6000; done
Expected Result
A successful kv put: Success! Data written to: job_state/......
Actual Result
/var/lib/nomad/alloc/9b74628a-f6b9-07b2-898d-e986c227b73d/Consul-Set-Dependency/set_dependency.sh: line 6: 1080306 Killed consul kv put "job_state/ANY_PATH/${NOMAD_ALLOC_ID}" "${DATE}"
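The "Killed" text in that message is how bash reports a child that was terminated by SIGKILL; by shell convention such a command's exit status is 128 plus the signal number (137 for SIGKILL). As a debugging aid, here is a minimal sketch of a wrapper that makes this explicit. The `run_and_report` function is a hypothetical helper, not part of Nomad or Consul, and the self-killing child shell only stands in for the real consul command:

```shell
#!/usr/bin/env bash
# Hypothetical helper: run a command and report whether it exited normally
# or was terminated by a signal (bash reports 128 + signal number).
run_and_report() {
  local status=0
  "$@" || status=$?
  if [ "$status" -gt 128 ]; then
    echo "killed by signal $((status - 128))"
  else
    echo "exited with status $status"
  fi
  return "$status"
}

# Demo: a child shell sends SIGKILL to itself, standing in for a process
# that gets killed from outside (for example, for using too many resources).
run_and_report bash -c 'kill -KILL $$' || true
# prints: killed by signal 9
```

In the script above, the call would look like run_and_report consul kv put "job_state/ANY_PATH/${NOMAD_ALLOC_ID}" "${DATE}". A report of signal 9 shows that the process was killed from outside rather than failing on its own, but it does not say what killed it. A status above 128 can also come from a program that simply chose that exit code, so treat the message as a hint.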
Job file (if appropriate)
The raw_exec task worked until upgrading to the latest Nomad version. The only way to have it work again was to downgrade back to 1.6.9.
If I run the bash script by hand there are no issues.
This is the Nomad agent client options config:
"driver.raw_exec.no_cgroups" = "true"
"docker.cleanup.image" = "true"
"docker.volumes.enabled" = "true"
"docker.cleanup.image.delay" = "16h"
"driver.raw_exec.enable" = "1"
"docker.caps.whitelist" = "CHOWN,DAC_OVERRIDE,FSETID,FOWNER,MKNOD,NET_RAW,SETGID,SETUID,SETFCAP,SETPCAP,NET_BIND_SERVICE,SYS_CHROOT,KILL,AUDIT_WRITE,AUDIT_CONTROL,AUDIT_READ,SYS_PTRACE"
Please let me know if I need to provide any further info.