Use CPU MaxSpeed for cpu_total_compute #20608

Closed
revverse opened this issue May 16, 2024 · 2 comments

revverse commented May 16, 2024

In older versions of Nomad (before #18146), cpu_total_compute was calculated from the CPU's maximum frequency, read on Linux from the 'cpuinfo_max_freq' file:
cpuMaxFile = sysRoot + "/cpu/cpu%d/cpufreq/cpuinfo_max_freq"
BaseSpeed was used only if cpuMaxFile did not exist.
In the current version, Nomad always uses BaseSpeed for 'cpu.totalcompute'.

# cat /sys/devices/system/cpu/cpu1/cpufreq/base_frequency
2000000
# cat /sys/devices/system/cpu/cpu1/cpufreq/cpuinfo_max_freq
3800000

and the Client page in Nomad shows:

cpu.arch amd64
cpu.frequency 2000
cpu.modelname Intel(R) Xeon(R) Gold 5418Y
cpu.numcores 96
cpu.reservablecores 96
cpu.totalcompute 192000
cpu.usablecompute 192000
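
These numbers are consistent with total compute being the per-core frequency multiplied by the core count. A quick check, where `totalCompute` is a hypothetical helper for illustration and not a Nomad function:

```go
package main

import "fmt"

// totalCompute multiplies the core count by the per-core frequency in MHz.
// Hypothetical helper; it only reproduces the arithmetic seen in the UI.
func totalCompute(cores, mhz int) int {
	return cores * mhz
}

func main() {
	fmt.Println(totalCompute(96, 2000)) // 192000, matches cpu.totalcompute above
	fmt.Println(totalCompute(96, 3800)) // 364800, the value if cpuinfo_max_freq were used
}
```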

I'm not an expert in Golang, but maybe the main issue is in this part of the code:

func (c Core) MHz() hw.MHz {
	switch {
	case c.BaseSpeed > 0:
		return c.BaseSpeed
	case c.MaxSpeed > 0:
		return c.MaxSpeed
	}
	return c.GuessSpeed
}

When BaseSpeed is set, MaxSpeed is never used.
So I changed it to:

func (c Core) MHz() hw.MHz {
	switch {
	case c.MaxSpeed > 0:
		return c.MaxSpeed
	case c.BaseSpeed > 0:
		return c.BaseSpeed
	}
	return c.GuessSpeed
}

And Nomad started using MaxSpeed:

cpu.frequency 3800
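
The effect of swapping the two cases can be reproduced in isolation. The sketch below uses a stand-in `Core` struct and a local `MHz` type, both assumptions for illustration rather than the real Nomad types; only the order of the `switch` cases is taken from the snippets above.

```go
package main

import "fmt"

// MHz and Core are simplified stand-ins, not the actual Nomad definitions.
type MHz uint64

type Core struct {
	BaseSpeed  MHz
	MaxSpeed   MHz
	GuessSpeed MHz
}

// basePreferred checks BaseSpeed first, as in the current code.
func (c Core) basePreferred() MHz {
	switch {
	case c.BaseSpeed > 0:
		return c.BaseSpeed
	case c.MaxSpeed > 0:
		return c.MaxSpeed
	}
	return c.GuessSpeed
}

// maxPreferred checks MaxSpeed first, as in the proposed change.
func (c Core) maxPreferred() MHz {
	switch {
	case c.MaxSpeed > 0:
		return c.MaxSpeed
	case c.BaseSpeed > 0:
		return c.BaseSpeed
	}
	return c.GuessSpeed
}

func main() {
	c := Core{BaseSpeed: 2000, MaxSpeed: 3800}
	fmt.Println(c.basePreferred()) // 2000
	fmt.Println(c.maxPreferred())  // 3800
}
```

With both values set, the first matching case wins, so the ordering alone decides which frequency is reported.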

Maybe we should use this behavior as default?

Commit: revverse@dedc5ec
UPD: Affected versions - 1.7.x

@shoenig
Member

shoenig commented May 20, 2024

Using the "max speed" is nonsensical - Intel and AMD both set this value to a speed that is achievable by a single core, maybe for a few milliseconds. In contrast, the base speed is the highest speed sustainable by all cores indefinitely.

If you want to override the amount of compute available you can do so using
https://developer.hashicorp.com/nomad/docs/configuration/client#cpu_total_compute

@shoenig shoenig closed this as completed May 20, 2024
Nomad - Community Issues Triage automation moved this from Needs Triage to Done May 20, 2024
@revverse
Author

OK. So what is the reason for having MaxSpeed in the Nomad codebase?
Before version 1.7, Nomad used MaxSpeed by default. Now you say that is BAD, but there are a lot of places where MaxSpeed is calculated and used. Maybe this code should just be removed?
