Reduce CpuStat.getSystemCpuLoadTicks memory allocation pressure #2605
Conversation
Add a new method, `FileUtil.readLines`, which accepts a number of lines to read and allocates an `ArrayList` sized to that number of lines. If the file is not readable, it returns `Collections.emptyList()` to avoid allocating a list at all.

Use this new method in `CpuStat.getSystemCpuLoadTicks` to avoid allocating a `String` for each line in `/proc/stat`, instead allocating a single `String` and an `ArrayList` with a capacity of one.

This reduces allocation of the `byte[]`s backing the per-line `String`s by two orders of magnitude (from a few kB to a few tens of bytes) on each call.
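A minimal sketch of the approach described above (hypothetical, simplified code; the method name matches the PR description, but the body is an assumption, and the real OSHI code also handles logging):

```java
import java.io.BufferedReader;
import java.io.IOException;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.Paths;
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;

public class ReadLinesSketch {

    // Read at most `count` lines from `filename`, pre-sizing the result
    // list so the ArrayList never has to grow.
    public static List<String> readLines(String filename, int count) {
        Path file = Paths.get(filename);
        if (!Files.isReadable(file)) {
            // Avoid allocating a list for missing/unreadable files
            return Collections.emptyList();
        }
        List<String> lines = new ArrayList<>(count);
        try (BufferedReader reader = Files.newBufferedReader(file)) {
            String line;
            while (lines.size() < count && (line = reader.readLine()) != null) {
                lines.add(line);
            }
        } catch (IOException e) {
            return Collections.emptyList();
        }
        return lines;
    }
}
```

With `count = 1`, a caller like `CpuStat.getSystemCpuLoadTicks` allocates only the single `String` for the first line of `/proc/stat` plus a one-element `ArrayList`, rather than one `String` per line of the file.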
On my system, …

However, the first line …

I noticed while profiling one of our applications with Java Flight Recorder and Java Mission Control that …

@dbwiddis I tried to mirror the existing code style while keeping the size of this change as minimal as possible, but let me know if you'd like to see a different approach.
Codecov Report

Attention: Patch coverage is …

Additional details and impacted files:

```
@@            Coverage Diff             @@
##           master    #2605      +/-   ##
==========================================
- Coverage   87.05%   86.88%    -0.18%
==========================================
  Files          30       30
  Lines        1313     1334      +21
  Branches      183      189       +6
==========================================
+ Hits         1143     1159      +16
- Misses         98      102       +4
- Partials       72       73       +1
```

☔ View full report in Codecov by Sentry.
Hey @chrisribble, thanks a ton for this PR! Given that CPU ticks are a "hot path" type of operation, I'm more than happy to try to optimize here.

A lot of this file was written in ~2015 with JDK 6 compatibility in mind. We've been JDK 8 minimum for a while, and I wonder if you can't do even better here using …

It may be no better than what you've done, but as long as we're making an improvement, it would be nice to compare these options.
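For comparison, a JDK 8 stream-based variant might look like this (a sketch only; the API being suggested is truncated above, so the assumption that `Files.lines` plus `limit` was meant is mine):

```java
import java.io.IOException;
import java.io.UncheckedIOException;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.Paths;
import java.util.Collections;
import java.util.List;
import java.util.stream.Collectors;
import java.util.stream.Stream;

public class StreamReadLinesSketch {

    // Read at most `count` lines using the JDK 8 stream API;
    // try-with-resources closes the underlying file handle.
    public static List<String> readLines(String filename, int count) {
        Path file = Paths.get(filename);
        if (!Files.isReadable(file)) {
            return Collections.emptyList();
        }
        try (Stream<String> lines = Files.lines(file)) {
            return lines.limit(count).collect(Collectors.toList());
        } catch (IOException | UncheckedIOException e) {
            // Files.lines reads lazily, so I/O errors can surface
            // as UncheckedIOException during the terminal operation
            return Collections.emptyList();
        }
    }
}
```

Note that `Files.lines` still uses a `BufferedReader` with the default 8 kB buffer internally, so it may not reduce allocation pressure as much as a hand-rolled loop with a pre-sized list.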
@dbwiddis do you just want to explore updating the implementation here to use …? I wrote a simple JMH benchmark (and enabled the …).

Benchmark results: …
So the implementation in the PR executes in the least time and allocates the least memory (although the fact that ~25 kB is allocated is definitely something I want to attack in another PR; it's probably due to the default 8 kB buffer size used by the JRE).

I'm happy to change it to use … But perhaps you actually were wanting to explore returning a `Stream<String>`?

One thing to consider if we return a `Stream<String>` is that the caller becomes responsible for closing it. For example:

```java
// Caller has to close this `Stream` (i.e. using TWR pattern)
public static Stream<String> readLines(final String filename, final boolean reportError) {
    Path file = Paths.get(filename);
    if (Files.isReadable(file)) {
        if (LOG.isDebugEnabled()) {
            LOG.debug(READING_LOG, filename);
        }
        try {
            return Files.lines(file);
        } catch (IOException | UncheckedIOException e) {
            // Does this exception handling make sense if the caller has to do it too on close/terminal op?
            if (reportError) {
                LOG.error("Error reading file {}. {}", filename, e.getMessage());
            } else {
                LOG.debug("Error reading file {}. {}", filename, e.getMessage());
            }
        }
    } else if (reportError) {
        LOG.warn("File not found or not readable: {}", filename);
    }
    return Stream.empty();
}
```
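To illustrate the try-with-resources obligation, here is a hypothetical caller of such a `Stream`-returning method (the `readLines` below is a simplified, logging-free stand-in so the example is self-contained):

```java
import java.io.IOException;
import java.io.UncheckedIOException;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.Paths;
import java.util.Optional;
import java.util.stream.Stream;

public class StreamCallerSketch {

    // Simplified stand-in for a Stream-returning FileUtil.readLines (no logging).
    public static Stream<String> readLines(String filename) {
        Path file = Paths.get(filename);
        if (Files.isReadable(file)) {
            try {
                return Files.lines(file);
            } catch (IOException | UncheckedIOException e) {
                // fall through to an empty stream
            }
        }
        return Stream.empty();
    }

    // The caller must close the stream; try-with-resources releases the
    // file handle even if the terminal operation throws.
    public static Optional<String> firstLineStartingWith(String filename, String prefix) {
        try (Stream<String> lines = readLines(filename)) {
            return lines.filter(l -> l.startsWith(prefix)).findFirst();
        }
    }
}
```

Forgetting the try-with-resources here leaks a file descriptor per call, which is why pushing the close obligation onto every call site is a real API cost.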
Thanks for doing the benchmarks! I'm totally fine with merging this PR as-is for this benefit. It goes to prove the "write dumb code" theory: the JIT compiler knows just what to do with this kind of "low level" code.

My thought on the stream APIs is to just get rid of this util class altogether; it existed in a time before those APIs existed (pretty sure …).

Again, not really worth it unless it gains us a benefit. Can you add a changelog entry and do a quick …?
Sounds good. I agree that eliminating the utility class is probably better in the long run. I added the changelog entry (hopefully I did it right) and checked in the results of ….

Btw, I also experimented with specifying a buffer size of 100 bytes, and that yields another nice speedup and reduces memory allocation rates by quite a lot (down to just over 9 kB per execution):

I think that just makes the API more obtuse, though, since the caller (the place where you know roughly how much data you're going to read) would need to specify the buffer size. I'm happy to push those changes in a separate PR where we can discuss them, or maybe think some more about a better way to expose "read M bytes from N lines in file foo". Or maybe it makes sense to just try to pick a default buffer size that aligns better with the average size of the files that Oshi is reading? The default JRE size of 8 kB is quite large for most things in …
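The buffer-size experiment might be sketched like this (hypothetical code; 100 bytes is the value mentioned above, and the two-argument `BufferedReader` constructor is what makes the size tunable):

```java
import java.io.BufferedReader;
import java.io.IOException;
import java.io.InputStreamReader;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Paths;

public class SmallBufferSketch {

    // Read the first line using a caller-chosen buffer size instead of
    // BufferedReader's 8 KiB default.
    public static String readFirstLine(String filename, int bufferSize) throws IOException {
        try (BufferedReader reader = new BufferedReader(
                new InputStreamReader(Files.newInputStream(Paths.get(filename)),
                        StandardCharsets.UTF_8),
                bufferSize)) {
            return reader.readLine();
        }
    }
}
```

One caveat: `InputStreamReader` keeps its own internal decoder buffer, so the measured allocation per call depends on more than just the `BufferedReader` size.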
How about opening an issue to discuss, so it's easier to search for later? Also take a look at the Auxv class (which reads …).
One other idea: it's old school, but why can't we just use …?
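The "old school" suggestion is truncated above, but one plausible reading (purely an assumption on my part) is a plain `FileInputStream` into a fixed `byte[]`, which skips the Reader machinery entirely:

```java
import java.io.FileInputStream;
import java.io.IOException;
import java.nio.charset.StandardCharsets;

public class RawReadSketch {

    // Read up to `max` bytes in a single read and return everything before
    // the first newline; only two allocations (the byte[] and the String).
    public static String readFirstLine(String filename, int max) throws IOException {
        byte[] buf = new byte[max];
        int n;
        try (FileInputStream in = new FileInputStream(filename)) {
            n = in.read(buf);
        }
        if (n <= 0) {
            return "";
        }
        int end = 0;
        while (end < n && buf[end] != '\n') {
            end++;
        }
        // /proc files are ASCII, so a fixed charset is safe here
        return new String(buf, 0, end, StandardCharsets.US_ASCII);
    }
}
```

A caveat for general files: a single `read` may return fewer bytes than requested, so production code would loop until the buffer fills or EOF; for small `/proc` pseudo-files one read is typically enough.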