Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Reduce memory usage for ClassLoaderHasClassesNamedMatcher #7866

Merged

Conversation

laurit
Copy link
Contributor

@laurit laurit commented Feb 21, 2023

See #7698
This is an attempt to reduce memory usage for ClassLoaderHasClassesNamedMatcher. Instead of having each matcher keep a Map<ClassLoader, Boolean> we can have one Map<ClassLoader, BitSet> where each matcher uses one bit in the BitSet. Alternatively Map<ClassLoader, Set<ClassLoaderHasClassesNamedMatcher>> where set contains matchers that match for given class loader would also work well because these matchers usually don't match so we can expect to have only a few elements in the set.

@laurit laurit force-pushed the ClassLoaderHasClassesNamedMatcher-memory branch from 6f4cf22 to 9d73aff Compare February 22, 2023 08:09
@laurit laurit marked this pull request as ready for review February 27, 2023 15:52
@laurit laurit requested a review from a team as a code owner February 27, 2023 15:52
Copy link
Member

@mateuszrzeszutek mateuszrzeszutek left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Nice 👍

@laurit laurit force-pushed the ClassLoaderHasClassesNamedMatcher-memory branch from 1c78f86 to 153b713 Compare March 17, 2023 10:33
@laurit laurit merged commit f5f83fd into open-telemetry:main Mar 17, 2023
@laurit laurit deleted the ClassLoaderHasClassesNamedMatcher-memory branch March 17, 2023 16:43

private final String[] resources;
private final int index = counter.getAndIncrement();
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@laurit can I suggest you add a bit more documentation in this class? I think the interaction between the index and the bitset are interesting, but not obvious.

Copy link
Contributor

@robododge robododge left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Hi, I thought it would be a good idea to review this change. This code is very important for the resource consumption and previously a production outage was caused by creation of classloaders in a runaway fashion. This change should lessen that circumstance.

@@ -46,4 +59,41 @@ private boolean hasResources(ClassLoader cl) {
}
return true;
}

Copy link
Contributor

@robododge robododge Mar 20, 2023

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think this is a step in the right direction.

My interpretation: (Please confirm if I'm interpreting this correctly)
So the new Manger is holding a list of all matchers, which correspond to N instrumentations. For each matcher a BitSet of size M will eventually expand out. That is N bitsets of length M being created. Previously, N instrumentations were each creating M cache entries, one entry for every ClassLoader instance encountered. Each entry in the WeakCache creates a weak key, so previously the total weak keys created amounted to N x M. Now, the a single weak key is created for each instrumenation. This is a very good improvement. During working on PR #7710, new instances of the Janino classloader were created in a runaway fashion inside my legacy application, counting for millions of classloader-to-boolean weakkeys. Here, we will end up with just a fixed list of very long BitSets if a classloader storm happens.

My past situation I ended up with 5Mil weak keys and I had about 112 instrumentations active.

Previous
Trace 5Mil classloaders took 5Mil Boolean instances ~ 40MB x 112instumentations ~ 4.5GB (but my heap dumps only consumed 150MB since JDK 8 reuses Boolean instances)

Now
Track 5Mil ClassLoaders with BitSet will be 78KB per instrumentation ~ 78KB x 112 ~ 8Mb MUCH BETTER

(Based my BitSet calculation on internals of a bit set modeled with array of Long[]. 5Mil/64 = 78k Longs)

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

In my opinion you analysis is flawed. It is based on the need to keep track of 5M class loaders, but that situation only arose when cache keys were not removed due to a bug. Given such bug the code in this pr will eventually also fail, it will just take a bit longer.

Here, we will end up with just a fixed list of very long BitSets if a classloader storm happens.

The size of the bit set is based on the count of the matchers not class loaders. This size is deterministic, I think it was a bit over 200. There is a bit set for each class loader.

Trace 5Mil classloaders took 5Mil Boolean instances ~ 40MB x 112instumentations ~ 4.5GB (but my heap dumps only consumed 150MB since JDK 8 reuses Boolean instances)

A new Boolean instance is created only when you call new Boolean(true). Boxing conversion calls Boolean valueOf(boolean b) (didn't verify this) which does not create new instances. I wouldn't expect these Boolean values to cause problems. My guess would be that a large consumer would be the array that backs the map. If you have 5M keys then the backing map would have 8M elements. And even large would be the map keys. Minimum size of a java object is like 16 bytes, with 5M keys that alone would be 80M. In my opinion it is better to build a small demo app that replicates the situation and let Eclipse memory analyzer compute the retained size.

}

private boolean hasResources(ClassLoader cl) {
private static boolean hasResources(ClassLoader cl, String... resources) {
boolean priorValue = InClassLoaderMatcher.getAndSet(true);
try {
for (String resource : resources) {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think there is a bug in this for loop, if there is a single resource class that is not managed by the particular ClassLoader, then this will return marking that ClassLoader as never containing any of the instrumentation classes. It may be worth it to think about what if an instrumentation exposes two classes and one of those is managed by the particular ClassLoader, wouldn't you want to mark that ClassLoader as matching?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The intent is to match only when all the resources are present. See javadoc in

/**
* Matches a class loader that contains all classes that are passed as the {@code classNames}
* parameter. Does not match the bootstrap classpath. Don't use this matcher with classes expected
* to be on the bootstrap.
*
* <p>In the event no class names are passed at all, the matcher will always return {@code true}.
*
* @param classNames list of class names to match.
*/
public static ElementMatcher.Junction<ClassLoader> hasClassesNamed(String... classNames) {
return new ClassLoaderHasClassesNamedMatcher(classNames);
}

If you need to matched based on either class A or B being present you can write something like hasClassesNamed(A).or(hasClassesNamed(B)).

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks @laurit, the all-or-nothing matching makes sense

@laurit laurit mentioned this pull request Mar 21, 2023
laurit added a commit that referenced this pull request Mar 21, 2023
This was brought up in the review of
#7866
but somehow didn't get removed there 🙈
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

None yet

4 participants