Optimize Resource Sharing Across Exporters with Arc Implementation #1526

lalitb · 2024-02-12T17:17:53Z

Changes

This PR introduces a refactor of the Resource structure to improve the efficiency of resource sharing across exporters for owned resource attributes. By encapsulating the Resource data within a ResourceInner struct and wrapping it with an Arc (Atomic Reference Counting), we eliminate the need for deep copies of resource data, thereby reducing memory overhead and improving performance.

Existing

#[derive(Clone, Debug, PartialEq)]
pub struct Resource {
    attrs: HashMap<Key, Value>,
    schema_url: Option<Cow<'static, str>>,
}

PR

/// Inner structure of `Resource` holding the actual data.
/// This structure is designed to be shared among `Resource` instances via `Arc`.
#[derive(Debug, Clone, PartialEq)]
struct ResourceInner {
    attrs: HashMap<Key, Value>,
    schema_url: Option<Cow<'static, str>>,
}

/// An immutable representation of the entity producing telemetry as attributes.
/// Utilizes `Arc` for efficient sharing and cloning.
#[derive(Clone, Debug, PartialEq)]
pub struct Resource {
    inner: Arc<ResourceInner>,
}

The stress test results (default has 4 resource attributes) for the main and PR branch. The results are more pronounced if the resource is added with LoggerProviderBuilder::with_resource and TracerProviderBuilder::with_resource methods as indicated here, as the resource created this way are owned, and can't be optimized by the compiler.

Spans:

main

Number of threads: 10
Throughput: 8,584,000 iterations/sec
Throughput: 8,560,000 iterations/sec
Throughput: 8,793,800 iterations/sec
Throughput: 8,465,200 iterations/sec

PR

Number of threads: 10
Throughput: 9,059,800 iterations/sec
Throughput: 8,719,200 iterations/sec
Throughput: 8,831,400 iterations/sec
Throughput: 8,767,800 iterations/sec

logs

main

Number of threads: 10
Throughput: 23,298,600 iterations/sec
Throughput: 21,358,600 iterations/sec
Throughput: 21,797,000 iterations/sec

PR

Number of threads: 10
Throughput: 25,054,600 iterations/sec
Throughput: 24,265,000 iterations/sec
Throughput: 24,576,600 iterations/sec
Throughput: 23,008,000 iterations/sec


## Merge requirement checklist

* [x] [CONTRIBUTING](https://github.com/open-telemetry/opentelemetry-rust/blob/main/CONTRIBUTING.md) guidelines followed
* [x] Unit tests added/updated (if applicable)
* [ ] Appropriate `CHANGELOG.md` files updated for non-trivial, user-facing changes
* [x] Changes in public API reviewed (if applicable)

codecov · 2024-02-12T17:27:38Z

Codecov Report

Attention: 1 lines in your changes are missing coverage. Please review.

Comparison is base (650904f) 63.2% compared to head (79d1896) 63.3%.

Files	Patch %	Lines
opentelemetry-sdk/src/resource/mod.rs	98.7%	1 Missing ⚠️

Additional details and impacted files

@@          Coverage Diff          @@
##            main   #1526   +/-   ##
=====================================
  Coverage   63.2%   63.3%           
=====================================
  Files        144     144           
  Lines      20299   20315   +16     
=====================================
+ Hits       12847   12863   +16     
  Misses      7452    7452

☔ View full report in Codecov by Sentry.
📢 Have feedback on the report? Share it here.

cijothomas · 2024-02-12T19:24:41Z

Can you also add benchmarks to logs, so we can how much this would improve the common use case (like 5-10 Resource attributes)?
(Would be good to share the stress test for the 5-10 attributes, to see how much this impacts normal usages.)

lalitb · 2024-02-12T22:38:06Z

Just to add, the stress test uses NoOpLogProcessor, so the performance is measured for creating the LogData structure for each event (which involves copying of resources in existing design), and not processing by the exporter.

(Would be good to share the stress test for the 5-10 attributes, to see how much this impacts normal usages.)

With PR, the throughput doesn't change as we increase the number of attributes, as we are just doing shared resource increment and no real data copy. However, with existing code, the throughput reduces drastically as we increase the resource attributes.

5 resource attributes:

main:
Number of threads: 8
Throughput: 12,364,800 iterations/sec
Throughput: 12,177,400 iterations/sec
Throughput: 12,345,800 iterations/sec

PR:
Number of threads: 8
Throughput: 22,066,200 iterations/sec
Throughput: 22,142,600 iterations/sec
Throughput: 21,524,400 iterations/sec

10 resource attributes:
main:
Number of threads: 8
Throughput: 8,696,200 iterations/sec
Throughput: 8,750,200 iterations/sec
Throughput: 8,767,400 iterations/sec
Throughput: 8,888,200 iterations/sec

PR:
Number of threads: 8
Throughput: 22,376,000 iterations/sec
Throughput: 21,866,200 iterations/sec
Throughput: 22,334,200 iterations/sec

100 resource attributes:
main:
Number of threads: 8
Throughput: 1,357,600 iterations/sec
Throughput: 1,324,000 iterations/sec
Throughput: 1,353,000 iterations/sec

PR:
Number of threads: 8
Throughput: 22,715,400 iterations/sec
Throughput: 21,450,800 iterations/sec
Throughput: 22,922,000 iterations/sec

Can you also add benchmarks to logs,

Modified the benchmark to include attributes in common scenario:

                KeyValue::new("service.name", "my-service"),
                KeyValue::new("service.version", "1.0.0"),
                KeyValue::new("service.environment", "production"),
                KeyValue::new("service.instance.id", "1234"),

The results:
main:

simple-log/no-context time: [254.13 ns 255.45 ns 257.16 ns]
simple-log/with-context time: [251.62 ns 252.16 ns 252.76 ns]
simple-log-with-int/no-context time: [299.27 ns 300.59 ns 302.25 ns]
simple-log-with-int/with-context time: [302.25 ns 302.83 ns 303.50 ns]

PR:

simple-log/no-context time: [167.73 ns 168.10 ns 168.54 ns]
simple-log/with-context time: [162.33 ns 162.68 ns 163.13 ns]
simple-log-with-int/no-context time: time: [224.12 ns 225.28 ns 226.71 ns]
simple-log-with-int/with-context: time: [218.04 ns 218.28 ns 218.55 ns]

opentelemetry-sdk/benches/log.rs

opentelemetry-sdk/src/resource/mod.rs

hdost

Excellent gains ☺️ thanks again.

hdost · 2024-02-13T13:14:06Z

opentelemetry-sdk/src/resource/mod.rs

-                resource.attrs.insert(key, value);
+                // This call ensures that if the Arc is not uniquely owned,
+                // the data is cloned before modification, preserving safety.
+                // If the Arc is uniquely owned, it simply returns a mutable reference to the data.


This is a constructor, at what point would the inner not be uniquely owned?
That being said is there a reason we shouldn't put the make_mut before the loop?

I believe it would be safe to move make_mut before loop at this point. The inner is ref-incremented only during LogData/SpanData creation.

hdost

Thank you for the work again!

TommyCpp

Great performance improvements!

opentelemetry-sdk/benches/log.rs

opentelemetry-sdk/src/resource/mod.rs

cijothomas

LGTM.
Before merge, I'd suggest to add changelog entry to let users know of the perf gains.
Also, please update PR description with before/after numbers.

opentelemetry-sdk/CHANGELOG.md

lalitb added 5 commits February 11, 2024 18:33

more changes

Verified

This commit was signed with the committer’s verified signature.

Josh-Cena Joshua Chen

GPG key ID: C37145B818BDB68F

Verified
Learn about vigilant mode

a5cfef8

remove owned IntoIter for Resource

Unverified

This commit is not signed, but one or more authors requires that any commit attributed to them is signed.

Learn about vigilant mode

14eaea0

fix unit test

Verified

This commit was created on GitHub.com and signed with GitHub’s verified signature. The key has expired.

GPG key ID: 4AEE18F83AFDEB23
Expired

Verified
Learn about vigilant mode

ce05eea

fix lint

74302ef

fix clippy

Loading
Loading status checks…

bf58b9d

lalitb requested a review from a team February 12, 2024 17:17

Merge branch 'main' into resource-share

Loading
Loading status checks…

0abc64f

add resource attributes to benchmark

Loading
Loading status checks…

8fd9f7b

cijothomas reviewed Feb 13, 2024

View reviewed changes

opentelemetry-sdk/benches/log.rs Outdated Show resolved Hide resolved

cijothomas reviewed Feb 13, 2024

View reviewed changes

opentelemetry-sdk/src/resource/mod.rs Show resolved Hide resolved

hdost reviewed Feb 13, 2024

View reviewed changes

move make_mut before loop

Loading
Loading status checks…

9168ec9

hdost approved these changes Feb 14, 2024

View reviewed changes

TommyCpp approved these changes Feb 14, 2024

View reviewed changes

cijothomas reviewed Feb 14, 2024

View reviewed changes

opentelemetry-sdk/benches/log.rs Outdated Show resolved Hide resolved

cijothomas reviewed Feb 14, 2024

View reviewed changes

opentelemetry-sdk/src/resource/mod.rs Show resolved Hide resolved

cijothomas approved these changes Feb 14, 2024

View reviewed changes

lalitb and others added 9 commits February 14, 2024 12:42

Merge branch 'main' into resource-share

Loading
Loading status checks…

dd1a311

Merge branch 'main' into resource-share

Loading
Loading status checks…

114153f

add changelog

Loading
Loading status checks…

6b57068

review comment

Loading
Loading status checks…

cc50ea4

leftover commit

Loading
Loading status checks…

09def74

more leftover commit

Loading
Loading status checks…

d892cdc

more leftover commit..last

Loading
Loading status checks…

c1471b3

Merge branch 'main' into resource-share

Loading
Loading status checks…

a10cf5d

Merge branch 'main' into resource-share

Loading
Loading status checks…

b0cdd57

cijothomas reviewed Feb 15, 2024

View reviewed changes

opentelemetry-sdk/CHANGELOG.md Outdated Show resolved Hide resolved

Update opentelemetry-sdk/CHANGELOG.md

Loading
Loading status checks…

79d1896

cijothomas merged commit 476f2c1 into open-telemetry:main Feb 15, 2024
14 checks passed

lalitb mentioned this pull request Mar 28, 2024

[Logs SDK] Send resource once to processor and exporter, and not for every event. #1636

Merged

4 tasks

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Optimize Resource Sharing Across Exporters with Arc Implementation #1526

Optimize Resource Sharing Across Exporters with Arc Implementation #1526

lalitb commented Feb 12, 2024 •

edited

Loading

codecov bot commented Feb 12, 2024 •

edited

Loading

cijothomas commented Feb 12, 2024

lalitb commented Feb 12, 2024 •

edited

Loading

hdost left a comment

hdost Feb 13, 2024

lalitb Feb 13, 2024

hdost left a comment

TommyCpp left a comment

cijothomas left a comment

Optimize Resource Sharing Across Exporters with Arc Implementation #1526

Optimize Resource Sharing Across Exporters with Arc Implementation #1526

Conversation

lalitb commented Feb 12, 2024 • edited Loading

Changes

Existing

PR

Spans:

main

PR

logs

main

PR

codecov bot commented Feb 12, 2024 • edited Loading

Codecov Report

cijothomas commented Feb 12, 2024

lalitb commented Feb 12, 2024 • edited Loading

hdost left a comment

Choose a reason for hiding this comment

hdost Feb 13, 2024

Choose a reason for hiding this comment

lalitb Feb 13, 2024

Choose a reason for hiding this comment

hdost left a comment

Choose a reason for hiding this comment

TommyCpp left a comment

Choose a reason for hiding this comment

cijothomas left a comment

Choose a reason for hiding this comment

lalitb commented Feb 12, 2024 •

edited

Loading

codecov bot commented Feb 12, 2024 •

edited

Loading

lalitb commented Feb 12, 2024 •

edited

Loading