Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

awss3export to support gzip #27872

Closed
rahnarsson opened this issue Oct 20, 2023 · 6 comments · Fixed by #31622
Closed

awss3export to support gzip #27872

rahnarsson opened this issue Oct 20, 2023 · 6 comments · Fixed by #31622
Labels
enhancement New feature or request exporter/awss3

Comments

@rahnarsson
Copy link

Component(s)

exporter/awss3

Is your feature request related to a problem? Please describe.

No

Describe the solution you'd like

When using AWS S3 exporter it would be nice to have ability to select compression for stored objects i.e. gzip

Describe alternatives you've considered

Current workflow for storing gzipped data is to use AWS Kinesis Exporter and then through Firehose store data to s3 gzipped. But having ability to straightly upload gzipped files to S3 would drop the need to use separate Kinesis stream for just gzipping the data.

Additional context

No response

@rahnarsson rahnarsson added enhancement New feature or request needs triage New item requiring triage labels Oct 20, 2023
@github-actions
Copy link
Contributor

Pinging code owners:

See Adding Labels via Comments if you do not have permissions to add labels yourself.

@crobert-1
Copy link
Member

Sounds like a good idea to me!

For solution details, most of the changes could be done with an added config option and then changes to this method:

func (s3writer *s3Writer) writeBuffer(_ context.Context, buf []byte, config *Config, metadata string, format string) error {

The AWS s3 uploader's Upload method that we're already using takes another optional argument ContentEncoding. We can specify gzip here when the config option is enabled to properly upload data.

I believe using golang's compress/gzip module, we could create a NewWriter on a new buffer, then call Write on the passed in variable buf, and lastly call Close to ensure data is written to the new buffer before uploading the newly compressed buffer contents.

We could just have an if statement for the new config option and keep existing functionality or use the gzip logic. I believe we'd want gzip disabled by default so this isn't a breaking change.

@crobert-1 crobert-1 removed the needs triage New item requiring triage label Nov 20, 2023
@atoulme
Copy link
Contributor

atoulme commented Jan 3, 2024

What's the rotation strategy? Do we accumulate data into gzip and update in place files? Do we upload in bigger chunks? Please help understand how this would work.

@rahnarsson
Copy link
Author

My original idea was just that if the exporter now uploads raw json files and now with the new config option there would be possibility to upload that same json but just with compressed format.

All the other logic like rotation etc. would work just like now with json writer.

@bortojac
Copy link

bortojac commented Mar 1, 2024

@atoulme @pdelewski Is this work still being considered? Would love to use this if so.

@atoulme
Copy link
Contributor

atoulme commented Mar 1, 2024

It's still being considered. Contributions welcome!

dmitryax pushed a commit that referenced this issue Mar 9, 2024
**Description:**
Add `compression` option to compress files using `compress/gzip` library
before uploading to S3.

**Link to tracking Issue:**
Fixes
#27872

**Testing:** 

Sent n number of traces through the S3 exporter using k6 to compare
sizes. Used Minio as the S3 backend.
| Marshaler | Compression | k6 Requests | k6 Data Sent | S3 Objects | S3
Total Size |
| --- | --- | --- | --- | --- | --- |
| otlp_json | No | 101 | 118 KB | 101 | 36 KB | 
| otlp_proto | No | 101 | 118 KB | 101 | 11 KB | 
| otlp_json | Yes | 101 | 118 KB | 101 | 21 KB | 
| otlp_proto | Yes | 101 | 118 KB | 101 | 9.9 KB | 

Additionally, new unit test to check file name.

**Documentation:**
- Updated README.md file
DougManton pushed a commit to DougManton/opentelemetry-collector-contrib that referenced this issue Mar 13, 2024
**Description:**
Add `compression` option to compress files using `compress/gzip` library
before uploading to S3.

**Link to tracking Issue:**
Fixes
open-telemetry#27872

**Testing:** 

Sent n number of traces through the S3 exporter using k6 to compare
sizes. Used Minio as the S3 backend.
| Marshaler | Compression | k6 Requests | k6 Data Sent | S3 Objects | S3
Total Size |
| --- | --- | --- | --- | --- | --- |
| otlp_json | No | 101 | 118 KB | 101 | 36 KB | 
| otlp_proto | No | 101 | 118 KB | 101 | 11 KB | 
| otlp_json | Yes | 101 | 118 KB | 101 | 21 KB | 
| otlp_proto | Yes | 101 | 118 KB | 101 | 9.9 KB | 

Additionally, new unit test to check file name.

**Documentation:**
- Updated README.md file
XinRanZhAWS pushed a commit to XinRanZhAWS/opentelemetry-collector-contrib that referenced this issue Mar 13, 2024
**Description:**
Add `compression` option to compress files using `compress/gzip` library
before uploading to S3.

**Link to tracking Issue:**
Fixes
open-telemetry#27872

**Testing:** 

Sent n number of traces through the S3 exporter using k6 to compare
sizes. Used Minio as the S3 backend.
| Marshaler | Compression | k6 Requests | k6 Data Sent | S3 Objects | S3
Total Size |
| --- | --- | --- | --- | --- | --- |
| otlp_json | No | 101 | 118 KB | 101 | 36 KB | 
| otlp_proto | No | 101 | 118 KB | 101 | 11 KB | 
| otlp_json | Yes | 101 | 118 KB | 101 | 21 KB | 
| otlp_proto | Yes | 101 | 118 KB | 101 | 9.9 KB | 

Additionally, new unit test to check file name.

**Documentation:**
- Updated README.md file
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
enhancement New feature or request exporter/awss3
Projects
None yet
Development

Successfully merging a pull request may close this issue.

4 participants