New component: RabbitMQ Exporter #28891

Open
2 tasks
swar8080 opened this issue Nov 5, 2023 · 11 comments
Labels
Accepted Component (New component has been sponsored)

Comments

@swar8080
Contributor

swar8080 commented Nov 5, 2023

Overview

Use-cases

This would serve use cases similar to those of other durable messaging system exporters, like Kafka and Pulsar. It could help meet teams where they are: those that want to do custom telemetry processing but only have access to RabbitMQ.

Why prioritize RabbitMQ and not some other queue?

  • RabbitMQ appears to be the second-most popular open-source message queue behind Kafka, based on Google searches, market share, Docker Hub downloads, etc.
  • Datadog's Vector telemetry pipeline supports RabbitMQ, as well as Kafka/SQS, but no other queues
  • RabbitMQ actively maintains a Go client library using the AMQP 0.9.1 Protocol

Other options considered

SQS
SQS is also popular, but I'm assuming someone from AWS would need to own/maintain the component.

AMQP 1.0 Protocol
This would allow supporting other queues like ActiveMQ, with a Go client library maintained by Microsoft. However:

  • RabbitMQ users have to install a plugin to use AMQP 1.0, whereas AMQP 0.9.1 is the primary protocol supported
  • Maintenance, testing, and configuration could be more complicated having to support multiple queues with the same component

STOMP Protocol
STOMP would also allow supporting other queues like ActiveMQ, however:

  • The Go client library seems to be maintained by an individual
  • Maintenance, testing, and configuration could be more complicated having to support multiple queues with the same component

JMS
A JMS exporter was requested in #27258 because many queues support it

However, JMS is an API and not a wire-protocol. The implementation of the protocol seems specific to each queue

Example configuration for the component

Unique Configuration

  • publisher
    • routing_key (default = otlp_spans for traces, otlp_metrics for metrics, otlp_logs for logs): The AMQP routing key for the message, which will be delivered to a queue with that name (using the amq.direct exchange)
    • confirm_mode (default = true): whether to wait for confirmation that RabbitMQ successfully received or is unable to process a message. This improves the accuracy of collector metrics on unprocessed data. The tradeoff is lower throughput from having to wait for asynchronous confirmation.
    • durable (default = true): whether to instruct RabbitMQ to durably persist messages on disk. When publisher.confirm_mode is true, this may delay confirmation by a few hundred milliseconds, decreasing the pipeline's throughput.
  • endpoint (default = rabbit://localhost:5672): The URL of the RabbitMQ broker.
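
To make the proposed defaults concrete, here is a minimal Python sketch of how they might be modeled. The names PublisherConfig, ExporterConfig, and resolve_routing_key are hypothetical and not part of the collector; the values are the defaults listed above, and the real component would be written in Go.

```python
from dataclasses import dataclass, field

# Default routing keys per signal, as listed in the proposal above.
DEFAULT_ROUTING_KEYS = {
    "traces": "otlp_spans",
    "metrics": "otlp_metrics",
    "logs": "otlp_logs",
}


@dataclass
class PublisherConfig:
    """Hypothetical model of the proposed `publisher` settings."""
    routing_key: str = ""      # empty means "use the per-signal default"
    confirm_mode: bool = True  # wait for broker confirmation
    durable: bool = True       # ask the broker to persist messages on disk


@dataclass
class ExporterConfig:
    """Hypothetical model of the proposed exporter settings."""
    endpoint: str = "rabbit://localhost:5672"  # default as written in the proposal
    publisher: PublisherConfig = field(default_factory=PublisherConfig)


def resolve_routing_key(cfg: ExporterConfig, signal: str) -> str:
    """Return the configured routing key, or the default for the signal."""
    if cfg.publisher.routing_key:
        return cfg.publisher.routing_key
    if signal not in DEFAULT_ROUTING_KEYS:
        raise ValueError(f"unknown signal type: {signal!r}")
    return DEFAULT_ROUTING_KEYS[signal]
```

With no overrides, resolve_routing_key(ExporterConfig(), "traces") falls back to the per-signal default, while an explicit routing_key applies to every signal.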

Common Configuration

The below is copied from the Pulsar exporter since it seems relevant to this exporter as well

  • auth / tls settings (TODO)
  • encoding of messages sent to RabbitMQ (TODO, need to research current OTEL best-practices)
  • timeout: timeout for sending an individual message
  • connection_timeout: timeout for establishing a connection to the broker and creating an AMQP channel
  • retry_on_failure
    • enabled:
    • initial_interval: Time to wait after the first failure before retrying; ignored if enabled is false
    • max_interval (default = ?): Is the upper bound on backoff; ignored if enabled is false
    • max_elapsed_time (default = ?): Is the maximum amount of time spent trying to send a batch; ignored if enabled is false
  • sending_queue
    • enabled (default = true)
    • num_consumers: Number of consumers that dequeue batches; ignored if enabled is false
    • queue_size: Maximum number of batches kept in memory before dropping data; ignored if enabled is false. Users should calculate this as num_seconds * requests_per_second where:
      • num_seconds is the number of seconds to buffer in case of a backend outage
      • requests_per_second is the average number of requests per second.
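
As a worked example of the queue_size formula above, here is a small Python sketch. The function name is made up for illustration, and rounding up to a whole number of batches is an assumption rather than something the exporter documentation specifies.

```python
import math


def queue_size(num_seconds: float, requests_per_second: float) -> int:
    """Batches to buffer: seconds of outage to survive times average request rate."""
    if num_seconds < 0 or requests_per_second < 0:
        raise ValueError("inputs must be non-negative")
    # Round up so a fractional batch still gets a slot in the queue (assumption).
    return math.ceil(num_seconds * requests_per_second)
```

For instance, buffering through a 60-second outage at an average of 10 requests per second gives a queue_size of 600.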

Telemetry data types supported

Logs, metrics, and traces

Is this a vendor-specific component?

  • This is a vendor-specific component
  • If this is a vendor-specific component, I am proposing to contribute and support it as a representative of the vendor.

Code Owner(s)

@swar8080

Sponsor (optional)

@atoulme

Additional context

Is this considered a vendor-specific component that needs to be implemented/maintained by the RabbitMQ team? If it's not, then I'm happy to implement this!

There's some more research needed for the design, but I'll wait to see if this is accepted/sponsored before going down that rabbit hole. Let me know if any extra info would help though.

@swar8080 added the "needs triage" and "Sponsor Needed" labels Nov 5, 2023
@crobert-1
Member

Sounds like a valid use case and component to me! Please review all the requirements of adding a new component.

From here you'll need a sponsor to be able to move forward. You can join the Collector sig meetings and add this issue to the agenda of the Google doc to get more attention here.

As shared in another component proposal, not all components end up being sponsored, so feel free to go ahead and implement this in your own repository if you're not able to get much traction here soon.

Thanks for the proposal and willingness to contribute!

@crobert-1 removed the "needs triage" label Nov 28, 2023
@atoulme
Contributor

atoulme commented Nov 29, 2023

Confirm this is not a vendor-specific component, from what I can tell.

@atoulme
Contributor

atoulme commented Nov 29, 2023

How much of this exporter would you be able to model after the Kafka exporter? Ideally it could be a thin client and use encoding extensions to do most of the heavy lifting, making maintenance easier.

@swar8080
Contributor Author

Thanks @crobert-1 and @atoulme. I was assuming this wouldn't get sponsored and started implementing it just as a learning exercise. Here's what I have so far. This implementation might have optimizations and configuration options that aren't worth the complexity for an alpha component though.

I'll list out the possible scope to help decide if it's worth sponsoring/maintaining. From there I could break the implementation into smaller tasks / pull requests. Lmk what you suggest

Possible MVP

  • Message encoding logic (which I can likely re-use from other exporters like Kafka)
  • "Fire-and-forget" messaging semantics that assumes the user has the right queues already configured. This would be with confirm_mode=false, meaning the collector doesn't wait for asynchronous confirmation that the broker received the message.
  • Standard exporter retry/timeout/queue configuration
  • Custom code to restore unhealthy connections to RabbitMQ since the client library doesn't have this (already implemented)
  • Handle RabbitMQ's form of back pressure
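
The reconnect behavior in the list above could look roughly like the following Python sketch. The Connection interface and the dial factory are hypothetical stand-ins rather than the API of any real client library, and the actual component would be written in Go. It only illustrates the idea of dropping a broken connection and re-dialing on the next publish attempt.

```python
from typing import Callable, Optional, Protocol


class Connection(Protocol):
    """Hypothetical minimal connection interface; not a real client API."""

    def is_open(self) -> bool: ...

    def publish(self, routing_key: str, body: bytes) -> None: ...


class ReconnectingPublisher:
    """Re-dials the broker lazily, on the next publish after a failure."""

    def __init__(self, dial: Callable[[], Connection]) -> None:
        self._dial = dial  # hypothetical factory that opens a new connection
        self._conn: Optional[Connection] = None

    def publish(self, routing_key: str, body: bytes) -> None:
        # Restore the connection only when it is missing or unhealthy.
        if self._conn is None or not self._conn.is_open():
            self._conn = self._dial()
        try:
            self._conn.publish(routing_key, body)
        except ConnectionError:
            # Drop the broken connection so the next attempt re-dials,
            # then surface the error to the caller's retry logic.
            self._conn = None
            raise
```

This sketch is not safe for concurrent use; a real implementation would need to guard the connection with a lock, and would leave retries to the standard retry_on_failure settings.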

Other possible enhancements

  • Wait for asynchronous confirmation that RabbitMQ got the message (already implemented)
  • Support automatic queue creation, or handle asynchronously returned messages that are unroutable as errors
  • Re-use of AMQP channels (i.e. logical connections) to avoid making a few network calls during each batch. This saves ~50ms per batch when locally connecting to an AWS queue in the nearest region. Already implemented but it's likely a premature optimization.

Contributor

This issue has been inactive for 60 days. It will be closed in 60 days if there is no activity. To ping code owners by adding a component label, see Adding Labels via Comments, or if you are unsure of which component this issue relates to, please ping @open-telemetry/collector-contrib-triagers. If this issue is still relevant, please ping the code owners or leave a comment explaining why it is still relevant. Otherwise, please close it.

@atoulme
Contributor

atoulme commented Mar 6, 2024

@swar8080 I'd be happy to sponsor this component and help it land in contrib.

@atoulme added the "Accepted Component" label and removed the "Stale" and "Sponsor Needed" labels Mar 6, 2024
djaglowski pushed a commit that referenced this issue Mar 26, 2024
**Description:**
Sets up the configuration format and common component boilerplate for
the rabbitmq exporter

Implementation will be in other pull requests

**Link to tracking Issue:**
[28891](#28891)

**Testing:** Standard initial unit tests

**Documentation:** Created README for the component

---------

Co-authored-by: Antoine Toulme <antoine@toulme.name>
@romerod

romerod commented Apr 9, 2024

Is this the exporter part of #10592? The existing RabbitMQ receiver can't be used to receive the exported data, right? Or am I missing something?

@swar8080
Contributor Author

@romerod yep, same idea as the issue you mentioned. This component is for sending telemetry to RabbitMQ. The rabbitmqreceiver is for collecting RabbitMQ usage metrics.

@romerod

romerod commented Apr 10, 2024

Thanks @swar8080, understood this, but which receiver can be used to receive the telemetry on the receiving side?

@swar8080
Contributor Author

@romerod gotcha. There's currently no component for pulling messages from RabbitMQ. This component is just for pushing messages to RabbitMQ.

evan-bradley pushed a commit that referenced this issue Apr 18, 2024
**Description:** 
This is the completed implementation of the rabbitmq exporter. 

**Link to tracking Issue:** 

#28891

**Testing:**
- Unit tests
- Happy path with rabbitmq running locally and in the cloud, testing
different configuration options
- Error cases
  - Fail to connect during start-up
  - Invalid credentials
- Connection lost midway through publishing to the queue. The component
attempts reconnecting on the next publish attempt
- Concurrent publishing, both with and without connection issues

**Documentation:** 
Updated README with more configuration options

---------

Co-authored-by: Andrzej Stencel <astencel@sumologic.com>
@cwegener
Contributor

cwegener commented May 10, 2024

This is an exciting component!
I am in a similar situation to @romerod.
I'd be interested in using RabbitMQ queues that are distributed around the edge with this exporter to export OTEL in a fashion that decouples the direct OTLP protocol connection.
And then I want to have a central location that goes to all the "edge" locations and collects the OTEL data from each RabbitMQ instance, for which there currently is no rabbitmq receiver.

djaglowski pushed a commit that referenced this issue May 13, 2024
**Description:**
@atoulme [volunteered to be the
sponsor](#28891 (comment))
of this component, so I believe he should be listed as a code owner.

From
[CONTRIBUTING.md](https://github.com/open-telemetry/opentelemetry-collector-contrib/blob/main/CONTRIBUTING.md#adding-new-components):
```
A sponsor is an approver who will be in charge of being the official reviewer of the code and become a code owner for the component.
```

**Link to tracking Issue:**
#28891