-
Notifications
You must be signed in to change notification settings - Fork 58
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Add support for selectively choosing deduplication keys. #102
Conversation
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thanks for opening a pull request. I think this is a valid feature to consider.
dedup_keys = job.deduplication_keys if job.respond_to?(:deduplication_keys) | ||
dedup_keys ||= Aws::Rails::SqsActiveJob.config.deduplication_keys.map(&:to_s) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think it makes sense to use symbol throughout the method (including for job_id), and then convert to string when doing body.slice. It would eliminate to_s here and in deduplicate_with.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Use to_s within each definition to uniquely add job_id to keys within the deduplicate_without method and excluded_deduplication_keys getter definitions.
Anyway, I agree with your ideas and would be happy to hear any other good ideas.
# client will be created if none is provided. | ||
# | ||
# @option options [Array] :deduplication_keys ([]) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
It'd be nice to convey the default here (all properties except job id) since it is not actually an empty array, or maybe the configuration should be changed to be an exclude list instead of an override (which job id can be hard coded appended to), whichever has better customer experience.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I'd personally lean towards an exclude list instead of include list. That way the default doesn't need to change if other fields are added in the future. And in general, I'd imagine most use cases would want to just exclude 1 or 2 additional fields?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Revised with feedback from alextwoods :)
module Aws | ||
module Rails | ||
module SqsActiveJob | ||
extend ActiveSupport::Concern |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Good use of concern here. Are there any other SQS active job related examples in this gem we can clean up as such?
message_body: instance_of(String), | ||
message_attributes: instance_of(Hash), | ||
message_group_id: Aws::Rails::SqsActiveJob.config.message_group_id, | ||
message_deduplication_id: instance_of(String) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Can we try to assert the default and the override deduplication_id below to ensure that the configuration is working as expected?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I fixed this and added some specs.
It looks like test failures are related to MiniTest somehow. The JRuby 9.4 failures appear to be either some deprecation or bug. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Overall looks good and I think is a useful feature - thanks for submitting!
|
||
ignored_dedup_key = 'job_id' | ||
|
||
if dedup_keys.include?(ignored_dedup_key) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
It might be better to check for an invalid key upfront, when its set rather then every time its used (raise the error where its being made) - I think you could do that in the setter for deduplication_keys.
# client will be created if none is provided. | ||
# | ||
# @option options [Array] :deduplication_keys ([]) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I'd personally lean towards an exclude list instead of include list. That way the default doesn't need to change if other fields are added in the future. And in general, I'd imagine most use cases would want to just exclude 1 or 2 additional fields?
…eys to adopt, one can choose which keys to exclude. That way the default doesn't need to change if other fields are added in the future.
c7e035a
to
472630e
Compare
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Nice! Just some minor comments and I'll take a look at fixing CI ..
# client will be created if none is provided. | ||
# | ||
# @option options [Array] :excluded_deduplication_keys ('job_id') |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Default should be (['job_id'])
since it's Array
message_body: instance_of(String), | ||
message_attributes: instance_of(Hash), | ||
message_group_id: Aws::Rails::SqsActiveJob.config.message_group_id, | ||
message_deduplication_id: instance_of(String) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Is there a way to strongly assert that message_deduplication_id does not contain job_class and queue_name here? Maybe an implementation detail but we could assert Digest::SHA256.hexdigest
or Aws::Json.dump
doesn't receive them.
The test failures in main should be fixed. |
I pushed a change to main that configured and ran rubocop on everything, as well as bump the minimum rails version to 2.3. Please check to see if main merged in correctly and didn't stomp on any of your changes. |
Please also fix your rubocop errors (new CI I've added) |
Strongly assert that message_deduplication_id does not contain keys to exclude.
Confirmation about Rubocop:
Rubocop also limits the number of lines in the test files, but this may be an unnecessary calibration due to the functionality of RSpec. RSpec is written in a dedicated DSL, which inevitably increases the number of lines per block. If you agree with this, I will add the following code to .rubocop.yml to avoid the warning: # .rubocop.yml
Metrics/BlockLength:
Exclude:
- 'test/**/*.rb'
Metrics/ModuleLength:
Exclude:
- 'test/**/*.rb' |
Note that Rubocop CI still detects errors unless #102 (comment) is applied or fixed individually. |
Yeah, it's fine to add to to-do or global ignore. |
#102 (comment) |
Thanks for your contribution! I will get this released tomorrow. |
Actually, I was able to do this today, please check to see if it works for you! |
Thanks a ton! |
Description of changes:
Previously, when using ContentBasedDeduplication, only job_id was excluded from the deduplication key. However, depending on the requirements of Rails applications, it may be necessary to exclude keys other than job_id as well.
I have addressed this issue by allowing the deduplication key to be customized in the configuration or within each job class.
Please note that, by default, only job_id is excluded.
By submitting this pull request, I confirm that my contribution is made under
the terms of the Apache 2.0 license.