Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Removed data ivar from JSON::Validator so that multiple validate call become faster #465

Merged
merged 1 commit into from May 10, 2022

Conversation

ganmacs
Copy link
Contributor

@ganmacs ganmacs commented Feb 16, 2022

In my environment, I call JSON::Validator.validate many times with huge JSON Schema definitions and it takes a lot of time to return value. After some investigations, I found that JSON::Validator#initialize_schema in JSON::Validator#initialize takes most of the time.

This patch removes @data from instance variables of JSON::Validator so that it can avoid a lot of duplicated object creation. it's really useful and efficient when you want to apply the same schema but different data multiple time.

schema = ....
validator = JSON::Validator.new(schema)

validator.validate(data1)
validator.validate(data2)
validator.validate(data3)

flamegraph

code: https://gist.github.com/ganmacs/29f6217bbd3658d7f1f0010d0cadedd7 (json_schema_test.rb)

$ bundle exec ruby json_schema_test.rb
$ stackprof --flamegraph stackprof.dump > flamegraph
$ stackprof --flamegraph-viewer=flamegraph

Before
Screen Shot 2022-02-16 at 14 23 11

After
Screen Shot 2022-02-16 at 14 25 33

benchmark

code: https://gist.github.com/ganmacs/29f6217bbd3658d7f1f0010d0cadedd7 (benchmark.rb)

       user     system      total        real
before 100  0.137017   0.020746   0.157763 (  0.158190)
before 1000  1.349775   0.209938   1.559713 (  1.572352)
before 5000  6.520370   0.991955   7.512325 (  7.585911)
after 100  0.040351   0.010675   0.051026 (  0.051197)
after 1000  0.373515   0.096487   0.470002 (  0.471269)
after 5000  2.010080   0.548916   2.558996 (  2.582410)

@ganmacs ganmacs marked this pull request as ready for review February 16, 2022 05:26
@ganmacs ganmacs marked this pull request as draft February 16, 2022 05:27
@codecov
Copy link

codecov bot commented Feb 16, 2022

Codecov Report

Merging #465 (d96b036) into master (081dfc3) will increase coverage by 0.00%.
The diff coverage is 100.00%.

Impacted file tree graph

@@           Coverage Diff           @@
##           master     #465   +/-   ##
=======================================
  Coverage   89.83%   89.84%           
=======================================
  Files          74       74           
  Lines        1555     1556    +1     
=======================================
+ Hits         1397     1398    +1     
  Misses        158      158           
Impacted Files Coverage Δ
lib/json-schema/validator.rb 85.06% <100.00%> (+0.04%) ⬆️

Continue to review full report at Codecov.

Legend - Click here to learn more
Δ = absolute <relative> (impact), ø = not affected, ? = missing data
Powered by Codecov. Last update 081dfc3...d96b036. Read the comment docs.

@ganmacs ganmacs force-pushed the cache-schema branch 2 times, most recently from 69421f4 to d6e45e5 Compare February 16, 2022 06:12
@ganmacs ganmacs marked this pull request as ready for review February 16, 2022 06:13
@ganmacs ganmacs changed the title Cache schema and re-use it to speed up Removed data ivar from JSON::Validator so that multiple validate call become faster Feb 16, 2022
@bastelfreak
Copy link
Member

@ganmacs thanks for the PR. Could you rebase please to pull in our latest CI changes?

Copy link
Member

@smortex smortex left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

These changes make sense, loading the schema and the data at the same time does not really makes sense so changing the API to split these two operations is I think a good move.

It reduces a lot of duplicated object creation when you want to apply
the same schema but different data

```
schema = ....
validator = JSON::Validator.new(schema)

validator.validate(data1)
validator.validate(data2)
validator.validate(data3)
```
@ganmacs
Copy link
Contributor Author

ganmacs commented Feb 17, 2022

Thank you guys for your review. Rebased.

Copy link
Member

@ekohl ekohl left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I missed this, but it looks good to me.

@ekohl ekohl merged commit 5fee4c5 into voxpupuli:master May 10, 2022
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

Successfully merging this pull request may close these issues.

None yet

5 participants