Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Fix hash_cache miss when leaf is brought up on sibling deletion #13307

Merged
merged 1 commit into from
May 17, 2024

Conversation

msmouse
Copy link
Contributor

@msmouse msmouse commented May 16, 2024

Description

Original thought was in these rare cases we just waste a constructed Child to make the code simply, however it wasn't considered that there's a panic on hash_cache miss. I think the panic is valuable assertion so this is fixing it by some extra code.

Type of Change

Which Components or Systems Does This Change Impact?

  • Validator Node

How Has This Been Tested?

locally tested by running the replay tool.

unit test coverage was not able to catch this because deletion is not tested enough. Gonna update proptest separately though, to fix forward quickly.

Key Areas to Review

Checklist

  • I have read and followed the CONTRIBUTING doc
  • I have performed a self-review of my own code
  • I have commented my code, particularly in hard-to-understand areas
  • I identified and added all stakeholders and component owners affected by this change as reviewers
  • I tested both happy and unhappy path of the functionality
  • I have made corresponding changes to the documentation

Copy link

trunk-io bot commented May 16, 2024

⏱️ 18h 56m total CI duration on this PR
Job Cumulative Duration Recent Runs
windows-build 3h 18m 🟩🟩🟩🟩🟩
forge-framework-upgrade-test / forge 3h 2m 🟥🟥🟥🟩
execution-performance / single-node-performance 2h 🟥🟥🟥🟩🟩 (+2 more)
rust-smoke-tests 2h 🟩 (+2 more)
forge-e2e-test / forge 1h 35m 🟥🟥🟩🟩 (+1 more)
rust-images / rust-all 1h 🟩🟩 (+2 more)
rust-targeted-unit-tests 53m 🟩🟩 (+4 more)
forge-compat-test / forge 52m 🟩🟩🟩
run-tests-main-branch 36m 🟩🟩🟩🟩 (+4 more)
rust-lints 34m 🟩🟩 (+4 more)
cli-e2e-tests / run-cli-tests 26m 🟩🟩🟥
rust-move-tests 25m 🟩🟩🟩 (+4 more)
execution-performance / test-target-determinator 24m 🟩🟩🟩🟩🟩 (+2 more)
check 23m 🟩🟩🟩 (+2 more)
rust-build-cached-packages 22m 🟩🟩 (+2 more)
test-target-determinator 20m 🟩🟩🟩 (+2 more)
general-lints 14m 🟩🟩🟩 (+4 more)
check-dynamic-deps 12m 🟩🟩🟩🟩🟩 (+4 more)
indexer-grpc-e2e-tests / test-indexer-grpc-docker-compose 6m 🟩🟩🟩
semgrep/ci 4m 🟩🟩🟩🟩🟩 (+4 more)
node-api-compatibility-tests / node-api-compatibility-tests 3m 🟩🟩🟩🟩
file_change_determinator 2m 🟩🟩🟩🟩🟩 (+4 more)
file_change_determinator 1m 🟩🟩🟩🟩🟩 (+2 more)
file_change_determinator 1m 🟩🟩🟩🟩🟩 (+4 more)
permission-check 34s 🟩🟩🟩🟩🟩 (+3 more)
permission-check 30s 🟩🟩🟩🟩🟩 (+4 more)
permission-check 29s 🟩🟩🟩🟩🟩 (+2 more)
permission-check 26s 🟩🟩🟩🟩🟩 (+3 more)
permission-check 26s 🟩🟩🟩🟩🟩 (+3 more)
determine-docker-build-metadata 15s 🟩🟩🟩🟩🟩 (+2 more)

🚨 4 jobs on the last run were significantly faster/slower than expected

Job Duration vs 7d avg Delta
cli-e2e-tests / run-cli-tests 9m 7m +38%
run-tests-main-branch 5m 4m +30%
rust-targeted-unit-tests 10m 19m -47%
rust-move-tests 4m 11m -68%

settingsfeedbackdocs ⋅ learn more about trunk.io

@msmouse msmouse requested review from areshand, zekun000 and a team May 16, 2024 21:25
@msmouse msmouse force-pushed the 0516-alden-fix-slice-node branch from 7e48cab to cc67513 Compare May 16, 2024 21:29
@msmouse msmouse added the CICD:run-e2e-tests when this label is present github actions will run all land-blocking e2e tests from the PR label May 16, 2024

This comment has been minimized.

This comment has been minimized.

This comment has been minimized.

This comment has been minimized.

@msmouse msmouse enabled auto-merge (squash) May 16, 2024 22:24

This comment has been minimized.

This comment has been minimized.

This comment has been minimized.

This comment has been minimized.

This comment has been minimized.

This comment has been minimized.

This comment has been minimized.

This comment has been minimized.

This comment has been minimized.

This comment has been minimized.

This comment has been minimized.

This comment has been minimized.

@msmouse msmouse force-pushed the 0516-alden-fix-slice-node branch from cc67513 to 0b81f05 Compare May 17, 2024 00:26
@msmouse
Copy link
Contributor Author

msmouse commented May 17, 2024

This stack of pull requests is managed by Graphite. Learn more about stacking.

Join @msmouse and the rest of your teammates on Graphite Graphite

This comment has been minimized.

This comment has been minimized.

This comment has been minimized.

@msmouse msmouse disabled auto-merge May 17, 2024 00:45
@msmouse msmouse force-pushed the 0516-alden-fix-slice-node branch from 2094e04 to 0b81f05 Compare May 17, 2024 00:46
@msmouse msmouse removed the CICD:run-e2e-tests when this label is present github actions will run all land-blocking e2e tests from the PR label May 17, 2024
@msmouse msmouse enabled auto-merge (squash) May 17, 2024 00:51
Original thought was in these rare cases we just waste a constructed
Child to make the code simply, however it wasn't considered that there's
a panic on hash_cache miss. I think the panic is valuable assertion so
this is fixing it by some extra code.
@msmouse msmouse force-pushed the 0516-alden-fix-slice-node branch from 0b81f05 to 3b0c139 Compare May 17, 2024 00:53

This comment has been minimized.

This comment has been minimized.

This comment has been minimized.

Copy link
Contributor

✅ Forge suite compat success on 01b24e7e3548382dd25440b39a0438a993387f12 ==> 3b0c1398c533572d6d57b0cbbb4df46704426a26

Compatibility test results for 01b24e7e3548382dd25440b39a0438a993387f12 ==> 3b0c1398c533572d6d57b0cbbb4df46704426a26 (PR)
1. Check liveness of validators at old version: 01b24e7e3548382dd25440b39a0438a993387f12
compatibility::simple-validator-upgrade::liveness-check : committed: 5721.783165383451 txn/s, latency: 5291.776183457604 ms, (p50: 4800 ms, p90: 9600 ms, p99: 15100 ms), latency samples: 230680
2. Upgrading first Validator to new version: 3b0c1398c533572d6d57b0cbbb4df46704426a26
compatibility::simple-validator-upgrade::single-validator-upgrade : committed: 1702.4397655718778 txn/s, latency: 17189.76356562137 ms, (p50: 20400 ms, p90: 24400 ms, p99: 25700 ms), latency samples: 86100
3. Upgrading rest of first batch to new version: 3b0c1398c533572d6d57b0cbbb4df46704426a26
compatibility::simple-validator-upgrade::half-validator-upgrade : committed: 1600.066275423786 txn/s, latency: 17968.469412724306 ms, (p50: 19900 ms, p90: 23300 ms, p99: 23600 ms), latency samples: 85820
4. upgrading second batch to new version: 3b0c1398c533572d6d57b0cbbb4df46704426a26
compatibility::simple-validator-upgrade::rest-validator-upgrade : committed: 3587.613825548151 txn/s, latency: 8710.558010895376 ms, (p50: 9700 ms, p90: 12600 ms, p99: 12900 ms), latency samples: 143180
5. check swarm health
Compatibility test for 01b24e7e3548382dd25440b39a0438a993387f12 ==> 3b0c1398c533572d6d57b0cbbb4df46704426a26 passed
Test Ok

Copy link
Contributor

✅ Forge suite realistic_env_max_load success on 3b0c1398c533572d6d57b0cbbb4df46704426a26

two traffics test: inner traffic : committed: 8193.114993635438 txn/s, latency: 4780.274566469098 ms, (p50: 4500 ms, p90: 5600 ms, p99: 10200 ms), latency samples: 3546460
two traffics test : committed: 99.91612903048029 txn/s, latency: 1944.25 ms, (p50: 1900 ms, p90: 2100 ms, p99: 4900 ms), latency samples: 1760
Latency breakdown for phase 0: ["QsBatchToPos: max: 0.210, avg: 0.203", "QsPosToProposal: max: 0.244, avg: 0.223", "ConsensusProposalToOrdered: max: 0.460, avg: 0.410", "ConsensusOrderedToCommit: max: 0.402, avg: 0.384", "ConsensusProposalToCommit: max: 0.806, avg: 0.794"]
Max round gap was 1 [limit 4] at version 1837005. Max no progress secs was 4.970982 [limit 15] at version 1837005.
Test Ok

This comment has been minimized.

This comment has been minimized.

Copy link
Contributor

✅ Forge suite framework_upgrade success on 01b24e7e3548382dd25440b39a0438a993387f12 ==> 3b0c1398c533572d6d57b0cbbb4df46704426a26

Compatibility test results for 01b24e7e3548382dd25440b39a0438a993387f12 ==> 3b0c1398c533572d6d57b0cbbb4df46704426a26 (PR)
Upgrade the nodes to version: 3b0c1398c533572d6d57b0cbbb4df46704426a26
framework_upgrade::framework-upgrade::full-framework-upgrade : committed: 1351.8042073768763 txn/s, submitted: 1354.3339532566222 txn/s, failed submission: 2.5297458797457706 txn/s, expired: 2.5297458797457706 txn/s, latency: 2302.3004338210276 ms, (p50: 2100 ms, p90: 3800 ms, p99: 5400 ms), latency samples: 117560
framework_upgrade::framework-upgrade::full-framework-upgrade : committed: 1167.1318346117635 txn/s, submitted: 1169.4279822305157 txn/s, failed submission: 2.2961476187522396 txn/s, expired: 2.2961476187522396 txn/s, latency: 2690.110948258902 ms, (p50: 2100 ms, p90: 4800 ms, p99: 6900 ms), latency samples: 101660
5. check swarm health
Compatibility test for 01b24e7e3548382dd25440b39a0438a993387f12 ==> 3b0c1398c533572d6d57b0cbbb4df46704426a26 passed
Upgrade the remaining nodes to version: 3b0c1398c533572d6d57b0cbbb4df46704426a26
framework_upgrade::framework-upgrade::full-framework-upgrade : committed: 1332.1155653186202 txn/s, submitted: 1334.1528286569014 txn/s, failed submission: 2.0372633382812007 txn/s, expired: 2.0372633382812007 txn/s, latency: 2628.5145861212004 ms, (p50: 2100 ms, p90: 4500 ms, p99: 6900 ms), latency samples: 104620
Test Ok

@msmouse msmouse merged commit 6bf4d7b into main May 17, 2024
47 of 51 checks passed
@msmouse msmouse deleted the 0516-alden-fix-slice-node branch May 17, 2024 03:58
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

None yet

3 participants