Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Increase the Default of Batch Response Max Size #27923

Closed
nisdas opened this issue Aug 14, 2023 · 12 comments · Fixed by #27924
Closed

Increase the Default of Batch Response Max Size #27923

nisdas opened this issue Aug 14, 2023 · 12 comments · Fixed by #27924

Comments

@nisdas
Copy link
Contributor

nisdas commented Aug 14, 2023

In v1.12.1,

This PR was included, which added hard request/response limits for batched rpc requests. Unfortunately this particular change has broken interaction between Prysm and Geth running via the default flags. Many users have run into the following error on updating to v1.12.1 and v1.12.2:

level=error msg="Could not reconstruct full bellatrix block batch from blinded bodies" error="could not fetch execution blocks with txs by hash [0x2ba2e89b5da48fafec3faea50973d098d90a5861a16c4f0788d575c192dc891c 0x035aa077371c512deb63952e98ecc9949f8688c4a6e3a05314ba5d7b1866764f 0x4c6190528a63e024c70720cee8d2386846d922b55de1f8840fa483f5de871299 ....0x76146ea392d432369b7f6b3c152260a25cf658232fd985cc11562c6a427de4c2 0xbcaa5d0d27aa4f5294360791bebb20cc6c46ab143ed916a51ce81cf75fb79957 0x191b69b4edaff888a1c11e94ea7cc0a00a06c635d6b6f6b7d47a1edbaa364c31]: response too large" prefix=sync

With these hard limits being enforced by geth, now any set of execution blocks being requested exceeds the response limit and the whole request fails. A workaround provided to users has been to use this flag with geth:

--rpc.batch-response-max-size=50000000

While the workaround works right now, we would prefer if geth and prysm had no issues communicating with the default set of flags. Is it possible to increase the default max size here ? If that is not viable, how about disabling the hard rpc limit for engine methods ? Communication between the consensus and execution client is already trusted, so it wouldn't be an issue for consensus clients to request a large amount of payloads from the execution client.

@haoei
Copy link

haoei commented Aug 14, 2023

Let me add some information:
execution client: Geth v1.12.2
consensus client: beacon-chain Prysm/v4.0.7
Reproduce method: run geth without --rpc.batch-response-max-size flag and monitor beacon-chain logs.

@jsvisa
Copy link
Contributor

jsvisa commented Aug 14, 2023

how about disabling the hard rpc limit for engine methods?

Agree with that one, as CL <-> EL's RPCs are guaranteed by JWT auth, so the limit can be weakened.

@fjl
Copy link
Contributor

fjl commented Aug 14, 2023

We can definitely increase these limits for the engine API!

@fjl
Copy link
Contributor

fjl commented Aug 14, 2023

@nisdas I'm curious about this statement:

With these hard limits being enforced by geth, now any set of execution blocks being requested exceeds the response limit and the whole request fails.

The default limit is 25MB response size. What is prysm requesting there?

Please note, if you are using the go-ethereum rpc client, that we have updated it in v1.12.1 to better deal with failed batch requests. The new version will correctly return the requests that were fulfilled even if the response is incomplete.

@nisdas
Copy link
Contributor Author

nisdas commented Aug 14, 2023

@fjl When serving block batches in p2p to external peers we request the corresponding execution blocks from geth which is why the response is large.

Please note, if you are using the go-ethereum rpc client, that we have updated it in v1.12.1 to better deal with failed batch requests. The new version will correctly return the requests that were fulfilled even if the response is incomplete.

Yeap, we have updated Prysm to take advantage of the changes in the rpc client for this case.

@fjl
Copy link
Contributor

fjl commented Aug 14, 2023

Please try the PR #27924

@fjl
Copy link
Contributor

fjl commented Aug 14, 2023

Also let us know what limits you think should be applied to the engine endpoint.

@nisdas
Copy link
Contributor Author

nisdas commented Aug 15, 2023

Great, thanks ! Will give the PR a shot

@nisdas
Copy link
Contributor Author

nisdas commented Aug 16, 2023

Hey, @fjl I ran this particular PR on a custom devnet with full blocks and it seems to have worked. Unfortunately we can't test the PR in our testing infrastructure for mainnet until there is a tagged docker image for it.

Also let us know what limits you think should be applied to the engine endpoint.

I think the current limits applied in the PR are fine, if in a future ethereum fork execution payloads become much bigger then we can revisit them but they looks good to me

@jsvisa
Copy link
Contributor

jsvisa commented Aug 16, 2023

until there is a tagged docker image for it.

@nisdas is this image(build with #27924) jsvisa/go-ethereum:v20230816 available for you?

@nisdas
Copy link
Contributor Author

nisdas commented Aug 16, 2023

@jsvisa I can give that a try

@nisdas
Copy link
Contributor Author

nisdas commented Aug 16, 2023

I think I will have difficulties running that in our infra since its a non-official image. After the PR is merged in, and an image is built in hub.docker.com/r/ethereum/client-go , I can use the latest image to test the particular issue out for mainnet. For what its worth I validated that the PR fixed response too large from being returned for any requests to the auth rpc endpoint in a small devnet(geth was returning it in previous runs).

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

Successfully merging a pull request may close this issue.

5 participants
@fjl @jsvisa @haoei @nisdas and others