core/txpool/blobpool: 4844 blob transaction pool #26940

Merged
merged 14 commits into from Jul 27, 2023

Conversation

karalabe

@karalabe karalabe commented Mar 21, 2023

Background

BlobPool is the transaction pool dedicated to EIP-4844 blob transactions.

Blob transactions are special snowflakes that are designed for a very specific purpose (rollups) and are expected to adhere to that specific use case. These behavioural expectations allow us to design a transaction pool that is more robust (i.e. resending issues) and more resilient to DoS attacks (e.g. replace-flush attacks) than the generic tx pool. These improvements will also mean, however, that we enforce a significantly more aggressive strategy on entering and exiting the pool:

  • Blob transactions are large. With the initial design aiming for 128KB blobs, we must ensure that these only traverse the network the absolute minimum number of times. Broadcasting to sqrt(peers) is out of the question; rather, these should only ever be announced, and the remote side should request them if it wants them.

  • Block blob-space is limited. With blocks being capped to a few blob txs, we can make use of the very low expected churn rate within the pool. Notably, we should be able to use a persistent disk backend for the pool, solving the tx resend issue that plagues the generic tx pool, as long as there's no artificial churn (i.e. pool wars).

  • The purpose of blobs is layer-2s. Layer-2s are meant to use blob transactions to commit to their own current state, which is independent of Ethereum mainnet (state, txs). This means that there's no reason for blob tx cancellation or replacement, apart from a potential basefee / miner tip adjustment.

  • Replacements are expensive. Given their size, propagating a replacement blob transaction to an existing one should be aggressively discouraged. Whilst generic transactions can start at 1 Wei gas cost and require a 10% fee bump to replace, we suggest requiring a higher min cost (e.g. 1 gwei) and a more aggressive bump (100%).

  • Cancellation is prohibitive. Evicting an already propagated blob tx is a huge DoS vector. As such, a) replacement (higher-fee) blob txs mustn't invalidate already propagated (future) blob txs (cumulative fee); b) nonce-gapped blob txs are disallowed; c) the presence of blob transactions excludes non-blob transactions.

  • Malicious cancellations are possible. Although the pool might prevent txs that cancel blobs, blocks might contain such transactions (malicious miner or flashbotter). The pool should cap the total number of blob transactions per account so as to prevent propagating too much data before cancelling it via a normal transaction. It should nonetheless be high enough to support resurrecting reorged transactions. Perhaps 4-16.

  • Local txs are meaningless. Mining pools historically used local transactions for payouts or for backdoor deals. With 1559 in place, the basefee usually dominates the final price, so 0 or non-0 tip doesn't change much. Blob txs retain the 1559 2D gas pricing (and introduce on top a dynamic data gas fee), so locality is moot. With a disk backed blob pool avoiding the resend issue, there's also no need to save own transactions for later.

  • No-blob blob-txs are bad. Theoretically there's no strong reason to disallow blob txs containing 0 blobs. In practice, admitting such txs into the pool breaks the low-churn invariant as blob constraints don't apply anymore. Even though we could accept blocks containing such txs, a reorg would require moving them back into the blob pool, which can break invariants.

  • Dropping blobs needs delay. When normal transactions are included, they are immediately evicted from the pool since they are contained in the including block. Blobs however are not included in the execution chain, so a mini reorg cannot re-pool "lost" blob transactions. To support reorgs, blobs are retained on disk until they are finalised.

  • Blobs can arrive via flashbots. Blocks might contain blob transactions we have never seen on the network. Since we cannot recover them from blocks either, the engine_newPayload needs to give them to us, and we cache them until finality to support reorgs without tx losses.
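The admission rules above (no empty blob txs, a minimum tip, gap-free nonces, a 100% replacement bump, a per-account cap) can be sketched roughly as follows. This is only an illustration under stated assumptions: `poolTx`, `admit`, `bumped`, the error values and the constants are hypothetical names, the limits are the ones floated in the text rather than the pool's real configuration, and balance/overdraft checks are left out.

```go
package main

import (
	"errors"
	"fmt"
)

// Hypothetical limits lifted from the discussion above; the names are
// illustrative and are not the pool's real configuration.
const (
	maxTxsPerAccount = 16            // upper end of the suggested 4-16 range
	minTipWei        = 1_000_000_000 // suggested 1 gwei minimum
	bumpPercent      = 100           // replacements must at least double each fee
)

// poolTx is a slimmed-down stand-in for a blob transaction.
type poolTx struct {
	nonce                   uint64
	tip, feeCap, blobFeeCap uint64 // all in wei
	blobs                   int
}

var (
	errNoBlobs      = errors.New("blob tx carries no blobs")
	errUnderpriced  = errors.New("tip below pool minimum")
	errNonceTooLow  = errors.New("nonce already used")
	errNonceGap     = errors.New("nonce gap not allowed")
	errReplaceCheap = errors.New("replacement fee bump too small")
	errAccountFull  = errors.New("too many pooled txs for account")
)

// bumped reports whether newFee exceeds oldFee by at least bumpPercent.
// Overflow is ignored to keep the sketch short.
func bumped(oldFee, newFee uint64) bool {
	return newFee >= oldFee*(100+bumpPercent)/100
}

// admit validates tx against the account's pooled txs, which are assumed to
// hold consecutive nonces starting at next (the account's next state nonce).
func admit(pooled []poolTx, next uint64, tx poolTx) error {
	switch {
	case tx.blobs == 0:
		return errNoBlobs
	case tx.tip < minTipWei:
		return errUnderpriced
	case tx.nonce < next:
		return errNonceTooLow
	}
	offset := tx.nonce - next
	switch {
	case offset > uint64(len(pooled)):
		return errNonceGap
	case offset < uint64(len(pooled)):
		// Same nonce as a pooled tx: only allowed with a full fee bump.
		old := pooled[offset]
		if !bumped(old.tip, tx.tip) || !bumped(old.feeCap, tx.feeCap) || !bumped(old.blobFeeCap, tx.blobFeeCap) {
			return errReplaceCheap
		}
	case len(pooled) >= maxTxsPerAccount:
		return errAccountFull
	}
	return nil
}

func main() {
	pooled := []poolTx{{nonce: 5, tip: minTipWei, feeCap: 10 * minTipWei, blobFeeCap: minTipWei, blobs: 1}}
	gapped := poolTx{nonce: 7, tip: minTipWei, feeCap: 10 * minTipWei, blobFeeCap: minTipWei, blobs: 1}
	fmt.Println(admit(pooled, 5, gapped))
}
```

In this sketch a replacement has to bump all three fee fields, which is one possible reading of the "more aggressive bump" rule.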

Whilst some constraints above might sound overly aggressive, the general idea is that the blob pool should work robustly for its intended use case and whilst anyone is free to use blob transactions for arbitrary non-rollup use cases, they should not be allowed to run amok on the network.

Implementation wise there are a few interesting design choices:

  • Adding a transaction to the pool blocks until persisted to disk. This is viable because TPS is low (2-4 blobs per block initially, maybe 8-16 at peak), so natural churn is a couple MB per block. Replacements doing O(n) updates are forbidden and transaction propagation is pull based (i.e. no pileup of pending data).

  • When transactions are chosen for inclusion, the primary criterion is the signer tip (and having a basefee/data fee high enough, of course). However, same-tip transactions will be split by their basefee/datafee, preferring those that are closer to the current network limits. The idea being that very relaxed ones can be included even if the fees go up, when the closer ones could already be invalid.

When the pool eventually reaches saturation, some old transactions - that may never execute - will need to be evicted in favor of newer ones. The eviction strategy is quite complex:

  • Exceeding capacity evicts the highest-nonce of the account with the lowest paying blob transaction anywhere in the pooled nonce-sequence, as that tx would be executed the furthest in the future and is thus blocking anything after it. The smallest is deliberately not evicted to avoid a nonce-gap.

  • Analogously, if the pool is full, the consideration price of a new tx for evicting an old one is the smallest price in the entire nonce-sequence of the account. This avoids malicious users DoSing the pool with seemingly high paying transactions hidden behind a low-paying blocked one.

  • Since blob transactions have 3 price parameters: execution tip, execution fee cap and data fee cap, there's no singular parameter to create a total price ordering on. What's more, since the base fee and blob fee can move independently of one another, there's no pre-defined way to combine them into a stable order either. This leads to a multi-dimensional problem to solve after every block.

  • The first observation is that comparing 1559 base fees or 4844 blob fees needs to happen in the context of their dynamism. Since these fees jump up or down in ~1.125 multipliers (at max) across blocks, comparing fees in two transactions should be based on log1.125(fee) to eliminate noise.

  • The second observation is that the basefee and blobfee move independently, so there's no way to split mixed txs on their own (A has higher base fee, B has higher blob fee). Rather than look at the absolute fees, the useful metric is the max time it can take to exceed the transaction's fee caps. Specifically, we're interested in the number of jumps needed to go from the current fee to the transaction's cap:

    jumps = log1.125(txfee) - log1.125(basefee)

  • The third observation is that the base fee tends to hover around rather than swing wildly. The number of jumps needed from the current fee starts to get less relevant the higher it is. To remove the noise here too, the pool will use log(jumps) as the delta for comparing transactions.

    delta = sign(jumps) * log(abs(jumps))

  • To establish a total order, we need to reduce the dimensionality of the two base fees (log jumps) to a single value. The interesting aspect from the pool's perspective is how fast will a tx get executable (fees going down, crossing the smaller negative jump counter) or non-executable (fees going up, crossing the smaller positive jump counter). As such, the pool cares only about the min of the two delta values for eviction priority.

    priority = min(delta-basefee, delta-blobfee)

  • The above very aggressive dimensionality and noise reduction should result in transactions being grouped into a small number of buckets, the further the fees the larger the buckets. This is good because it allows us to use the miner tip meaningfully as a splitter.

  • For the scenario where the pool does not contain non-executable blob txs anymore, it does not make sense to grant a later eviction priority to txs with high fee caps since it could enable pool wars. As such, any positive priority will be grouped together.

    priority = min(delta-basefee, delta-blobfee, 0)

Optimisation tradeoffs:

  • Eviction relies on 3 fee minimums per account (exec tip, exec cap and blob cap). Maintaining these values across all transactions from the account is problematic as each transaction replacement or inclusion would require a rescan of all other transactions to recalculate the minimum. Instead, the pool maintains a rolling minimum across the nonce range. Updating all the minimums will need to be done only starting at the swapped in/out nonce and leading up to the first no-change.
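A minimal sketch of the rolling minimum, assuming an account's transactions are kept in a nonce-ordered slice; `fees`, `entry`, `push` and `replace` are hypothetical names. Each entry carries the minimum over itself and all lower nonces, so a replacement only needs to repair entries from the swapped nonce up to the first one whose minimum is unchanged.

```go
package main

import "fmt"

// fees holds the three fee parameters of one pooled blob tx (in wei).
type fees struct{ tip, execCap, blobCap uint64 }

// entry pairs a tx's own fees with the rolling minimum over this tx and
// every lower-nonce tx of the same account.
type entry struct{ own, rollMin fees }

func minU64(a, b uint64) uint64 {
	if a < b {
		return a
	}
	return b
}

func minFees(a, b fees) fees {
	return fees{minU64(a.tip, b.tip), minU64(a.execCap, b.execCap), minU64(a.blobCap, b.blobCap)}
}

// push appends a tx at the next nonce and extends the rolling minimum.
func push(txs []entry, f fees) []entry {
	m := f
	if n := len(txs); n > 0 {
		m = minFees(f, txs[n-1].rollMin)
	}
	return append(txs, entry{own: f, rollMin: m})
}

// replace swaps the fees at index i and repairs the rolling minimums from
// there on, stopping at the first entry whose minimum does not change.
// It returns how many entries were inspected.
func replace(txs []entry, i int, f fees) int {
	txs[i].own = f
	inspected := 0
	for ; i < len(txs); i++ {
		m := txs[i].own
		if i > 0 {
			m = minFees(m, txs[i-1].rollMin)
		}
		inspected++
		if m == txs[i].rollMin {
			break // nothing further down can change either
		}
		txs[i].rollMin = m
	}
	return inspected
}

func main() {
	var txs []entry
	for _, tip := range []uint64{5, 3, 4} {
		txs = push(txs, fees{tip: tip, execCap: 100, blobCap: 100})
	}
	n := replace(txs, 2, fees{tip: 10, execCap: 100, blobCap: 100})
	fmt.Println(n, txs[2].rollMin.tip)
	n = replace(txs, 0, fees{tip: 2, execCap: 100, blobCap: 100})
	fmt.Println(n, txs[2].rollMin.tip)
}
```

The rolling minimum of the last entry is then the smallest price in the account's whole nonce sequence, which is the value the eviction rules above compare against.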

Implementation

Storage

The blobpool needs to persist two types of data: transactions that are currently in the pool waiting for inclusion (so we might restart the node without redownloading data; or to be able to track more txs than would fit comfortably in RAM); and transactions that have been recently included (so we can resurrect them on reorgs, since the blobs, commitments and proofs are not part of the canonical tx on chain).

For both these scenarios, the blobpool uses https://github.com/holiman/billy. Billy is a very simplistic data store that has fixed item-sized buckets, represented by flat files. If a bucket is saturated, new items are appended to the corresponding file; whereas if there's a deleted item, the slot is tracked to be filled by a new addition. Upon startup, Billy compacts the files to remove any accumulated gaps.
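The slot reuse described above can be modelled with a toy in-memory bucket. This is not Billy's API (the `bucket` type and its `put`/`remove` methods are made up for illustration, and compaction is left out); it only shows the append-or-fill-gap behaviour.

```go
package main

import (
	"errors"
	"fmt"
)

// bucket is a toy in-memory model of one fixed-slot-size flat file. It is
// not Billy's API, only an illustration of the slot reuse described above.
type bucket struct {
	slotSize int
	slots    [][]byte // slot contents, indexed by slot id
	used     []bool   // whether a slot currently holds an item
	gaps     []int    // ids of deleted slots awaiting reuse
}

// put stores data in a free gap if one exists, else appends a new slot.
func (b *bucket) put(data []byte) (int, error) {
	if len(data) > b.slotSize {
		return 0, errors.New("item larger than slot size")
	}
	item := append([]byte(nil), data...) // defensive copy
	if n := len(b.gaps); n > 0 {
		id := b.gaps[n-1]
		b.gaps = b.gaps[:n-1]
		b.slots[id], b.used[id] = item, true
		return id, nil
	}
	b.slots = append(b.slots, item)
	b.used = append(b.used, true)
	return len(b.slots) - 1, nil
}

// remove frees a slot and records it as a gap for the next put.
func (b *bucket) remove(id int) error {
	if id < 0 || id >= len(b.slots) || !b.used[id] {
		return errors.New("no item at slot")
	}
	b.slots[id], b.used[id] = nil, false
	b.gaps = append(b.gaps, id)
	return nil
}

func main() {
	b := &bucket{slotSize: 4}
	for _, s := range []string{"a", "b", "c"} {
		if _, err := b.put([]byte(s)); err != nil {
			panic(err)
		}
	}
	if err := b.remove(1); err != nil {
		panic(err)
	}
	reused, _ := b.put([]byte("d"))   // fills the gap left at slot 1
	appended, _ := b.put([]byte("e")) // no gaps left, so a new slot is appended
	fmt.Println(reused, appended)
}
```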

Since blob transactions can contain between 1 and 4 blobs, most of the blob transactions are expected to have sizes of 128/256/384/512KB + some overhead from the tx consensus data itself. The PR uses 4KB as a sane overhead limit, expecting most transactions to fit into the 128/256/384/512KB + 4KB item-sized buckets. However, users might end up sending transactions with large input data (pool permits 1MB), so the defined Billy bucket sizes go in 128KB increments all the way up to 512KB + 1MB (for 4x128KB blobs + 1MB max tx size).
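To make the bucket arithmetic concrete, here is a rough sketch that derives a list of slot sizes from the figures above. The constant and function names are hypothetical, and the exact start and end of the series is an assumption of the sketch: it starts at one blob plus the 4KB overhead and stops at the first size that can hold four blobs plus a 1MB transaction.

```go
package main

import "fmt"

// Illustrative constants based on the figures in the paragraph above.
const (
	blobSize   = 128 * 1024  // one blob
	maxBlobs   = 4           // blobs per tx
	txOverhead = 4 * 1024    // expected size of the tx consensus data
	txMaxSize  = 1024 * 1024 // largest tx the pool permits
)

// slotSizes returns bucket sizes in blobSize increments. Assumption: the
// first bucket is one blob plus the overhead, and the last is the first
// size that can hold maxBlobs blobs plus a maximum-size tx.
func slotSizes() []int {
	var sizes []int
	for size := blobSize + txOverhead; ; size += blobSize {
		sizes = append(sizes, size)
		if size >= maxBlobs*blobSize+txMaxSize {
			return sizes
		}
	}
}

// slotFor picks the smallest bucket that fits an item of the given size.
func slotFor(sizes []int, itemSize int) (int, bool) {
	for _, s := range sizes {
		if itemSize <= s {
			return s, true
		}
	}
	return 0, false
}

func main() {
	sizes := slotSizes()
	fmt.Println(len(sizes), sizes[0]/1024, sizes[len(sizes)-1]/1024)

	// A typical 2-blob tx with a little consensus data.
	slot, ok := slotFor(sizes, 2*blobSize+1500)
	fmt.Println(slot/1024, ok)
}
```

Under these assumptions a common 1-4 blob transaction lands in one of the first four buckets, and only transactions with unusually large input data spill into the larger ones.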

Limbo

When the chain does mini reorgs, previously included transactions might get lost from the new canonical chain. To keep things user friendly, the transaction pool constantly monitors the chain for reorgs and whenever one happens, all lost transactions are (attempted to be) moved back into the tx pool to be re-included in later blocks.

For blob transactions however, the chain only contains the consensus data of the blob transaction, but does not contain the blob/commit/proof sidecar needed to actually re-include such transactions in future blocks. The only way to resurrect lost blob transactions back into the pool is to track the sidecar data outside the chain and reassemble when something is reorged out.

The blobpool introduces the notion of the Limbo, which is a small database wrapping a Billy. The Limbo is used as a data dump for recently included but not yet finalised blob sidecars. Whenever blob transactions are included on chain, the sidecars are persisted to the Limbo. As the chain progresses and blocks are finalised, sidecars older than finality will be evicted as they cannot be reorged out any more (or we have bigger issues than a lost blob). If a reorg happens, any blob transaction lost can be reassembled by their consensus data from the old chain and their blob data from the limbo; and thus placed back into the blobpool for inclusion.

Note, any transaction that does not get lost but changes its inclusion height (block number) needs to be updated in the limbo as finality is based on block numbers and we don't want to prematurely flush something that reorged from an older block into a newer block.
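The limbo lifecycle described above (store on inclusion, update on a height change, pull on a reorg loss, drop on finality) can be sketched with a toy in-memory map. All names here are hypothetical, string hashes stand in for real tx hashes, and the actual limbo persists to disk rather than to a map.

```go
package main

import "fmt"

// sidecar stands in for the blobs, commitments and proofs of one tx.
type sidecar struct{ blobs [][]byte }

type limboEntry struct {
	block uint64 // inclusion height, used to decide finality
	sc    sidecar
}

// limbo is a toy in-memory model; the real one persists to disk.
type limbo struct{ entries map[string]limboEntry }

func newLimbo() *limbo { return &limbo{entries: make(map[string]limboEntry)} }

// push records the sidecar of a tx included at the given block.
func (l *limbo) push(hash string, block uint64, sc sidecar) {
	l.entries[hash] = limboEntry{block: block, sc: sc}
}

// update moves a tx to a new inclusion height after a reorg that kept it.
func (l *limbo) update(hash string, block uint64) bool {
	e, ok := l.entries[hash]
	if ok {
		e.block = block
		l.entries[hash] = e
	}
	return ok
}

// pull removes and returns a sidecar so a reorged-out tx can be re-pooled.
func (l *limbo) pull(hash string) (sidecar, bool) {
	e, ok := l.entries[hash]
	delete(l.entries, hash)
	return e.sc, ok
}

// finalize drops every sidecar included at or below the finalised block.
func (l *limbo) finalize(final uint64) int {
	dropped := 0
	for hash, e := range l.entries {
		if e.block <= final {
			delete(l.entries, hash)
			dropped++
		}
	}
	return dropped
}

func main() {
	l := newLimbo()
	l.push("tx-a", 10, sidecar{})
	l.push("tx-b", 11, sidecar{})
	l.update("tx-a", 12)       // tx-a was reorged from block 10 into block 12
	fmt.Println(l.finalize(11)) // only tx-b is old enough to be dropped
	_, ok := l.pull("tx-a")
	fmt.Println(ok)
}
```

Without the `update` call, `tx-a` would have been flushed at finality of block 11 even though it now sits in block 12, which is exactly the premature flush the note above warns about.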

Also important to note, the limbo can only ever help resurrect blob transactions that have been propagated through the network. For blob transactions that "just appeared" in a block (MEV), the execution client never sees the sidecar, so can never re-pool them if lost. This would require blobs to be shared by consensus clients via the engine API, but for now they are against it.

Blobpool

The blobpool is essentially a glorified index around a Billy storing currently pending transactions. As opposed to the Limbo, however, the blobpool's database will store full transactions (consensus + blob sidecars), since we need to store everything needed to survive a node restart.

The blobpool maintains 3 sets of indices for different purposes.

  • The actual index and spent track all the necessary metadata about transactions (grouped by account) to validate them, decide whether the pool admits them (sequential gap-free nonces, no overdrafts, etc) and define the inclusion order. These maps maintain a slimmed down form of the consensus struct that keeps only the bare minimum fields needed in RAM. You can find the rules implemented by these maps in the first half of the Background section above.

  • The lookup is a simple hash to transaction (billy id) mapping to allow resolving lazy transactions (the miner operates on unresolved lazy transactions that can be expanded with the data from disk if it's to be included into a block).

  • Lastly, the evictionHeap defines a total order of the transactions based on the current network conditions (base fee, blob fee) that can be used to evict the worst transaction when the blob pool reaches its capacity. Maintaining this heap is the most expensive computational operation, but due to the 2D nature of fee dynamics (1559 basefee and 4844 blobfee) there's no meaningful optimisation to do apart from making it very fast. The rules of maintaining and updating this heap are listed in the second half of the Background section above.

@holiman

holiman commented Mar 30, 2023

INFO [03-30|07:43:31.757] Imported new chain segment number=16,938,643 hash=db42f2..f8ece0 blocks=59 txs=10416 mgas=898.236 elapsed=8.095s mgasps=110.959 age=2m20s dirty=442.70MiB
INFO [03-30|07:43:32.488] Imported new chain segment number=16,938,648 hash=ee505b..ab5b75 blocks=5 txs=728 mgas=68.164 elapsed=730.724ms mgasps=93.283 age=1m21s dirty=445.28MiB
panic: runtime error: invalid memory address or nil pointer dereference
[signal SIGSEGV: segmentation violation code=0x1 addr=0x10 pc=0x596ae2]
goroutine 60418 [running]:
math/big.(*Int).Bits(...)
math/big/int.go:105
github.com/holiman/uint256.(*Int).SetFromBig(0xc06277c840?, 0xc05d66bdc0?)
github.com/holiman/uint256@v1.2.2-0.20230321075855-87b91420868c/conversion.go:140 +0x22
github.com/holiman/uint256.MustFromBig(...)
github.com/holiman/uint256@v1.2.2-0.20230321075855-87b91420868c/conversion.go:73
github.com/ethereum/go-ethereum/core/txpool/blobpool.(*BlobPool).Add(0xc00d531040, 0xc06277c840, {0x29abcb0?, 0x0, 0x0})
github.com/ethereum/go-ethereum/core/txpool/blobpool/blobpool.go:684 +0x64b
github.com/ethereum/go-ethereum/eth.newHandler.func6({0xc06275f760?, 0x1, 0x4184f3?})
github.com/ethereum/go-ethereum/eth/handler.go:319 +0xb1
github.com/ethereum/go-ethereum/eth/fetcher.(*TxFetcher).Enqueue(0xc005e64000, {0xc023a21200, 0x40}, {0xc06275f760, 0x1, 0x4}, 0x0)
github.com/ethereum/go-ethereum/eth/fetcher/tx_fetcher.go:297 +0x218
github.com/ethereum/go-ethereum/eth.(*ethHandler).Handle(0xc021f10270?, 0xc0317ed5e8?, {0x1ca9938?, 0xc046d27a28?})
github.com/ethereum/go-ethereum/eth/handler_eth.go:77 +0x1ad
github.com/ethereum/go-ethereum/eth/protocols/eth.handleTransactions({0x1cb4ce0, 0xc004d30480}, {0x1cab9e0?, 0xc06277c7e0}, 0xc0066e3d40)
github.com/ethereum/go-ethereum/eth/protocols/eth/handlers.go:529 +0x302
github.com/ethereum/go-ethereum/eth/protocols/eth.handleMessage({0x1cb4ce0, 0xc004d30480}, 0xc0066e3d40)
github.com/ethereum/go-ethereum/eth/protocols/eth/handler.go:250 +0x577
github.com/ethereum/go-ethereum/eth/protocols/eth.Handle({0x1cb4ce0, 0xc004d30480}, 0xc0066e3d40)
github.com/ethereum/go-ethereum/eth/protocols/eth/handler.go:156 +0x3d
github.com/ethereum/go-ethereum/eth/protocols/eth.MakeProtocols.func1.1(0xc0066e3d40?)
github.com/ethereum/go-ethereum/eth/protocols/eth/handler.go:111 +0x27
github.com/ethereum/go-ethereum/eth.(*handler).runEthPeer(0xc004d30480, 0xc0066e3d40, 0xc021f10360)
github.com/ethereum/go-ethereum/eth/handler.go:505 +0x11fb
github.com/ethereum/go-ethereum/eth.(*ethHandler).RunPeer(0x43?, 0xc0254b0000?, 0x1ca9ed8?)
github.com/ethereum/go-ethereum/eth/handler_eth.go:41 +0x19
github.com/ethereum/go-ethereum/eth/protocols/eth.MakeProtocols.func1(0xc025502d20?, {0x1ca9ed8, 0xc0249390e0})
github.com/ethereum/go-ethereum/eth/protocols/eth/handler.go:110 +0x122
github.com/ethereum/go-ethereum/p2p.(*Peer).startProtocols.func1()
github.com/ethereum/go-ethereum/p2p/peer.go:415 +0x8c
created by github.com/ethereum/go-ethereum/p2p.(*Peer).startProtocols

from the run on bench07

@karalabe karalabe added this to the 1.12.1 milestone Jul 26, 2023

@rjl493456442 rjl493456442 left a comment


The latest change looks good to me. The only nitpick is in legacy txpool, each account also has the restriction for pending/queued transactions. Perhaps we can somehow move the logic from the legacypool to txpool, just unify the behaviour between two pools.

But I am pretty sure it's not a trivial thing. So, this PR is good to me!


@MariusVanDerWijden MariusVanDerWijden left a comment


LGTM

@karalabe

> The latest change looks good to me. The only nitpick is in legacy txpool, each account also has the restriction for pending/queued transactions. Perhaps we can somehow move the logic from the legacypool to txpool, just unify the behaviour between two pools.
>
> But I am pretty sure it's not a trivial thing. So, this PR is good to me!

I did think about it, but the main txpool doesn't really have a limit on the sheer number of txs from an account, so somehow introducing something similar would have risky consequences. My thinking was that first we should separate out the locals from the main pool, and then we can have that super lenient, and then we could make the txpool stricter and see where that gets us.

I was even thinking about forbidding non-executable txs altogether from the main pool, that would solve a lot of issues.

@karalabe karalabe merged commit 1662228 into ethereum:master Jul 27, 2023
1 of 2 checks passed
MoonShiesty pushed a commit to MoonShiesty/go-ethereum that referenced this pull request Aug 30, 2023
* core/blobpool: implement txpool for blob txs

* core/txpool: track address reservations to notice any weird bugs

* core/txpool/blobpool: add support for in-memory operation for tests

* core/txpool/blobpool: fix heap updating after SetGasTip if account is evicted

* core/txpool/blobpool: fix eviction order if cheap leading txs are included

* core/txpool/blobpool: add note as to why the eviction fields are not inited in reinject

* go.mod: pull in inmem billy form upstream

* core/txpool/blobpool: fix review commens

* core/txpool/blobpool: make heap and heap test deterministic

* core/txpool/blobpool: luv u linter

* core/txpool: limit blob transactions to 16 per account

* core/txpool/blobpool: fix rebase errors

* core/txpool/blobpool: luv you linter

* go.mod: revert some strange crypto package dep updates
devopsbo3 pushed a commit to HorizenOfficial/go-ethereum that referenced this pull request Nov 10, 2023
devopsbo3 added a commit to HorizenOfficial/go-ethereum that referenced this pull request Nov 10, 2023
@must479
must479 commented Mar 18, 2024

Great team
