Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

cmd/geth: add prune history command #31384

Merged
merged 22 commits into from
Mar 21, 2025
Merged

cmd/geth: add prune history command #31384

merged 22 commits into from
Mar 21, 2025

Conversation

s1na
Copy link
Contributor

@s1na s1na commented Mar 13, 2025

Implements #31135

@fjl
Copy link
Contributor

fjl commented Mar 18, 2025

This is basically done now. One thing we could still add is a recovery path when tx index pruning fails or when it is interrupted. Right now, if indexing is interrupted, the index is in an invalid state, but this state is not detectable. We could fix this by adding a field to the database that marks the tx index as invalid. On next startup, we would find this field, and set the index tail to the current head block.

fjl
fjl previously approved these changes Mar 19, 2025
Copy link
Member

@MariusVanDerWijden MariusVanDerWijden left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

SGTM

@fjl fjl dismissed stale reviews from MariusVanDerWijden and themself via d2591a7 March 19, 2025 08:07
@fjl fjl added this to the 1.15.6 milestone Mar 19, 2025
@@ -361,3 +363,50 @@ func UnindexTransactions(db ethdb.Database, from uint64, to uint64, interrupt ch
func unindexTransactionsForTesting(db ethdb.Database, from uint64, to uint64, interrupt chan struct{}, hook func(uint64) bool) {
unindexTransactions(db, from, to, interrupt, hook, false)
}

// PruneTransactionIndex removes all tx index entries below a certain block number.
func PruneTransactionIndex(db ethdb.Database, pruneBlock uint64) {
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Any particular reason to unindex the transactions by traversing the entire tx lookups in db? Why not reuse the UnindexTransaction with block range?

e.g.

// PruneTransactionIndex removes all tx index entries below a certain block number.
func PruneTransactionIndex(db ethdb.Database, pruneBlock uint64) {
	tail := ReadTxIndexTail(db)
	if tail == nil || *tail > pruneBlock {
		return // no index, or index ends above pruneBlock
	}
	UnindexTransactions(db, *tail, pruneBlock, nil, true)
}

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We decided to implement it like this because unindexing everything takes 45 minutes, while this scan takes only 1.5min (on Sepolia).

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Oh really, interesting

@rjl493456442
Copy link
Member

One missing part is that: txindexer has no knowledge about the chain cutoff now, and it will reindex the missing part by default. (e.g. with txlookupLimit = 0)

It could be problematic as block bodies decoder will fail with an empty blob.

@s1na
Copy link
Contributor Author

s1na commented Mar 19, 2025

@rjl493456442 on that note can you please check my PR? :)
#31393

@rjl493456442
Copy link
Member

@rjl493456442 on that note can you please check my PR? :) #31393

Working on it. It's actually pretty complicated

log.Error("Tx index pruning iterator error", "err", iter.Error())
}

WriteTxIndexTail(db, pruneBlock)
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It's dangerous to delete the transaction indexes first and then update the tail.
As it can lead to a weird situation that corresponding transaction index is not found while the tail indicates it should exist.

Anyway, I think we can reuse the one here https://github.com/s1na/go-ethereum/blob/txlookup-cutoff/core/rawdb/accessors_indexes.go#L106

@rjl493456442
Copy link
Member

The whole idea is pretty straightforward. Let's rebase it on top of txindexer PR, and get it merged

fjl added a commit that referenced this pull request Mar 21, 2025

Verified

This commit was created on GitHub.com and signed with GitHub’s verified signature.
In #31384 we unindex TXes prior to the merge block. However when the
node starts up it will try to re-index those back if the config is to index the
whole chain. This change makes the indexer aware of the history cutoff block,
avoiding reindexing in that segment.

---------

Co-authored-by: Gary Rong <garyrong0905@gmail.com>
Co-authored-by: Felix Lange <fjl@twurst.com>
@fjl fjl merged commit 8fe09df into ethereum:master Mar 21, 2025
3 of 4 checks passed
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

Successfully merging this pull request may close these issues.

None yet

6 participants