Create infrastructure for the Holesky network #152
Related PR: eth-clients/holesky#41
Here are some questions:
This network is going to replace Prater, so we would like to continue all of the practices we follow on Prater.
The planned launch date is September 15, 2023, 14:00 UTC.
The current hosts we use from InnovaHosting are like this:
All hosts have 64 GB of RAM, and the disk layout can be customized. Do you think any of those processors would fit, or would you want me to ask IH sales about something different?
I have opened a ticket with the Innova Hosting sales team:
https://client.innovahosting.net/viewticket.php?tid=774553&c=wQeslgJE
Their sales rep responded with:
And I added:
We had a bit of back-and-forth:
I said that 800 GB is fine, so they said:
But then it turns out they actually can do the 1.6 TB ones:
I responded to them via email while on holidays, but it turns out they don't receive those responses. Confirmed today that this is fine and that I need a quote as soon as possible.
Here's the quote:
Which adds up to a total of 56,052 EUR after discounts.
I have opened process ID 3968 in Spiff. Blocked on approval from Johannes. We really need some short links...
Because I selected the "Infrastructure" project, the request was rejected, and I have no way of editing it.
I have asked them to verify whether they purchased the SSDs, and they did:
They have also informed us that they prefer payment in USDT.
I have requested an update on the server setup and got back:
I have informed them that the sooner I can get a few initial servers, the sooner I can work on the new setup to get it done in time.
It also apparently has stable support for the new database model with proper pruning:
It appears both Erigon and Nethermind are also ready:
I've upgraded all 3 roles to the versions that support Holesky:
Will roll out new Geth to other fleets.
Also decided to add the version to Consul metadata:
Which should make it easier to track versions across fleets.
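Recording the version in Consul service metadata might look roughly like this in a service definition (a hypothetical sketch; the service name, port, and field names are assumptions, not the actual infra-nimbus role output):

```yaml
# Hypothetical sketch: expose the deployed client version via Consul service
# metadata so it can be queried per-fleet. Names and values are illustrative.
service:
  name: 'geth-holesky-metrics'
  port: 9090
  meta:
    version: '1.13.1'      # e.g. taken from the deployed binary or image tag
    network: 'holesky'
```

With something like that in place, the version running on every host can be read straight out of Consul's catalog API instead of shelling into each machine.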
Got an email from Dan Popusoi about some servers being close to becoming available:
So hopefully we should have half the fleet up and running today. We'll see.
Also, apparently:
So that's cool... I guess. We'll be testing their new setup :D. Hopefully it won't explode.
#152 Signed-off-by: Jakub Sokołowski <jakub@status.im>
Looks like we will receive the servers tomorrow morning, which means we will have little time for setup. I have prepared a new layout for this fleet in a branch: https://github.com/status-im/infra-nimbus/tree/holesky-testnet infra-nimbus/ansible/vars/layout/holesky.yml Lines 2 to 184 in 82e76b9
The general idea is that we have 30 hosts split into 3 groups:
The idea is to see the behavior of different execution-layer nodes across different branches and numbers of validators.
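In layout terms, the split described above might look something like this (a hypothetical sketch; the real hostnames, branches, and validator ranges are in `holesky.yml` on the linked branch):

```yaml
# Hypothetical sketch of the 30-host / 3-group split: one group per EL client,
# with varying Nimbus branch and validator range. Values are illustrative only.
'geth-01.ih-eu-mda1.nimbus.holesky':
  - { branch: 'stable',   start: 0,    end: 1200 }
'nethermind-01.ih-eu-mda1.nimbus.holesky':
  - { branch: 'testing',  start: 1200, end: 2400 }
'erigon-01.ih-eu-mda1.nimbus.holesky':
  - { branch: 'unstable', start: 2400, end: 3600 }
```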
Key resources:
We got the first batch of hosts, but there's some randomness in the CPUs:
I will ask support about it.
Major issues with GitHub timeouts resulted in an inability to deploy all nodes on time; we were only about 50% ready. The issues were reported to Innova, but not much improved in time. Lots of timeouts when checking out repos:
Which is kind of the same issue we've been having with the InnovaHosting macOS hosts in CI.
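Until the hosting-side network issue is resolved, one mitigation on our end would be retrying flaky checkouts instead of failing the play on the first timeout, e.g. (a sketch; the repo path, destination, and task name are illustrative, not the actual role task):

```yaml
# Hypothetical sketch: retry GitHub checkouts a few times with a delay,
# so transient timeouts don't abort the whole deploy.
- name: Clone nimbus-eth2 repository
  ansible.builtin.git:
    repo: 'https://github.com/status-im/nimbus-eth2'
    dest: '/data/beacon-node/repo'
    version: 'stable'
  register: checkout
  retries: 5
  delay: 30
  until: checkout is succeeded
```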
Here's the config changes:
We'll need to re-deploy validators to use the proper layout once we have all 30 hosts.
Added two more hosts that were bootstrapped:
Also fixed installation of Netdata by using their official APT repository:
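The Netdata fix amounts to pulling the package from their repository instead of the distro one; roughly (a sketch; the key and repository URLs are assumptions based on Netdata's packaging docs, not copied from the actual role):

```yaml
# Hypothetical sketch: install Netdata from the official APT repository.
- name: Add Netdata repository signing key
  ansible.builtin.get_url:
    url: 'https://repo.netdata.cloud/netdatabot.gpg.key'   # assumed key URL
    dest: '/etc/apt/trusted.gpg.d/netdata.asc'

- name: Add Netdata APT repository
  ansible.builtin.apt_repository:
    repo: 'deb https://repo.netdata.cloud/repos/stable/debian/ bookworm/'  # assumed
    state: present

- name: Install Netdata
  ansible.builtin.apt:
    name: 'netdata'
    update_cache: true
```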
Also found a bug with the new Erigon version where the metric type line contains labels. Example:
I checked up on the EL clients and their readiness for the new Holesky genesis on September 28, 2023, 12:00 UTC: Nethermind has the update and a release, Erigon has the update but no release, and Go-Ethereum seems to have neither:

```go
return &Genesis{
	Config:     params.HoleskyChainConfig,
	Nonce:      0x1234,
	ExtraData:  hexutil.MustDecode("0x686f77206d7563682069732074686520666973683f"),
	GasLimit:   0x17d7840,
	Difficulty: big.NewInt(0x01),
	Timestamp:  1694786100,
	Alloc:      decodePrealloc(holeskyAllocData),
}
```

The garbage extra data is still there, and the timestamp is still the old value.
I had to use non-official Docker images for Go-Ethereum and Erigon to support the new Holesky genesis:
This change is temporary until they create proper releases: infra-nimbus/ansible/group_vars/nimbus.holesky.yml Lines 2 to 4 in 7db4374
Deploying now.
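The temporary pin in `nimbus.holesky.yml` presumably amounts to overriding the container image variables, something like (a hypothetical sketch; both the variable names and the image tags are assumptions, the real values are in the linked file):

```yaml
# Hypothetical sketch: pin non-official images until upstream tags releases.
geth_cont_image: 'statusteam/go-ethereum:holesky-genesis'   # assumed tag
erigon_cont_image: 'statusteam/erigon:holesky-genesis'      # assumed tag
```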
Actually, Erigon just created a release: https://github.com/ledgerwatch/erigon/releases/tag/v2.49.3
Looks like Go-Ethereum also had a release 2 hours ago: https://github.com/ethereum/go-ethereum/releases/tag/v1.13.2
Might as well use it.
I discovered a major issue while checking nodes:
Some nodes were showing non-zero ... It turns out that @zah applied the Holesky fix only to the ... But ... Based off of the ...
I also applied the status-im/nimbus-eth2@cfa0268 commit directly to the ...
Looks like instead Zahary merged ...
Some nodes were not online for epoch 0 because I had to purge the data folders for all nodes and then restart, which took ages with Ansible due to endless GitHub timeouts on the hosts:
Innova Hosting really needs to do something about this.
Also found and fixed a bug in the port configuration for the EL nodes:
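The usual shape of such a fix is deriving every port from a single variable, so the client flag, the advertised port, and the firewall rule cannot drift apart, e.g. (a sketch with assumed variable names, not the actual infra-nimbus config):

```yaml
# Hypothetical sketch: derive EL P2P ports from one source so the listen port,
# the advertised port, and the open firewall port always match.
el_node_index: 1
el_p2p_port: '{{ 30303 + el_node_index }}'
open_ports:
  - '{{ el_p2p_port }}/tcp'
  - '{{ el_p2p_port }}/udp'
```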
A set of missed proposals was identified:
Predictably, they cluster around the hosts with the most validators.
All of the misses above are most probably explained by the port mismatch that was fixed in adc1a061.
For some reason the Erigon nodes have no peers and refuse to connect to anything at all. Opened an issue about it: I tried
Found another issue with Erigon, this time it's ignoring
I believe this is now done.
Looks like all the servers finally have the same CPU: Xeon E5-2667 v3.
Nimbus will run 100,000 validators on the new Holesky testnet.
We should plan to add at least 30 new servers to our fleet in order to operate such a large number of validators on this heavier-to-process network.
As usual, our validator keys are available here:
https://github.com/status-im/nimbus-private/tree/master/holesky_deposits