This post outlines the current state of Lighthouse's Bellatrix implementation and enumerates the major outstanding tasks.
As expected, the Lighthouse team has been dedicating a lot of time to the Bellatrix upgrade. For those who don't know, the Bellatrix upgrade is what allows the beacon chain to merge with the current proof-of-work Ethereum chain (a.k.a. "The Merge").
Lighthouse is mostly compatible with the current Bellatrix spec, but it does not yet fully conform to it. We must reach full conformance.
The Bellatrix specifications are the result of an intensive specification and development effort that flared up in late 2021 with the Amphora interop in Greece. Since then, there have been several specification and implementation cycles, each resulting in improvements to the spec. Lighthouse has weathered these cycles and is now in a state where it works well in testnets, but is still missing a few important details.
The task now for the Lighthouse team is to implement the last components of the Bellatrix specification so that we can start testing on a stable production-candidate. We expect to see long-lived merge testnets starting in March, so we are setting our sights on a Bellatrix implementation that is feature-complete by the end of February. That should leave Q2-2022 for public testing and deliberation on when the merge should occur.
Not only does Lighthouse need to implement the specification, but the specification also needs to stabilise. There is currently "turbulence" within the specification as engineers and researchers discover and refine edge-cases. This turbulence is manageable and to be expected; however, the specification will need to be stable before Lighthouse can be stable. To assist, we're striving to be responsive and proactive when it comes to parts of the specification that need work or feedback.
As things stand now, I think we're in a good position to be ready for longer-lived testnets as early as March.
In the interests of understanding the challenge, I've been listing the tasks remaining before Lighthouse is ready in March. To see all the issues and PRs for Bellatrix, see the bellatrix tag.
For those who wish to have a high-level view without necessarily reading all the issues/PRs, all of the major tasks are described below.
Optimistic sync API specs
Over the last couple of months, I've been working on authoring the Optimistic Sync Specification alongside several other contributors, including Mikhail Kalinin, Adrian Sutton and Danny Ryan.
Optimistic Sync is a rather tricky but crucial component for the merge. It's effectively a stop-gap measure to allow the execution engine (e.g., Geth) to lag behind the head of the consensus engine (e.g., Lighthouse) whilst it syncs the Ethereum world-state (e.g., via snap sync). To achieve this, the consensus engine must sometimes import beacon chain blocks without knowing if the execution components (e.g., transactions) are valid or not. We call this type of block import an "optimistic" import.
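To make the idea concrete, here is a minimal sketch of how an import decision could depend on the execution engine's answer. The names `EngineResponse`, `ExecutionStatus` and `import_status` are hypothetical and chosen for illustration; they are not Lighthouse's types or the engine API's wording.

```rust
/// Hypothetical answers an execution engine could give about a payload.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub enum EngineResponse {
    /// The payload was fully verified.
    Valid,
    /// The payload was verified and rejected.
    Invalid,
    /// The engine is still syncing and cannot verify the payload yet.
    Syncing,
}

/// What the consensus engine records about an imported block's payload.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub enum ExecutionStatus {
    /// The payload is known to be valid.
    Valid,
    /// The block was imported without knowing whether the payload is valid.
    Optimistic,
}

/// Decide whether to import a block and with which status.
/// Returns `None` when the block must not be imported.
pub fn import_status(response: EngineResponse) -> Option<ExecutionStatus> {
    match response {
        EngineResponse::Valid => Some(ExecutionStatus::Valid),
        EngineResponse::Invalid => None,
        // The "optimistic" case: import now, learn the verdict later.
        EngineResponse::Syncing => Some(ExecutionStatus::Optimistic),
    }
}
```

With this shape, a syncing execution engine never blocks the consensus engine; it only changes the status attached to the imported block.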
The last big remaining question for optimistic sync is whether or not the consensus engine API should expose any data that is derived from optimistically imported blocks. There are two broad approaches:
- Never communicate information if it's affected by an optimistic assumption.
  - This is strict and "safe" for API consumers.
  - Nodes will likely just return a "503 syncing" error whenever optimistic blocks are involved in a request.
  - It unfortunately prevents VCs from following the chain during an optimistic sync, which might amplify network instability in some circumstances.
- Communicate information about optimistic blocks via the API.
  - Allows VCs to follow the duties of an optimistic chain.
  - Requires thought on how to communicate to users when they're receiving optimistic data.
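The two approaches can be sketched as a single response policy. This is only an illustration: `ApiPolicy`, `ApiResponse`, `respond` and the `optimistic` flag are hypothetical names, and nothing here reflects a decided API design.

```rust
/// The two broad approaches to serving data derived from optimistic blocks.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub enum ApiPolicy {
    /// Never serve data affected by an optimistic assumption.
    Strict,
    /// Serve the data, but tell the consumer it is optimistic.
    Permissive,
}

/// A simplified API response.
#[derive(Debug, Clone, PartialEq, Eq)]
pub enum ApiResponse<T> {
    /// Data plus a (hypothetical) flag marking it as optimistic.
    Ok { data: T, optimistic: bool },
    /// Stands in for an HTTP "503 syncing" error.
    Syncing,
}

/// Build a response for `data`, given whether it was derived from
/// optimistically imported blocks.
pub fn respond<T>(policy: ApiPolicy, data: T, derived_from_optimistic: bool) -> ApiResponse<T> {
    match (policy, derived_from_optimistic) {
        // Strict nodes refuse to answer when optimistic blocks are involved.
        (ApiPolicy::Strict, true) => ApiResponse::Syncing,
        // Otherwise return the data and flag whether it is optimistic.
        (_, optimistic) => ApiResponse::Ok { data, optimistic },
    }
}
```

The sketch shows where the trade-off sits: the strict policy needs no new fields but starves VCs of data, while the permissive policy pushes the burden of interpreting the flag onto every API consumer.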
There is still no consensus among implementations as to which approach is best. I've done some work in sigp/lighthouse#2946 to draft a hybrid approach that restricts some optimistic data, but still allows VCs to follow duties of an optimistic chain (without signing any messages about it). There is also some Discord discussion with useful input from @jgm.
In the interest of finding a catalyst for progress, I (Paul Hauner) will choose an approach and make a PR to the consensus-spec repository. I expect that will spur debate and/or consensus.
Jim McDonald has a PR open to the beacon-APIs repository to add an endpoint that allows consensus clients to provide "payload attributes" to execution clients. These attributes give execution clients look-ahead time to prepare an optimal set of transactions before the consensus client requests a new, fully-formed execution payload.
This task is tracked at sigp/lighthouse#2936.
Retrospectively checking the transition block
The "transition block" is the first block in a chain that includes a non-empty ExecutionPayload. This is effectively the first merging of the proof-of-work chain into the proof-of-stake chain.
This block requires some additional verification when compared to pre- and post-transition blocks; the consensus client must check that the PoW chain has reached the terminal total difficulty (with some extra caveats).
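The core of that check can be sketched as follows, assuming the rule from the Bellatrix fork choice spec that a terminal PoW block is the first one to reach the terminal total difficulty. The `PowBlock` struct is a hypothetical stand-in, `u128` replaces the 256-bit integers used in practice, and the extra caveats mentioned above are left out.

```rust
/// Minimal view of a proof-of-work block.
/// Assumption: `u128` stands in for a 256-bit total difficulty.
#[derive(Debug, Clone, Copy)]
pub struct PowBlock {
    pub total_difficulty: u128,
}

/// Returns `true` if `block` is a valid terminal PoW block: it has reached
/// the terminal total difficulty (`ttd`) while its parent has not.
pub fn is_valid_terminal_pow_block(block: &PowBlock, parent: &PowBlock, ttd: u128) -> bool {
    let is_total_difficulty_reached = block.total_difficulty >= ttd;
    let is_parent_total_difficulty_valid = parent.total_difficulty < ttd;
    is_total_difficulty_reached && is_parent_total_difficulty_valid
}
```

Checking the parent as well as the block is what pins the transition to exactly one PoW block per chain, rather than to any block past the threshold.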
Lighthouse is yet to implement this retrospective verification and is tracking the issue at sigp/lighthouse#2983.
Fork choice retrospective invalidation
Part of optimistic sync involves learning that blocks (specifically their execution payloads) are valid or invalid after importing them. This involves removing invalid blocks (and descendants) from the fork choice tree and marking valid blocks (and ancestors) as no longer optimistic.
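A toy version of those two operations might look like the sketch below. It is not the Lighthouse fork choice implementation; `ForkChoiceTree`, `Node`, `on_valid` and `on_invalid` are hypothetical names, and a `u64` stands in for a 32-byte block root.

```rust
use std::collections::HashMap;

/// Assumption: a `u64` stands in for a 32-byte block root.
pub type Root = u64;

#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub enum ExecutionStatus {
    Valid,
    Optimistic,
}

#[derive(Debug, Clone)]
pub struct Node {
    pub parent: Option<Root>,
    pub status: ExecutionStatus,
}

/// A toy fork choice tree keyed by block root.
#[derive(Debug, Default)]
pub struct ForkChoiceTree {
    pub nodes: HashMap<Root, Node>,
}

impl ForkChoiceTree {
    pub fn insert(&mut self, root: Root, parent: Option<Root>, status: ExecutionStatus) {
        self.nodes.insert(root, Node { parent, status });
    }

    /// The payload of `root` was found valid: the block and all of its
    /// ancestors are no longer optimistic.
    pub fn on_valid(&mut self, root: Root) {
        let mut current = Some(root);
        while let Some(r) = current {
            match self.nodes.get_mut(&r) {
                Some(node) => {
                    node.status = ExecutionStatus::Valid;
                    current = node.parent;
                }
                None => break,
            }
        }
    }

    /// The payload of `root` was found invalid: remove the block and all of
    /// its descendants from the tree.
    pub fn on_invalid(&mut self, root: Root) {
        let mut to_remove = vec![root];
        while let Some(r) = to_remove.pop() {
            if self.nodes.remove(&r).is_some() {
                // Queue every block that builds directly on the removed one.
                let children: Vec<Root> = self
                    .nodes
                    .iter()
                    .filter(|(_, node)| node.parent == Some(r))
                    .map(|(child_root, _)| *child_root)
                    .collect();
                to_remove.extend(children);
            }
        }
    }
}
```

Validity flows towards the root of the tree and invalidity flows towards the leaves, which is why the two operations walk in opposite directions.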
This is a small but fiddly task and is well under way in sigp/lighthouse#2837.
Restrict optimistic imports
The optimistic sync spec introduces some restrictions around when a block might be optimistically imported. These restrictions mitigate some complex and unlikely attacks that are possible during the merge transition.
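As a rough illustration of what such a restriction can look like, the sketch below allows an optimistic import only when the chain has already started carrying execution payloads or when the block is sufficiently old. Both conditions, the `ancestor_has_payload` flag and the `safe_slots` parameter are assumptions made for this example; the optimistic sync spec is the authority on the exact rule and its constants.

```rust
pub type Slot = u64;

/// Returns `true` if a block may be imported optimistically.
///
/// Assumptions for this sketch:
/// - `ancestor_has_payload` says whether the relevant earlier block already
///   includes an execution payload (which block is inspected is left to the spec).
/// - `safe_slots` is how many slots old a block must be before it is treated
///   as safe to import optimistically.
pub fn is_optimistic_candidate(
    ancestor_has_payload: bool,
    block_slot: Slot,
    current_slot: Slot,
    safe_slots: u64,
) -> bool {
    // A block is "deep" when it sits at least `safe_slots` behind the current slot.
    let block_is_deep = block_slot.saturating_add(safe_slots) <= current_slot;
    ancestor_has_payload || block_is_deep
}
```

For example, with an illustrative `safe_slots` of 128, a block at slot 100 with no payload-carrying ancestor becomes a candidate once the current slot reaches 228.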
The PR for this task is at sigp/lighthouse#2986.
Update to the latest execution API spec
Gossip propagation conditions
The Bellatrix specification modifies gossip propagation conditions for beacon blocks. We must check that Lighthouse adheres to these conditions and add regression tests.
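One such condition, as I read the Bellatrix p2p spec, is that a block's execution payload timestamp must match the timestamp of the block's slot. The sketch below shows that single check with hypothetical function signatures; it is not the Lighthouse gossip verification code and it does not cover the other conditions.

```rust
/// Unix timestamp at the start of `slot`, assuming the genesis slot is 0.
pub fn compute_timestamp_at_slot(genesis_time: u64, slot: u64, seconds_per_slot: u64) -> u64 {
    genesis_time + slot * seconds_per_slot
}

/// Returns `true` if the payload timestamp matches the block's slot, which is
/// the condition a gossiped block must meet before it is propagated.
pub fn payload_timestamp_is_valid(
    payload_timestamp: u64,
    genesis_time: u64,
    block_slot: u64,
    seconds_per_slot: u64,
) -> bool {
    payload_timestamp == compute_timestamp_at_slot(genesis_time, block_slot, seconds_per_slot)
}
```

A regression test for this condition could feed in a block whose payload timestamp is off by one second and assert that the block is rejected rather than propagated.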
This should be a simple task since there are already tests that check other gossip propagation conditions.
This task is tracked at sigp/lighthouse#2984.