Lighthouse Update #27

What's been happening

A lot has happened in Lighthouse land since the last update. In dot-point form:

The Altona (multi-client) v0.12.1 testnet was successfully launched and is actively running.
Preliminary client benchmarks show Lighthouse leading in all tested metrics, including sync speed and memory usage.
Preliminary tests of a new BLS library show sync speeds on Altona increasing from 130 slots/sec to 230 slots/sec.
Introduction of a more sophisticated peer scoring/management system, which brings greater security and stability to Lighthouse.
CLI options allowing the creation of Lighthouse standalone boot nodes.
Work on the slashing service has started.

Altona

Altona is the latest multi-client testnet targeting v0.12.1. It was created with Lighthouse, Prysm, Teku and Nimbus validators. Although the testnet started smoothly, it has had some rough patches that revealed bugs in various implementations.

For Lighthouse, this testnet has uncovered a number of bugs that we have been working to resolve over the last week. Some relate to Lighthouse's stability (we were seeing sync stalls and excessive CPU usage) and some to attestation efficiency (some validators were missing attestations). The validator leaderboard shows which validators have earned the most Eth since genesis, which is indicative of the liveness, connectivity and general reward performance of a client. Even with these bugs present, it showed a number of Lighthouse clients performing well.

We are still focusing on addressing these known bugs, which should improve the performance and ultimately the user's rewards when staking with a Lighthouse client. Most of these bugs have now been fixed in this Pull Request.

BLST

SupraNational has been working on a fast BLS library alongside the Eth2 researchers. We have performed some preliminary tests with their new implementation.

To give some perspective on the processing requirements of a Lighthouse node, the majority of CPU load occurs when processing blocks during sync. Around 60% of block processing time is spent on BLS signature verification, so a fast BLS implementation directly reduces a client's CPU usage and increases its sync speed: the faster we can process a block, the faster we can sync.

In our preliminary tests, previous sync speeds on the Altona testnet were around 130 slots per second. With the new library we saw speeds in excess of 230 slots per second, roughly a 75% increase in block processing throughput and hence sync speed.
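As a rough back-of-envelope check on these figures, the sketch below computes the throughput gain, the reduction in time per slot, and (via Amdahl's law) the BLS verification speedup that would be needed to explain the change on its own. It assumes that the roughly 60% BLS share applies to the old library and that sync time is dominated by block processing. The function names are illustrative and are not part of Lighthouse.

```rust
/// Relative throughput gain when going from `old_rate` to `new_rate`
/// (both in slots per second). 0.5 means "50% more slots per second".
fn throughput_gain(old_rate: f64, new_rate: f64) -> f64 {
    new_rate / old_rate - 1.0
}

/// Relative reduction in processing time per slot for the same change.
/// 0.5 means "each slot takes half as long".
fn time_reduction(old_rate: f64, new_rate: f64) -> f64 {
    1.0 - old_rate / new_rate
}

/// Amdahl's law, solved for the component speedup `s`:
///
///   overall = 1 / ((1 - f) + f / s)   =>   s = f / (1 / overall - (1 - f))
///
/// `fraction` (f) is the share of time originally spent in the component
/// (assumed here to be BLS verification). Returns `None` when the overall
/// speedup could not be reached by accelerating that component alone.
fn implied_component_speedup(fraction: f64, overall: f64) -> Option<f64> {
    let remaining = 1.0 / overall - (1.0 - fraction);
    if remaining > 0.0 {
        Some(fraction / remaining)
    } else {
        None
    }
}
```

Plugging in 130 and 230 slots/sec, the throughput gain is about 77%, the time per slot drops by about 43%, and the implied BLS speedup is about 3.6x under the stated assumptions. Since the measured speeds were "in excess of" 230 slots/sec, these are indicative numbers only.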

Further improvements are still coming to the BLS library and to Lighthouse's sync strategy, which should increase these numbers further.

Peer Scoring

Lighthouse was previously using a relatively rudimentary peer management system. Faulty or malicious nodes were kicked and, after a short period of time, allowed to reconnect. This has a number of shortcomings, most notably for security, as malicious or faulty peers could simply reconnect.

A new system has recently been introduced which actively scores a peer based on its behaviour. This allows Lighthouse to rank peers and disconnect and ban peers that are not performing as expected or are actively malicious.

This not only improves the security of Lighthouse by identifying, scoring and ultimately kicking and banning malicious peers, but it also allows us to identify non-performant peers (especially during sync) and to disfavour or disconnect them to gain greater stability.

The scoring system is being introduced in stages, and it will start playing a greater role in more aspects of the client as it evolves and becomes more sophisticated.
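To make the idea concrete, here is a minimal sketch of a score-based peer manager. It is not Lighthouse's implementation: the type names, behaviour categories, score deltas and thresholds are all hypothetical, chosen only to illustrate how scoring can drive ranking, disconnection and banning.

```rust
use std::collections::HashMap;

// Hypothetical bounds and thresholds, for illustration only.
const MAX_SCORE: f64 = 100.0;
const MIN_SCORE: f64 = -100.0;
const DISCONNECT_THRESHOLD: f64 = -20.0;
const BAN_THRESHOLD: f64 = -50.0;

/// Hypothetical categories of observed peer behaviour.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
enum Behaviour {
    Useful,
    Slow,
    InvalidMessage,
    Malicious,
}

/// What the node should do with a peer, given its current score.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
enum Status {
    Healthy,
    Disconnect,
    Banned,
}

impl Behaviour {
    /// Score change applied for one observation of this behaviour.
    fn delta(self) -> f64 {
        match self {
            Behaviour::Useful => 1.0,
            Behaviour::Slow => -5.0,
            Behaviour::InvalidMessage => -10.0,
            // Large enough to push any score down to MIN_SCORE.
            Behaviour::Malicious => MIN_SCORE - MAX_SCORE,
        }
    }
}

/// Tracks a score per peer; unknown peers start at 0.
#[derive(Default)]
struct PeerScores {
    scores: HashMap<String, f64>,
}

impl PeerScores {
    /// Records a behaviour for `peer` and returns the resulting status.
    fn report(&mut self, peer: &str, behaviour: Behaviour) -> Status {
        let score = self.scores.entry(peer.to_string()).or_insert(0.0);
        *score = (*score + behaviour.delta()).clamp(MIN_SCORE, MAX_SCORE);
        Self::status_for(*score)
    }

    /// Maps a score onto an action using the thresholds above.
    fn status_for(score: f64) -> Status {
        if score <= BAN_THRESHOLD {
            Status::Banned
        } else if score <= DISCONNECT_THRESHOLD {
            Status::Disconnect
        } else {
            Status::Healthy
        }
    }

    /// Peers ordered from best to worst score, e.g. to pick sync peers.
    fn ranked(&self) -> Vec<(&str, f64)> {
        let mut peers: Vec<(&str, f64)> = self
            .scores
            .iter()
            .map(|(peer, score)| (peer.as_str(), *score))
            .collect();
        peers.sort_by(|a, b| b.1.total_cmp(&a.1));
        peers
    }
}
```

In this sketch a peer that is repeatedly slow drifts down to the disconnect threshold, while a single malicious action results in an immediate ban. A real system would likely also decay scores over time and persist bans, which is omitted here.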

Still to do

There are a number of core milestones Lighthouse needs to reach before being ready for mainnet. At a high level, the things we still need to achieve, and are currently working on, are:

A fully-functioning User Interface - This is currently underway (keep an eye on these updates for its official release).
A slashing service - This is a service that monitors the beacon chain and detects slashable events, keeping validators honest. This too is currently under development.
A complete, standardised API - This is also currently being completed.
Starting from a Weak Subjectivity state - This allows Lighthouse to skip sync for a large portion of the chain that is known to be valid.
Gossipsub 1.1 - Protocol Labs have designed a security-enhanced version of gossipsub, which we still need to complete building in rust-libp2p (under active development and should be completed soon).
Upgrade our entire network stack for DoS resistance - This involves sophisticated network message limiting on all protocols. It will be part of a phase in which Sigma Prime carries out extensive security testing and upgrades focused solely on improving the security of the client.
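As an illustration of the kind of message limiting mentioned in the last item, the sketch below shows a token bucket, one common rate-limiting technique. The update does not say which algorithm Lighthouse will use, so this is an assumption; the `TokenBucket` type and its parameters are hypothetical.

```rust
use std::time::{Duration, Instant};

/// A token bucket: each message spends tokens, and tokens refill at a
/// fixed rate up to `capacity`. A peer that sends faster than the refill
/// rate runs out of tokens and has its messages rejected.
struct TokenBucket {
    capacity: f64,
    tokens: f64,
    refill_per_sec: f64,
    last_refill: Instant,
}

impl TokenBucket {
    /// Creates a full bucket. `now` is passed in so callers (and tests)
    /// control the clock.
    fn new(capacity: f64, refill_per_sec: f64, now: Instant) -> Self {
        TokenBucket {
            capacity,
            tokens: capacity,
            refill_per_sec,
            last_refill: now,
        }
    }

    /// Returns true and spends `cost` tokens if the message is allowed
    /// at time `now`; returns false and spends nothing otherwise.
    fn allow(&mut self, cost: f64, now: Instant) -> bool {
        // Refill for the time elapsed since the last call, capped at capacity.
        let elapsed = now
            .saturating_duration_since(self.last_refill)
            .as_secs_f64();
        self.tokens = (self.tokens + elapsed * self.refill_per_sec).min(self.capacity);
        // Never move the refill clock backwards.
        self.last_refill = self.last_refill.max(now);

        if self.tokens >= cost {
            self.tokens -= cost;
            true
        } else {
            false
        }
    }
}
```

One bucket per peer and per protocol would let expensive requests carry a higher `cost` than cheap ones. That design choice is only a suggestion here.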

We're getting very close!