Simple and Fast Consensus with Zug
The Casper node was designed with a pluggable consensus protocol in mind. Until now, the only choice was Highway. Casper 2.0.0 adds Zug, a much simpler consensus protocol.
The Zug protocol requires that less than one-third of the validator weight is attributable to faulty validators. It also assumes that there is an upper bound on the network delay, that is, on the time it takes for a message sent by a correct validator to be delivered. The value of this bound may be unknown, but it must exist. Under these conditions, all correct nodes will reach agreement on a chain of finalized blocks.
Of course, all nodes in a network have to run the same protocol to work together, but when starting a new network or upgrading an existing one, either Highway or Zug can now be selected as the consensus_protocol in the chainspec file. The Casper Mainnet will switch to Zug.
How Zug Works
In every round, the designated leader can sign a proposal message to suggest a block. The proposal also points to an earlier round in which the parent block was proposed.
Each validator then signs an echo message with the proposal's hash. Correct validators only sign one echo per round, so at most one proposal can get echo messages signed by a quorum. A quorum is a set of validators whose total weight is greater than (n + f) / 2, where n is the total weight of all validators and f is the maximum allowed total weight of faulty validators. Thus, any two quorums always have a correct validator in common. As long as n > 3f, the correct validators will constitute a quorum since (n + f) / 2 < n - f.
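The quorum arithmetic can be checked with a few lines of Python. This is only an illustrative sketch: the function names are made up for this example and are not part of the Casper node, and weights are assumed to be integers.

```python
def is_quorum(weight: int, n: int, f: int) -> bool:
    """True if `weight` is strictly greater than (n + f) / 2."""
    # Compare 2 * weight with n + f to stay in integer arithmetic.
    return 2 * weight > n + f


def min_quorum_weight(n: int, f: int) -> int:
    """Smallest integer weight that forms a quorum."""
    return (n + f) // 2 + 1


def min_quorum_overlap(n: int, f: int) -> int:
    """Least weight that two quorums must share.

    Two sets of weight q1 and q2 drawn from a total of n overlap in at
    least q1 + q2 - n; the bound is smallest for two minimal quorums.
    """
    return 2 * min_quorum_weight(n, f) - n


n, f = 100, 33  # n > 3f holds
assert is_quorum(n - f, n, f)        # the correct validators alone form a quorum
assert min_quorum_overlap(n, f) > f  # the overlap cannot consist of faulty weight only

n, f = 99, 33  # n == 3f: the condition fails
assert not is_quorum(n - f, n, f)    # correct validators no longer reach a quorum
```

The second case shows why the inequality n > 3f is strict: with exactly one-third faulty weight, the correct validators alone do not exceed (n + f) / 2.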
The proposal is accepted if there is a quorum and some other conditions are met (see below). Now, the next round's leader can make a new proposal that uses this proposal as a parent.
Each validator that observes the accepted proposal in time signs a Vote(true) message. Validators that time out while waiting sign a Vote(false) message instead. If a quorum signs true, the round is committed and the proposal and all its ancestors are finalized. If a quorum signs false, the round is skippable, meaning that the next round's leader can propose a block with a parent from an earlier round. Correct validators sign either true or false, never both, so a round can be either committed or skippable, but not both.
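As a rough illustration of how votes determine a round's outcome, the sketch below tallies weighted votes against the quorum threshold (n + f) / 2. The names RoundOutcome and round_outcome, and the representation of votes as a dictionary, are assumptions made for this example and do not mirror the node's actual data structures.

```python
from enum import Enum


class RoundOutcome(Enum):
    UNDECIDED = "undecided"
    COMMITTED = "committed"
    SKIPPABLE = "skippable"


def round_outcome(votes: dict[str, bool], weights: dict[str, int], f: int) -> RoundOutcome:
    """Classify a round from the votes seen so far.

    votes   maps a validator ID to its vote (True or False);
            each validator appears at most once.
    weights maps every validator ID to its weight.
    f       is the maximum tolerated faulty weight.
    """
    n = sum(weights.values())
    yes = sum(weights[v] for v, vote in votes.items() if vote)
    no = sum(weights[v] for v, vote in votes.items() if not vote)
    # A quorum is any weight strictly greater than (n + f) / 2.
    if 2 * yes > n + f:
        return RoundOutcome.COMMITTED
    if 2 * no > n + f:
        return RoundOutcome.SKIPPABLE
    return RoundOutcome.UNDECIDED


weights = {"a": 1, "b": 1, "c": 1, "d": 1}
print(round_outcome({"a": True, "b": True, "c": True}, weights, f=1))
print(round_outcome({"a": True, "b": True, "c": False, "d": False}, weights, f=1))
```

With four validators of equal weight and f = 1, a quorum needs weight 3, so three true votes commit the round, while a two-to-two split leaves it undecided.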
If there is no accepted proposal, all correct validators will eventually vote false, so the round becomes skippable. This is what makes the protocol live. The next leader will eventually be allowed to make a proposal because either there is an accepted proposal that can be the parent, or the round will eventually be skippable, and an earlier round's proposal can be used as a parent. If the timeout is long enough, the correct proposers' blocks will usually get finalized.
For a proposal to be accepted, the parent proposal must also be accepted, and all rounds between the parent and the current round must be skippable. This is what makes the protocol safe. If two rounds are committed, one of their proposals must be an ancestor of the other, because a committed round is not skippable. Thus, the protocol cannot finalize two conflicting blocks.
Of course, there is also a first block. Whenever all earlier rounds are skippable (particularly the first round), the leader may propose a block with no parent.
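The acceptance rule, including the case of a block with no parent, can be sketched as a small recursive check. The Round record and the is_accepted function are hypothetical names used only for illustration; rounds are assumed to be numbered from 0, and each round is assumed to already know whether it has an echo quorum and whether it is skippable.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class Round:
    has_echo_quorum: bool   # a quorum signed echoes for this round's proposal
    parent: Optional[int]   # round of the parent proposal, or None for no parent
    skippable: bool         # a quorum voted false in this round


def is_accepted(rounds: list[Round], r: int) -> bool:
    """Check whether the proposal in round r is accepted."""
    rnd = rounds[r]
    if not rnd.has_echo_quorum:
        return False
    if rnd.parent is not None and rnd.parent >= r:
        return False  # a parent must come from an earlier round
    # All rounds strictly between the parent and r must be skippable.
    # With no parent, that means all earlier rounds.
    first = 0 if rnd.parent is None else rnd.parent + 1
    if not all(rounds[i].skippable for i in range(first, r)):
        return False
    # The parent proposal must itself be accepted.
    return rnd.parent is None or is_accepted(rounds, rnd.parent)


rounds = [
    Round(has_echo_quorum=True, parent=None, skippable=True),
    Round(has_echo_quorum=True, parent=0, skippable=False),
    Round(has_echo_quorum=True, parent=0, skippable=False),
]
print([is_accepted(rounds, r) for r in range(len(rounds))])
```

In this example the proposal in round 2 is rejected: it names round 0 as its parent, but round 1 lies in between and is not skippable.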
Every new signed message is optimistically sent directly to all peers. We want to guarantee that it is eventually seen by all validators, even if they are not fully connected. This is achieved via a pull-based randomized gossip mechanism, where a SyncRequest message containing information about a random part of the local protocol state is periodically sent to a random peer. The peer compares that to its local state and responds with all the signed messages that it has recorded.
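The pull-based exchange can be pictured with a toy model. Everything here is an assumption made for illustration: the protocol state is reduced to a set of message labels per round, the "random part" is a single randomly chosen round, and the peer is assumed to return only the messages for that round that the request did not list. The real SyncRequest format in the node is more involved.

```python
import random
from dataclasses import dataclass


@dataclass(frozen=True)
class SyncRequest:
    round_id: int          # the randomly chosen part of the state
    known: frozenset[str]  # messages the sender already has for that round


def make_sync_request(state: dict[int, set[str]], rng: random.Random) -> SyncRequest:
    """Describe one random round of a non-empty local state."""
    round_id = rng.choice(sorted(state))
    return SyncRequest(round_id, frozenset(state[round_id]))


def handle_sync_request(state: dict[int, set[str]], request: SyncRequest) -> set[str]:
    """Return the recorded messages for that round that the sender lacks."""
    return state.get(request.round_id, set()) - request.known


alice = {1: {"echo:a"}}
bob = {1: {"echo:a", "echo:b", "vote:b"}}

request = make_sync_request(alice, random.Random(0))
response = handle_sync_request(bob, request)
alice[request.round_id] |= response  # alice catches up on round 1
print(sorted(alice[1]))
```

Because each request covers only a small random slice of the state, individual messages stay cheap, and repeated requests to random peers eventually cover everything.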
The Zug protocol can be summarized as follows:
- In every round, the round leader proposes a new block, B.
- Every validator creates and broadcasts an echo message, with a signature of B.
- When a suitable block B has received echoes from 67% of the validators:
  - The next round begins. The next leader can propose a child of B.
  - Every validator signs and broadcasts a vote message, voting yes.
- If this does not happen before a timeout, the validators vote no instead.
  - If 67% of the validators vote no, the next round begins, too. The next leader can propose a child of an earlier block and skip this round.
- If 67% of the validators vote yes, B is finalized and gets executed, together with all its ancestors. (Usually, the next round has already started at this point.)
Notice that proposals, votes, and echoes are broadcast, so if one correct node receives a message, all nodes will eventually receive it. An honest validator sends only one echo or vote per round. So, unless 34% of validators double-sign, at most one block per round gets 67% echoes, and no finalized block can ever be skipped, ensuring safety. As long as there are 67% of echoes for a proposal, the next round begins and Zug doesn't get stuck. If there are not, everyone votes no, and the next round also begins.
A Simple Example
Let's review a simple scenario demonstrating the Zug consensus. The example shows five rounds, each with a different leader, in which the nodes vote on a card suit. The bottom row indicates whether or not the round was finalized. Notice that round 5 was the first finalized round.
In round 1, we had a leader who proposed ♥, but was slow, so the other nodes timed out and voted no.
The first round had a proposal and was skippable, but nothing was finalized.
In round 2, the second proposer saw ♥ and proposed ♣ as a child of ♥. Some nodes voted yes, and some timed out and voted no. So, round 2 will never output anything because there wasn't a decision.
In round 3, the leader proposed ♦ as a child of ♣. Assuming the leader was still too slow, everyone voted no, and round 3 became skippable even though it had a proposal.
In round 4, the proposer might have crashed or been malicious, so everyone timed out and voted no.
In round 5, the leader didn't see the