
I've read many papers and slides on Practical Byzantine Fault Tolerance (PBFT) but I'm still confused about why a COMMIT phase is required. Most material states that

  • PREPARE phase ensures fault-tolerant consistent ordering of requests within views
  • COMMIT phase ensures fault-tolerant consistent ordering of requests across views

Some tutorials gloss over this with a remark such as:

The PREPARE phase ensures that a majority of correct replicas has agreed on a sequence number for a client’s request. Yet that order could be modified by a new leader elected in a view change.

Can someone show me an example of how COMMIT interacts with view changes?

qweruiop

2 Answers


PBFT is a masterpiece, both for its technical breakthrough and for its exquisitely precise language. Many of the protocol descriptions in the paper are worth reading multiple times to grasp all the nuances.

I will:

  1. quote the original paper (some of the math notation is in LaTeX there; I will use pseudocode instead)
  2. add my own understanding/interpretation
  3. ask and answer the important questions myself
  4. answer your question directly by giving an example

Pre-Prepare and Prepare phases

We define the predicate prepared(m,v,n,i) ....

The pre-prepare and prepare phases of the algorithm guarantee that non-faulty replicas agree on a total order for the requests within a view. More precisely, they ensure the following invariant: if prepared(m,v,n,i) is true then prepared(m',v,n,j) is false for any non-faulty replica j (including i = j) and any m' such that D(m') !=D(m) .

This means that once a replica reaches the "prepared" stage (i.e. it has logged the pre-prepare plus 2f matching PREPAREs from different backups, so 2f+1 replicas in total vouch for the assignment), it can be certain, if non-faulty, that within view v the sequence number n is bound to message m and to no other message.
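A minimal sketch of that predicate may make the counting concrete. The `Msg` record, its field names, and the list-based log are my own illustration, not the paper's data structures, and the sketch skips the paper's extra requirement that the request m itself is in the log:

```python
from dataclasses import dataclass

@dataclass(frozen=True)
class Msg:
    kind: str    # "PRE-PREPARE" or "PREPARE" (hypothetical encoding)
    view: int
    seq: int
    digest: str  # stands in for D(m)
    sender: int

def prepared(log, view, seq, digest, f):
    """True if the log holds a pre-prepare for (view, seq, digest) plus
    2f matching prepares from distinct replicas other than the primary."""
    key = (view, seq, digest)
    pre = [m for m in log
           if m.kind == "PRE-PREPARE" and (m.view, m.seq, m.digest) == key]
    if not pre:
        return False
    primary = pre[0].sender
    backups = {m.sender for m in log
               if m.kind == "PREPARE"
               and (m.view, m.seq, m.digest) == key
               and m.sender != primary}
    return len(backups) >= 2 * f

# f = 1: the pre-prepare from replica 0 plus prepares from replicas 1 and 2.
log = [
    Msg("PRE-PREPARE", 0, 1, "d(m)", 0),
    Msg("PREPARE", 0, 1, "d(m)", 1),
    Msg("PREPARE", 0, 1, "d(m)", 2),
]
print(prepared(log, view=0, seq=1, digest="d(m)", f=1))
```

Note that the view is part of the key, which is why this predicate alone says nothing about what happens to n after a view change.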

Commit phase

The commit phase ensures the following invariant: if committed-local(m,v,n,i) is true for some non-faulty i then committed(m,v,n) is true. This invariant and the view-change protocol described in Section 4.4 ensure that non-faulty replicas agree on the sequence numbers of requests that commit locally even if they commit in different views at each replica.

^^^ This was the clue that sent me back to reread the definition of "committed-local" over and over:

and committed-local(m,v,n,i) is true if and only if prepared(m,v,n,i) is true and i has accepted 2f+1 commits (possibly including its own) from different replicas that match the pre-prepare for m; a commit matches a pre-prepare if they have the same view, sequence number, and digest.

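^^^ Notice that? The commits must match the pre-prepare, which means the same view, sequence number, and digest. Remember that "PREPARE" only gives certainty on ordering within a view. What committed-local adds is the knowledge that 2f+1 replicas have announced they are prepared for (m,v,n), and it is this quorum that the view-change protocol relies on to carry the binding of n to m into later views.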
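Read literally, the quoted definition of committed-local can be sketched as below. The tuple layout `(view, seq, digest, sender)` and the function signature are assumptions made for illustration only; the replica's own prepared status is passed in as a boolean to keep the sketch short:

```python
def committed_local(is_prepared, commits, view, seq, digest, f):
    """commits: iterable of (view, seq, digest, sender) tuples.

    A commit counts only if its view, sequence number and digest all
    match, and each sender is counted once.
    """
    matching_senders = {
        sender
        for (v, n, d, sender) in commits
        if (v, n, d) == (view, seq, digest)
    }
    return is_prepared and len(matching_senders) >= 2 * f + 1

# f = 1, so 2f+1 = 3 matching commits are needed.
same_view = [(0, 1, "d(m)", 0), (0, 1, "d(m)", 1), (0, 1, "d(m)", 2)]
mixed_views = [(0, 1, "d(m)", 0), (0, 1, "d(m)", 1), (1, 1, "d(m)", 2)]

print(committed_local(True, same_view, 0, 1, "d(m)", f=1))
print(committed_local(True, mixed_views, 0, 1, "d(m)", f=1))
```

The second call shows the consequence of the matching rule: a commit from another view does not add to the count for view 0.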

Maybe you know why by now, but if it is still not clear, please read on.

Q: Why is the COMMIT phase necessary?

  1. There are replicas (non-faulty or otherwise) that didn't receive enough PREPARE messages, either because of a lossy network or because they were offline for a while, so they can't reach the PREPARED stage on their own. But when they hear 2f+1 replicas broadcasting matching COMMIT messages, they can be certain that (m,v,n) is committed in the system. (Strictly, by the definition above, committed-local(m,v,n,i) still requires prepared(m,v,n,i) at replica i itself.)
  2. An obvious benefit of the COMMIT stage is that it accelerates the agreement/consensus process. Intuitively, you can think of it this way: I'm on my way to school and Bob tells me school is closed today. I can't take one person's word as truth. But as I walk on, I see more and more classmates (who have already checked whether school is closed) coming back and telling me the same thing. Once a majority has told me so, I take their word for it before reaching the school myself.

Example of cross view COMMIT

There are 4 replicas in total. For a message m, I, the replica, received a COMMIT: (COMMIT, m, v, n, i=2), meaning that in view v, node #2 told me it is prepared for m at n and ready to commit. But since I didn't receive any other commits, I couldn't reach the "committed-local" stage.

Now a "NEW-VIEW" message has been passed around. During the protocol redo, other replicas multicast PREPARE messages for each message between min-s and max-s, and later I receive another COMMIT: (COMMIT, m, v+1, n, i=3).

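Now, can I reach "committed-local", since I have one commit in view v and another in view v+1?

Not from these two alone. By the definition quoted above, a commit matches a pre-prepare only if the view, sequence number, and digest are all the same, so the commit from view v does not count toward view v+1, and in a 4-replica system (f=1) I need 2f+1 = 3 matching commits in any case. What the protocol does give me is this: in this example the new primary re-issued a PRE-PREPARE for the same m with the same n in view v+1, which is why commits for (m, v+1, n) are arriving. Once I collect 3 of them, I commit m at n, the same assignment that any replica which committed in view v already has.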

Hope it helps! Any further comment is welcomed. Cheers!

Alex Xiong

PBFT does need the commit phase to ensure that message m keeps sequence number n even across a view change. This is because if a message m with sequence number n is committed at some replica after 2f+1 COMMITs are received, this (n,m) will be included in the NEW-VIEW message and thus rebroadcast to all other nodes by the new primary.

This question confused me for quite a long time and today I finally get it. I will try to explain it clearly, but I suggest you be very familiar with the paper; at the least you should know what "prepared" and "commit_local" mean.

Overview of PBFT

PBFT has three algorithms:

  • Request handling algorithm

    • The client sends its message to the primary, i.e. the replica with "id == currentView mod |R|"
    • Three messages
      • 1) Pre-prepare(n, m, v, i). A replica accepts this if, in view v, it has not already accepted another pre-prepare that assigns sequence number n to a different request m'. The condition "in view v" is important because it ensures that a faulty primary cannot block a replica by sending a faulty Pre-prepare message.
      • 2) Prepare(n, m, v, i). If a replica holds the pre-prepare plus 2f valid matching Prepare messages (2f+1 replicas in total), it is prepared. Two non-faulty prepared replicas will agree on (n,m). This is because any two sets of 2f+1 replicas out of 3f+1 share at least one non-faulty node k, and k cannot have sent different messages for the same n in view v.
      • 3) Commit(n, m, v, i). Once a replica reaches "prepared", it broadcasts a Commit message and waits for another 2f Commits. Then it is committed locally, which ensures that at least f+1 non-faulty replicas are prepared.
  • View change

    • To ensure liveness.
    • Messages:
      • 1) View-Change: If a node times out, it creates a View-Change message. This message carries all prepared messages (with proofs) known to the node. For simplicity I ignore the CHECKPOINT message.
      • 2) New-View: If the new primary gathers 2f+1 View-Change messages, it takes all the prepared messages in them. The highest sequence number among them is marked as "max-s". The new primary broadcasts New-View and the replicas redo the protocol for each sequence number.
  • Garbage collection algorithm

    • A way to remove logs after backing up. We don't discuss it in this post.
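The numbers used throughout this overview (who the primary is, how many faults are tolerated, how large a quorum is) can be written down as a small sketch. The function names are mine, and I assume the usual setting where the replica count is |R| = 3f+1:

```python
def primary_of(view, num_replicas):
    """Id of the primary in a given view: currentView mod |R|."""
    return view % num_replicas

def max_faults(num_replicas):
    """Largest f such that 3f + 1 <= |R|."""
    return (num_replicas - 1) // 3

def quorum_size(num_replicas):
    """Number of matching messages (2f + 1) needed to prepare or commit."""
    return 2 * max_faults(num_replicas) + 1

# With 4 replicas: one fault tolerated, quorums of 3,
# and the primary rotates 0, 1, 2, 3, 0, ... as views advance.
for view in range(5):
    print(view, primary_of(view, 4))
print(max_faults(4), quorum_size(4))
```

A view change therefore moves the primary role to the next replica id, which is how a faulty primary eventually gets replaced.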

Why can't the commit phase be omitted?

Commit phase is key to safety during view change

From a high level view, the commit phase ensures this invariant:

If commit_local(n,m,v) is true for some replica i, then commit_local(n,m',v+1) is false for any non-faulty replica i' and any m' different from m.

Proof:

If commit_local(n,m,v) is reached at some replica i, then 2f+1 replicas (Q1) have declared that they are prepared. The New-View message is built from 2f+1 View-Change messages (call their senders Q2). Out of 3f+1 replicas, Q1 and Q2 share at least f+1 members, so at least one non-faulty replica in Q1 has its View-Change included in the New-View message. Hence there is at least one prepared message (n,m) in the New-View message, and it is broadcast to all other replicas along with it.

Any replica i' receiving New-View is in one of two states:

1) It has already committed (n,m).

2) It hasn't committed (n,m) yet. It may be waiting for COMMITs, or PREPAREs, or even the PRE-PREPARE, whatever.

In the second case, it will reprocess (n,m) from the pre-prepare phase in view v+1. So (n,m) is kept during the view change. In short, the commit phase ensures that if (n,m) is committed at some replica, then (n,m) will be included in NEW-VIEW and thus sent to all other replicas in the view-change phase.
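The quorum-overlap step in the proof (any two sets of 2f+1 replicas out of 3f+1 share at least f+1 members, hence at least one non-faulty one) can be checked by brute force for small f. This is only a sanity check under the assumption |R| = 3f+1, and the function name is mine:

```python
from itertools import combinations

def min_quorum_overlap(f):
    """Smallest intersection over all pairs of (2f+1)-subsets of 3f+1 replicas."""
    n = 3 * f + 1
    quorums = [set(q) for q in combinations(range(n), 2 * f + 1)]
    return min(len(a & b) for a in quorums for b in quorums)

for f in (1, 2):
    overlap = min_quorum_overlap(f)
    # At most f of the shared replicas can be faulty, so overlap - f
    # is a lower bound on the non-faulty replicas in both quorums.
    print(f, overlap, overlap - f)
```

The general argument is the counting bound 2(2f+1) - (3f+1) = f+1; the loop just confirms it for f = 1 and f = 2.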

What if we omit the commit phase?

Safety is destroyed, because m can be committed with n at some replicas while another m' is committed with n at other replicas.

Suppose a non-faulty replica commits the request as soon as (m,n) is prepared in view v. This means 2f+1 replicas (Q1) declared that they received the pre-prepare for (m,n) in view v.

At the same time, it is possible that no other replica is prepared yet, since the network is only partially synchronous. After some timeout a view change happens.

Now, since no other replica is prepared, it is possible that some quorum of View-Change messages carries no prepared entry with sequence number >= n, so at the new primary max-s < n. The new primary will then assign n to a new message m' from a new client request. Although at least f+1 non-faulty replicas in Q1 received the pre-prepare (n,m) in view v, these old pre-prepares cannot prevent the new pre-prepare (n,m') in view v+1 from being accepted. They are not in the same view! Recall the condition for accepting pre-prepare messages!!

Now (n,m) is committed at some replica while (n,m') is committed at other replicas after the view change.
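The scenario above can be replayed as a toy model with f = 1 and four replicas. Everything here is a simplification I made up for illustration: each replica's View-Change is reduced to a dict of prepared entries (sequence number to message), all replicas are honest, and proofs and checkpoints are ignored:

```python
from itertools import combinations

def reproposed(view_changes):
    """Merge the prepared entries carried by a set of View-Change messages."""
    merged = {}
    for prepared_entries in view_changes:
        merged.update(prepared_entries)
    return merged

def some_quorum_forgets(prepared_by_replica, seq, f):
    """True if some quorum of 2f+1 View-Changes carries no prepared entry
    for seq, so the new primary is free to assign seq to a new request."""
    replicas = sorted(prepared_by_replica)
    for quorum in combinations(replicas, 2 * f + 1):
        carried = reproposed(prepared_by_replica[r] for r in quorum)
        if seq not in carried:
            return True
    return False

# No commit phase: replica 0 executes m at n=1 as soon as it is prepared,
# while replicas 1..3 are not prepared yet.
without_commit = {0: {1: "m"}, 1: {}, 2: {}, 3: {}}

# With a commit phase: replica 0 executes only after 2f+1 = 3 COMMITs,
# so at least three replicas are prepared by then.
with_commit = {0: {1: "m"}, 1: {1: "m"}, 2: {1: "m"}, 3: {}}

print(some_quorum_forgets(without_commit, seq=1, f=1))
print(some_quorum_forgets(with_commit, seq=1, f=1))
```

In the first case the quorum {1, 2, 3} knows nothing about n=1, which is exactly the hole that lets m' take the slot; in the second case every quorum of three contains a replica that reports (1, m).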

Conclusion: the commit phase ensures that if (n,m) is committed at some replica, then (n,m) will be included in NEW-VIEW and thus sent to all other replicas in the view-change phase.

greybeard
aaron.chu