Paxos Continued.. Basic Paxos

Suman Shil
6 min readJun 4, 2022

In the previous article I discussed about the role of consensus protocol in distributed system. I also discussed how Paxos helps to implement a replicated state machine. In this article I will discuss about Paxos in details. Paxos decomposes the consensus problem to the following two approaches.

Basic Paxos

This is the simplest form of Paxos where one or more servers propose values. One of the proposed values is chosen. Only one value is ever chosen and System does not choose a second value. Paxos consensus protocol refers to this simplest form of Paxos. Basic paxos is used to implement distributed write once register.

Multi Paxos

Basic Paxos allows one value to be selected. Multiple instances of basic paxos can be stored as a log to agree on a series of values. Multi-paxos is used to implement distributed state machine.

Requirements of Basic Paxos

Safety : Safety means that the algorithm will not act maliciously.

  1. First safety property of Paxos is that it will choose only one value and it will not choose a second value to override the first value.
  2. When a server believes a value has been chosen, it is actually been chosen by the set of servers in the cluster.

Liveness: Liveness property means that servers will eventually chose a value as long as majority of servers are up and communicating with reasonable timeliness.

  1. First liveness property is that some proposed value is eventually is chosen.
  2. If a value is chosen, the servers will eventually learn about it.

Components of Paxos

Proposer: Accepts commands from client and proposes a value. We can imagine proposer as web servers in a 3 tier architecture. One or more process can act as proposers.

Acceptor: Accepts proposals from proposer. Acceptors agree on the system’s state. It keeps track of state. We can imagine acceptors as Database. Once a state is accepted by all acceptors, all future reads will reflect the state.

Client: Interacts with proposer to mutate state of a system. Clients are like web browser in a web application.

Listeners: Listeners are interested to know which value was chosen.

It takes multiple rounds of message passing between proposers and acceptors to choose a value. In the following section we will discuss why paxos will need multiple rounds.

Problem: Split votes

Consider this scenario when multiple proposers propose values and acceptors accepts the first value it receives. If there are simultaneous proposals, it is possible that no proposer gets majority. In the following section S1, S3 and S5 proposes different values but none of them gets majority. Here “Accepted” is not chosen, a value is chosen only when it is accepted by majority or quorum of acceptors.

Photo courtesy Diego Ongaro

Problem: Conflicting Choices

If acceptors accepts every value that it receives, we may end up in a scenario where an acceptor chooses multiple values. In the following diagram S3 will end up accepting both red and blue. This violates safety protocol which mentions that we can choose only one value.

Photo courtesy Diego Ongaro

This situation can be prevented if once a value is chosen, future proposals must propose or chose the same value. In this case S5 will check if a value is already been accepted (red) and it will propose this value instead of blue. So in the second round S5 will propose the value red. This is 2 phase protocol.

Unfortunately 2 phase commit protocol can fail in the following scenario.

Photo courtesy Diego Ongaro

S3 ends up accepting both red and blue. To avoid this scenario S3 must reject S1 proposal. Paxos orders the proposals and reject the old ones. Paxos does this by using Proposal number with each proposal. Proposal number is like a logical clock which keeps increasing with time. Ballot number has two components.

Round number: Each servers stores maximum round it has seen so far(maxRound). This value must be persisted in a non-volatile storage so that the same value is not used for two proposals. Before making a proposal server increases the max round number.

Server Id: Each server is assigned a unique id.

When Round number is concatenated with server id, it creates a unique proposal number.

Two Phase broadcast approach

Basic paxos consists of two phases.

Prepare Phase: Proposer broadcasts Prepare RPCs to acceptors to find out if any value is chosen. This also blocks any older proposals which has not yet completed.

Accept Phase: Proposer broadcasts Accept RPS to inform acceptors to accept a specific value.

Photo courtesy Diego Ongaro

Basic Paxos scenario 1 (Previous value already chosen)

Photo courtesy Diego Ongaro

In this example S1 proposes value X which is accepted by majority of the servers. When S5 proposes value Y, it finds in the prepare phase from S3 that value X is already chooses. So S5 proposes X instead of Y. So X remains the chosen value instead of Y.

Basic Paxos scenario 2 (Previous value may or may not be chosen)

Photo courtesy Diego Ongaro

When S5 proposes value Y, it receives a response from S3 that it has already accepted value X. At this point S5 is not sure if the value X is chosen or not as it has not contacted all the servers. S5 assumes that the value X is chosen and proposes X instead of Y. The outcome is like previous case where both S1 and S5 achieve consensus on value X.

Basic Paxos scenario 3 (Previous value is not chosen)

Photo courtesy Diego Ongaro

In this scenario S1 proposes value X which is accepted by S1 only when S5 proposes value Y. So the quorum of servers accepts the value Y from S5. When the proposals from S1 reaches S2 and S3, these proposals are considered to be cut-off by S5 proposal is already accepted by quorum of servers. S1 finds out that the value Y is accepted by S3, so in future proposals it will propose value Y.

Liveness

It is possible that proposers can be in a livelock situation where no value is chosen. The system enters in a indefinite loop without any outcome.

Photo courtesy Diego Ongaro

In this case S1 proposes a value X, followed by proposal by S5. When first set of accept messages(3.1) arrive at the cluster, servers response indicate that they have received a more recent proposal from S5 (3.5). Please note that the objective of a proposal messages is to cut off the older proposals. In this case (3.5) proposal from S5 has cut-off S1 proposals. So S1 restarts proposal phase again with value X, when accept messages from S5(3.5 Y) is received by the servers, they reject the accept messages as they have seen more recent proposal from S1(4.1). This process continues indefinitely without reaching a consensus. This indicates basic Paxos can have liveness issue.

Proposers can wait for random time before making another proposal to avoid livelock issue. Multi-Paxos avoid the live lock problem by electing a leader and only leader can make proposals.

Conclusion

In this article we discussed Basic Paxos which is the simplest form of Paxos in the family of Paxos algorithms. We have discussed how basic paxos can be used to achieve consensus. Basic paxos is used to implement Write Once Disrbuted register which is the basic building block of key value stores. In the next article I will discuss how to implement Basic Paxos. Stay tuned.

Reference

http://rystsov.info/

--

--

Suman Shil

Software developer, Father, Optimistic, Eternal learner