Optimistic Byzantine Fault Tolerance
International Journal of Parallel, Emergent and Distributed Systems
The primary concern of traditional Byzantine fault tolerance is to ensure strong replica consistency by executing incoming requests sequentially according to a total order. Speculative execution at both clients and server replicas has been proposed as a way of reducing the end-to-end latency. In this article, we introduce optimistic Byzantine fault tolerance, which aims to achieve higher throughput and lower end-to-end latency by using a weaker replica consistency model. Instead of ensuring strong safety as traditional Byzantine fault tolerance does, optimistic Byzantine fault tolerance brings nonfaulty replicas to a consistent state periodically and on demand. Not all applications are suitable for optimistic Byzantine fault tolerance. We identify three types of applications as good candidates for it: real-time collaborative editing, event stream processing, and services constructed with conflict-free replicated data types. Furthermore, we provide design guidelines on how to achieve eventual consistency and how to recover from conflicts at different replicas. In optimistic Byzantine fault tolerance, a replica executes a request immediately without first establishing a total order for the request, and Byzantine agreement is used only to establish a common state synchronization point and the set of individual states needed to resolve conflicts. The recovery mechanism ensures both replica consistency and the validity of the system by identifying and removing the operations introduced by faulty clients and server replicas.
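The execute-first, synchronize-later idea can be illustrated with a minimal Python sketch. It is not the article's protocol: the names `Op`, `Replica`, and `synchronize` are hypothetical, the replicated service is assumed to be a simple counter whose increments commute, and the Byzantine agreement step is replaced by a plain function call that is handed the agreed set of replicas and the set of clients already identified as faulty.

```python
from dataclasses import dataclass, field


@dataclass(frozen=True)
class Op:
    """One client request (hypothetical shape): an increment to a shared counter."""
    client: str   # id of the issuing client
    seq: int      # per-client sequence number, makes each operation unique
    amount: int   # increment to apply


@dataclass
class Replica:
    """A replica that executes requests immediately, with no total ordering."""
    log: set = field(default_factory=set)

    def execute(self, op: Op) -> None:
        # Optimistic step: record and apply the operation right away.
        self.log.add(op)

    def value(self) -> int:
        # Increments commute, so the state depends only on the set of operations.
        return sum(op.amount for op in self.log)


def synchronize(replicas, faulty_clients):
    """Stand-in for the synchronization point.

    Assumption: a Byzantine agreement round (not modeled here) has already
    fixed which replica states take part and which clients are faulty.
    """
    merged = set().union(*(r.log for r in replicas))
    # Recovery: drop operations introduced by clients identified as faulty.
    merged = {op for op in merged if op.client not in faulty_clients}
    for r in replicas:
        r.log = set(merged)
    return merged


if __name__ == "__main__":
    a, b = Replica(), Replica()
    a.execute(Op("c1", 1, 5))
    a.execute(Op("c2", 1, 3))
    b.execute(Op("c2", 1, 3))       # b has not yet seen c1's request
    b.execute(Op("evil", 1, 100))   # request from a client later found faulty
    print(a.value(), b.value())     # replicas diverge before synchronization
    synchronize([a, b], faulty_clients={"evil"})
    print(a.value(), b.value())     # replicas agree after synchronization
```

Between synchronization points the two replicas may report different values. After `synchronize` they hold the same operation set, with the faulty client's operation removed. A real system would also need authenticated requests and a way to detect faulty replicas, both of which this sketch omits.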
Zhao, Wenbing, "Optimistic Byzantine Fault Tolerance" (2015). Electrical Engineering & Computer Science Faculty Publications. 409.