Date of Award
Electrical and Computer Engineering
Fault-tolerant computing, Fault tolerance (Engineering), Transaction systems (Computer systems), computer engineering, Byzantine fault tolerance
Driven by the need for higher reliability of many distributed systems, various replication-based fault tolerance technologies have been widely studied. A prominent technology is Byzantine fault tolerance (BFT). BFT can help achieve high availability and trustworthiness by ensuring replica consistency despite the presence of hardware failures and malicious faults on a small portion of the replicas. However, most state-of-the-art BFT algorithms are designed for generic stateful applications that require the total ordering of all incoming requests and the sequential execution of such requests. In this dissertation research, we recognize that a straightforward application of existing BFT algorithms is often inappropriate for many practical systems: (1) not all incoming requests must be executed sequentially according to some total order and doing so would incur unnecessary (and often prohibitively high) runtime overhead and (2) a sequential execution of all incoming requests might violate the application semantics and might result in deadlocks for some applications. In the past four and half years of my dissertation research, I have focused on designing lightweight BFT solutions for a number of Web services applications (including a shopping cart application, an event stream processing application, Web service business activities (WS-BA), and Web service atomic transactions (WS-AT)) by exploiting application semantics. The main research challenge is to identify how to minimize the use of Byzantine agreement steps and enable concurrent execution of requests that are commutable or unrelated. We have shown that the runtime overhead can be significantly reduced by adopting our lightweight solutions. One limitation for our solutions is that it requires intimate knowledge on the application design and implementation, which may be expensive and error-prone to design such BFT solutions on complex applications. Recognizing this limitation, we investigated the use of Conflict-free Replicated Data Types (CRDTs) to c
Chai, Hua, "Application Aware for Byzantine Fault Tolerance" (2014). ETD Archive. 56.