Revisiting Failure Detection and Consensus in Omission Failure Environments
It has recently been shown that fair exchange, a security problem in distributed systems, can be reduced to a fault tolerance problem, namely a special form of distributed consensus. The reduction uses security modules, which restrict adversarial behavior to two standard fault assumptions: message omission and process crash. In this paper, we investigate the feasibility of solving consensus in asynchronous systems in which crash and message omission faults may occur. Because consensus is known to be impossible in such systems, we follow the approach of the unreliable failure detectors of Chandra and Toueg and add to the system a distributed device that provides information about the failures of other processes. We then give an algorithm that uses this device to solve the consensus problem. Finally, we show how to implement such a device in an asynchronous system under some weak timing assumptions.
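To make the idea of a failure detection device built on timing assumptions more concrete, the following is a minimal sketch of a generic heartbeat-and-timeout detector in the spirit of Chandra and Toueg. It is not the construction given in this paper and does not model message omission faults. The class name `HeartbeatDetector`, its methods, and the doubling of the timeout after a false suspicion are all assumptions made for illustration.

```python
from dataclasses import dataclass, field


@dataclass
class HeartbeatDetector:
    """Illustrative timeout-based failure detector (hypothetical, not the paper's device)."""

    peers: list
    initial_timeout: float = 1.0
    timeout: dict = field(default_factory=dict)      # per-peer timeout
    last_heard: dict = field(default_factory=dict)   # time of last heartbeat
    suspected: set = field(default_factory=set)      # current suspicion list

    def __post_init__(self):
        for p in self.peers:
            self.timeout[p] = self.initial_timeout
            self.last_heard[p] = 0.0

    def on_heartbeat(self, peer, now):
        """Record a heartbeat; retract a false suspicion and back off."""
        self.last_heard[peer] = now
        if peer in self.suspected:
            # The peer was alive after all: stop suspecting it and double
            # its timeout so the same mistake becomes less likely.
            self.suspected.discard(peer)
            self.timeout[peer] *= 2

    def check(self, now):
        """Suspect every peer whose last heartbeat is older than its timeout."""
        for p in self.peers:
            if now - self.last_heard[p] > self.timeout[p]:
                self.suspected.add(p)
        return set(self.suspected)


# Usage with simulated clock values (no real timers or network involved).
fd = HeartbeatDetector(peers=["p", "q"])
fd.on_heartbeat("p", 0.5)
fd.on_heartbeat("q", 0.5)
first = fd.check(1.0)       # nobody has exceeded the timeout yet
fd.on_heartbeat("p", 1.4)
second = fd.check(2.0)      # q has been silent for 1.5 > 1.0
fd.on_heartbeat("q", 2.1)   # q turns out to be alive
third = fd.check(2.2)
```

The sketch relies on the assumption that message delays are eventually bounded: each false suspicion increases the timeout, so a correct peer whose heartbeats keep arriving is eventually no longer suspected.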