Because software and hardware reliability problems in computer systems have
different characteristics, this paper reviews recent developments in fault
tolerance technology and analyzes the main fault tolerance methods used in
computer systems, including traditional redundancy design, error rollback
recovery mechanisms, and the general fault tolerance design methods which
are currently under active study. The paper then examines the shortcomings of
some existing fault tolerance methods with respect to response delay, fault
tolerance cost, accurate quantification, heterogeneous synchronization, and
reliability modeling, among other aspects, identifies the key problems that
remain to be solved, and summarizes how these methods can be further improved
and applied.