Method for improving recovery performance from hardware and software errors in a fault-tolerant computer system

  • US 5,812,748 A
  • Filed: 05/16/1995
  • Issued: 09/22/1998
  • Est. Priority Date: 06/23/1993
  • Status: Expired due to Term
First Claim

1. A method for rapid recovery from a network file server failure, operating on a computer configuration comprising:

  • a plurality of computer systems adapted for responding to file server requests, each of said plurality of computer systems comprising;

    a computer executing a file server operating system, and a mass storage device connected to said computer;

    an additional computer system, comprising;

    an additional computer which can execute said file server operating system, and an additional mass storage system comprising at least one mass storage device, connected to said additional computer;

    and means for communicating between each of said plurality of computer systems and said additional computer system, the recovery method comprising;

    running a mass storage access program on said additional computer, said mass storage access program receiving mirroring data from each computer of said plurality of computer systems over said communicating means and writing said mirroring data to said additional mass storage system;

    and for each computer system in said plurality of computer systems;

    installing a mass storage emulator on said computer system for use by said file server operating system, said mass storage emulator taking mass storage write requests from said file server operating system and sending mirroring data indicative of said write request to said additional computer system over said communicating means;

    initiating mirroring of data by writing said data both to said mass storage device of said computer system and through said mass storage emulator and said mass storage access program to said additional mass storage system, where said mass storage access program and said mass storage emulator makes a portion of said additional mass storage system appear as if said portion of said additional mass storage device were an extra mass storage device connected to said computer of said computer system in the same manner as said mass storage device is connected to said computer of said computer system;

    and then when a failure of any of said plurality of computer systems is detected, performing at least the following steps;

    transferring responsibility for responding to file server requests previously responded to by said failed computer system to said additional computer system; and

    continuing to mirror data from said plurality of computer systems that have not failed to said additional computer system so that said additional computer system both responds to file server requests and mirrors data.
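
The steps recited in the claim can be illustrated with a minimal in-memory sketch. It is not the patented implementation: every class and function name below (`Disk`, `StandbyServer`, `MassStorageEmulator`, `PrimaryServer`, `demo`) is hypothetical, the "communicating means" is reduced to a direct method call, and failure detection is assumed to happen elsewhere. The sketch shows only the data flow: each primary writes to its local device and, through an emulator, to a per-server portion of the additional storage; after a failure the additional system answers the failed server's requests while still accepting mirroring data from the survivors.

```python
from dataclasses import dataclass, field


@dataclass
class Disk:
    """In-memory stand-in for a mass storage device: block number -> bytes."""
    blocks: dict = field(default_factory=dict)

    def write(self, block, data):
        self.blocks[block] = data

    def read(self, block):
        return self.blocks.get(block)


class StandbyServer:
    """Hypothetical 'additional computer system': mirrors all primaries, can take over."""

    def __init__(self):
        self.regions = {}     # one portion of the additional storage per primary
        self.serving = set()  # names of failed primaries now served from here

    def receive_mirror(self, server_name, block, data):
        # Role of the mass storage access program: apply incoming mirroring data.
        self.regions.setdefault(server_name, Disk()).write(block, data)

    def take_over(self, server_name):
        # Transfer responsibility for the failed server's requests.
        self.serving.add(server_name)

    def handle_read(self, server_name, block):
        if server_name not in self.serving:
            raise RuntimeError(f"{server_name} has not failed over")
        return self.regions[server_name].read(block)


class MassStorageEmulator:
    """Presents a disk-like write interface but forwards writes to the standby."""

    def __init__(self, server_name, standby):
        self.server_name = server_name
        self.standby = standby

    def write(self, block, data):
        self.standby.receive_mirror(self.server_name, block, data)


class PrimaryServer:
    def __init__(self, name, standby):
        self.name = name
        self.local = Disk()
        self.mirror = MassStorageEmulator(name, standby)

    def write(self, block, data):
        # Mirroring: the same write goes to the local device and to the emulator.
        self.local.write(block, data)
        self.mirror.write(block, data)


def demo():
    standby = StandbyServer()
    a = PrimaryServer("A", standby)
    b = PrimaryServer("B", standby)
    a.write(0, b"alpha")
    b.write(0, b"beta")

    standby.take_over("A")  # assume a failure of A was detected elsewhere
    b.write(1, b"gamma")    # B keeps mirroring to the standby after the failover

    return standby.handle_read("A", 0), standby.regions["B"].read(1)
```

Calling `demo()` returns the block that server A mirrored before failing, together with the block that server B mirrored after the takeover, which corresponds to the final step of the claim: the additional system both responds to file server requests and continues to mirror data.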
