External storage

US 20010020282A1
Filed: 04/17/2001
Published: 09/06/2001
Est. Priority Date: 10/30/1995
Status: Active Grant

Abstract

In an external storage, an I/O process continues without any intervention by a user or a host system when a controller fails. When a failure occurs in a controller, a host system 10 recognizes the failure of the controller. Before the user and the application are notified of the failure and the job is stopped, the substitutive controller reads the SCSI-ID held by the SCSI port of the failed controller from a shared memory, registers that SCSI-ID to its own SCSI port, and erases the SCSI-ID held by the SCSI port of the failed controller by means of a port address resetting facility 45 of the substitutive controller. Because the SCSI-ID specified when an I/O request is issued is thereby transferred between the controllers, neither the user nor the host system needs to alter the route by which I/O requests are issued. Moreover, the transfer can be carried out while the host system has not yet recognized the error.
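
The ID hand-over described in the abstract can be illustrated with a minimal Python sketch. It is not the patented implementation: the `Port` class, the `take_over` function, the controller names `ctl_a` and `ctl_b`, and the dictionary standing in for the shared memory are all hypothetical names chosen for illustration.

```python
from dataclasses import dataclass, field


@dataclass
class Port:
    """One controller's SCSI port; it answers to every ID held in `ids`."""
    ids: set = field(default_factory=set)

    def responds_to(self, scsi_id: int) -> bool:
        return scsi_id in self.ids


def take_over(shared_memory: dict, ports: dict, failed: str, substitute: str) -> None:
    """Move the failed controller's SCSI-ID onto the substitute's port."""
    failed_id = shared_memory[failed]        # read the ID recorded in shared memory
    ports[substitute].ids.add(failed_id)     # register it on the substitute's port
    ports[failed].ids.discard(failed_id)     # erase it from the failed port (the "reset")


# Hypothetical setup: two controllers with IDs 0 and 1 on one daisy-chained bus.
shared_memory = {"ctl_a": 0, "ctl_b": 1}
ports = {"ctl_a": Port({0}), "ctl_b": Port({1})}

take_over(shared_memory, ports, failed="ctl_a", substitute="ctl_b")
```

After the call, `ctl_b` answers to both IDs, so a host that keeps addressing ID 0 reaches the surviving controller without changing its request route.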

73 Citations

21 Claims

1. A failure recovery method for use in a data processing system including at least one host system, a plurality of controllers, and an interface cable connecting said host system to said controllers in a daisy chain, said controllers respectively including therein I/O ports being connected to said interface cable and having mutually different IDs, an I/O device being controlled by a group of at least two controllers, the method comprising the steps of:
- detecting, when a failure is detected in a controller of said group, a utilization state of said interface cable by a controller as a substitutive unit of a failed controller of said group;
  
  deciding, according to the utilization state of said interface cable, a state of reception by said failed controller of an I/O request from said host system;
  
  suppressing by a substitutive controller, when the I/O request is not yet received by said failed controller as a result of the decision, reception of the I/O request by said failed controller;
  
  adding an ID of an I/O port related to said failed controller to an I/O port of said substitutive controller; and
  
  resetting the I/O port related to said failed controller; and
  
  adding by said substitutive controller, when the I/O request is already received by said failed controller as a result of the decision, the ID of said I/O port related to said failed controller to the I/O port of said substitutive controller and resetting the I/O port related to said failed controller before said host system recognizes a permanent error in said failed controller.
- - 2. A failure recovery method according to claim 1, wherein, in resetting the I/O port related to said failed controller, reset is carried out by hardware resetting means in said substitutive controller.
  - 3. A failure recovery method according to claim 1, wherein, in resetting the I/O port related to said failed controller, said substitutive controller further includes the steps of:
    - indicating to said failed controller to reset the I/O port related to said failed controller after lapse of a predetermined period of time; and
      
      adding the ID of the I/O port related to said failed controller to the I/O port of said substitutive controller within said predetermined period of time.
  - 4. A failure recovery method according to claim 1, wherein said interface cable is a Small Computer Systems Interface bus cable.
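
The branching in claim 1 can be sketched as follows. This is a simplified reading, not the claimed method itself: the names `Reception`, `decide_reception`, and `recovery_plan` are hypothetical, and the rule that a request counts as received only when the bus is busy with the failed controller selected is an assumption made for the example.

```python
from enum import Enum, auto


class Reception(Enum):
    NOT_YET_RECEIVED = auto()   # failed controller has not accepted an I/O request
    ALREADY_RECEIVED = auto()   # failed controller accepted a request before failing


def decide_reception(bus_busy: bool, failed_controller_selected: bool) -> Reception:
    """Derive the reception state from the cable's utilization state.

    Assumption for this sketch: the request counts as received only when
    the bus is busy and the failed controller is the selected target.
    """
    if bus_busy and failed_controller_selected:
        return Reception.ALREADY_RECEIVED
    return Reception.NOT_YET_RECEIVED


def recovery_plan(reception: Reception) -> list:
    """List the substitutive controller's steps for the given reception state."""
    plan = []
    if reception is Reception.NOT_YET_RECEIVED:
        # Keep the host from handing a new request to the failed controller.
        plan.append("suppress_reception")
    # Both branches add the ID and reset the failed port. In the
    # ALREADY_RECEIVED branch these steps must finish before the host
    # recognizes a permanent error.
    plan += ["add_failed_id", "reset_failed_port"]
    return plan
```

For an idle bus, `recovery_plan(decide_reception(False, False))` yields the three-step plan that starts with suppression; when the failed controller already holds a request, only the ID addition and the reset remain, bounded by the host's error deadline.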

5. A data processing system, comprising:
- at least one host system;
  
  a plurality of controllers; and
  
  an interface cable connecting said host system to said controllers in a daisy chain, said controllers respectively including therein I/O ports being connected to said interface cable and having mutually different IDs;
  
  an I/O device being commonly controlled by a group of at least two controllers; and
  
  a shared memory being commonly accessed from said group, each of controllers in said group including a microprocessor, the microprocessor in each of said controllers including:
  
  means for detecting a failure in a controller of said group according to contents of said shared memory;
  
  means for detecting a utilization state of said interface cable via an I/O port;
  
  means for deciding, according to the utilization state of said interface cable, a state of reception by said failed controller of an I/O request from said host system;
  
  means for suppressing, when the I/O request is not yet received by said failed controller as a result of the decision, reception of the I/O request by said failed controller;
  
  adding an ID of the I/O port related to said failed controller to an I/O port of a controller of its own; and
  
  indicating to reset the I/O port related to said failed controller; and
  
  means for adding, when the I/O request is already received by said failed controller as a result of the decision, the ID of the I/O port related to said failed controller to the I/O port of the controller of its own; and
  
  indicating to reset the I/O port related to said failed controller before said host system recognizes a permanent error in said failed controller.
- - 6. A data processing system according to claim 5, wherein each of the controllers of said group includes hardware resetting means responsive to an indication from said reset indicating means for resetting the I/O port related to said failed controller.
  - 7. A data processing system according to claim 5, wherein:
    - said reset indicating means writes a failure flag at a predetermined address in said shared memory, said flag indicating an occurrence of a failure;
      
      a processor in said failed controller functions as means for reading said failure flag from said shared memory and resetting the I/O port related thereto after lapse of a predetermined period of time; and
      
      said reset indicating means adds the ID of the I/O port related to said failed controller to the I/O port related to own controller within said predetermined period of time.
  - 8. A data processing system according to claim 5, wherein said interface cable is an SCSI bus cable.
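
The timing relation in claim 7 can be sketched as follows: the failed controller resets its port only after a predetermined delay, and the other controller adds the ID within that delay. The function names, the flag address `FAILURE_FLAG_ADDR`, and the millisecond values are hypothetical. The notion of a "gap" in which no port answers to the ID is an interpretation of why the ordering matters, not wording from the claims.

```python
FAILURE_FLAG_ADDR = 0x10  # hypothetical "predetermined address" in shared memory


def write_failure_flag(shared_memory: dict) -> None:
    """Reset indication: mark the failure at the agreed address."""
    shared_memory[FAILURE_FLAG_ADDR] = True


def event_order(add_id_ms: int, reset_delay_ms: int) -> list:
    """Order the two events; both times are measured from the flag write."""
    events = [
        (add_id_ms, "substitute adds failed port ID"),
        (reset_delay_ms, "failed controller resets its port"),
    ]
    # Sort by time only; on a tie the ID addition stays first (stable sort).
    return [name for _, name in sorted(events, key=lambda e: e[0])]


def handover_gap_ms(add_id_ms: int, reset_delay_ms: int) -> int:
    """Time during which no port would answer to the failed controller's ID."""
    return max(0, add_id_ms - reset_delay_ms)


shared_memory = {FAILURE_FLAG_ADDR: False}
write_failure_flag(shared_memory)
```

With an assumed 100 ms reset delay and a 20 ms ID addition, `handover_gap_ms(20, 100)` is 0. If the addition took 150 ms instead, the ID would go unanswered for 50 ms, which is the situation the predetermined period appears intended to rule out.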

9. An external storage for use in a data processing system including a host system, an external storage including a plurality of controllers respectively having therein ports possessing identifiers as individual port addresses and a group of storages controlled by and shared between said plural controllers, and an interface cable connecting in a daisy chain said host system to said plural controllers having the ports therein, said plural controllers and storages being accessible from said host system, said external storage having a function that at occurrence of a failure in a controller excepting at least one controller, a normal controller detects the failure, references a port address of a failed controller, receives control information of said failed controller, and adds control information to the port address thereof.
- - 10. An external storage according to claim 9, further including a shared memory for each of said plural controllers for storing therein the port address and control information of each of said controllers and thereby transmitting information between said controllers.

11. An external storage in a data processing system including a host system, an external storage including a plurality of controllers respectively having therein ports possessing identifiers as individual port addresses and a group of storages controlled by and shared between said plural controllers, and an interface cable connecting in a daisy chain said host system to said plural controllers having the ports therein, said plural controllers and storages being accessible from said host system, said external storage having a function that at occurrence of a failure in a controller excepting at least one controller, a normal controller detects the failure, references a port address of a failed controller, receives control information of said failed controller, and adds the control information to the port address thereof, a controller having a port address resetting facility for resetting the port address of said failed controller and erasing an ID thereof in such a manner that the controller resets the port address of said failed controller, that said failed controller does not respond to subsequent I/O requests from said host system, and that said normal controller having received the port address responds to the I/O requests.
- - 12. An external storage according to claim 11, wherein, at occurrence of the failure in the controller, in a state in which said host system has not executed an I/O request to said failed controller and said interface cable connecting said host system to said controllers is not being used, a normal controller executes selection for said failed controller to acquire a bus mastership between said normal controller and said failed controller, thereby suppressing issuance of an I/O request from said host system to said failed controller during a transfer process of the port address by said normal controller.
  - 13. An external storage according to claim 11, wherein, at occurrence of the failure in the controller, in a state in which said host system has not executed an I/O request to said failed controller and said normal controller is using the bus, said normal controller completes the transfer process of the port address of said failed controller during the processing of the I/O request issued from said host system and then notifies termination of the I/O request, thereby suppressing issuance of an I/O request from said host system to said failed controller during the transfer process of the port address by said normal controller.
  - 14. An external storage according to claim 11, wherein:
    - said interface cable is an SCSI cable;
      
      said normal controller monitors, when the bus is in use at occurrence of the failure in the controller, a BSY signal of the bus to determine whether or not the bus is being used by another device connected to the bus, whether or not the system is in a transit state from an arbitration phase to a selection phase according to the SCSI standards, and whether or not said failed controller already received an I/O request from said host system;
      
      said normal controller executes, when the bus is released during the monitor operation, selection for said failed controller to attain a bus mastership between said normal and failed controllers;
      
      said normal controller completes, when said normal controller is selected during the monitor operation, the transfer process of the port address of said failed controller during the processing of the I/O request issued from said host system and then notifies termination of the I/O request; and
      
      said normal controller terminates during the monitoring period the transfer process of the port address of said failed controller.
  - 15. An external storage according to claim 14, wherein the monitoring period of the bus mastership is set to be equal to or more than a period of time in which the arbitration phase is changed via the selection phase to a message out phase according to the SCSI standards so as to confirm that the BSY signal is not associated with arbitration of the bus mastership but is caused by an I/O execution process, thereby executing the transfer of the port address of said failed controller.
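
The bus monitoring in claims 14 and 15 can be sketched as a loop over sampled bus states. This is an illustrative model only: the `Bus` and `Action` enumerations, the `monitor` function, and the idea of sampling the bus once per tick are assumptions. The monitoring period is given in ticks and is assumed to be at least as long as the arbitration-to-message-out transition mentioned in claim 15.

```python
from enum import Enum, auto


class Bus(Enum):
    BSY = auto()        # BSY asserted by another device on the bus
    FREE = auto()       # bus released
    SELECTED = auto()   # the normal controller itself has been selected


class Action(Enum):
    SELECT_FAILED = auto()          # take bus mastership by selecting the failed controller
    TRANSFER_DURING_IO = auto()     # transfer the port address while serving this I/O
    TRANSFER_AFTER_PERIOD = auto()  # BSY persisted, so it is treated as I/O in progress


def monitor(samples: list, period: int) -> Action:
    """Decide what the normal controller does from at most `period` bus samples."""
    for sample in samples[:period]:
        if sample is Bus.FREE:
            return Action.SELECT_FAILED
        if sample is Bus.SELECTED:
            return Action.TRANSFER_DURING_IO
    # BSY stayed asserted for the whole period: under the assumption above it
    # cannot be mere arbitration, so the transfer goes ahead.
    return Action.TRANSFER_AFTER_PERIOD
```

For example, `monitor([Bus.BSY, Bus.FREE], 3)` returns `Action.SELECT_FAILED`, while three consecutive `Bus.BSY` samples with a period of 3 return `Action.TRANSFER_AFTER_PERIOD`.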

16. An external storage in a data processing system including a host system, an external storage including a plurality of controllers respectively having therein ports possessing identifiers as individual port addresses and a group of storages controlled by and shared between said plural controllers, and an interface cable connecting in a daisy chain said host system to said plural controllers having the ports therein, said plural controllers and storages being accessible from said host system, wherein:
- at occurrence of a failure in a controller excepting at least one controller, a failed controller recognizes the failure thereof and enters a wait state without executing a control operation thereof in at least a period of time equal to time in which said normal controller conducts a transfer process of control information of said failed controller and addition of a port address;
  
  after said normal controller which recognized the failure finishes the transfer and addition processes, said failed controller erases the port address of said failed controller; and
  
  said normal controller which received the port address of said failed controller responds to a subsequent I/O request issued from said host system since the port address of said failed controller is already erased.
- - 17. An external storage according to claim 16, wherein at occurrence of the failure in the controller, in a state in which said host system has not executed an I/O request to said failed controller and said interface cable connecting said host system to said controllers is not being used, said normal controller executes selection for said failed controller to acquire a bus mastership between said normal controller and said failed controller, thereby suppressing issuance of an I/O request from said host system to said failed controller during the transfer process of the port address by said normal controller.
  - 18. An external storage according to claim 16, wherein, at occurrence of the failure in a controller, in a state in which a host system has not executed an I/O request to said failed controller and said normal controller is using the bus, said normal controller completes the transfer process of the port address of said failed controller during the processing of the I/O request issued from said host system and then notifies termination of the I/O request, thereby suppressing issuance of an I/O request from said host system to said failed controller during the transfer process of the port address by said normal controller.
  - 19. An external storage according to claim 16, wherein:
    - when the bus is in use at occurrence of the failure in the controller, said normal controller monitors a BSY signal of the bus to determine whether or not the bus is being used by another device connected to the bus, whether or not the system is in a transit state from an arbitration phase to a selection phase according to the SCSI standards, and whether or not said failed controller already received the I/O request from said host system;
      
      when the bus is released during the monitor operation, the normal controller executes selection for said failed controller to attain a bus mastership between said normal and failed controllers;
      
      when said normal controller is selected during the monitor operation, said normal controller completes the transfer process of the port address of said failed controller during the processing of the I/O request issued from said host system and then notifies the termination of the I/O request; and
      
      said normal controller terminates during the monitoring period the transfer process of the port address of said failed controller.
  - 20. An external storage according to claim 16, wherein the monitoring period of the bus mastership is set to be equal to or more than a period of time in which the arbitration phase changes via the selection phase to a message out phase so as to confirm that the BSY signal is not associated with arbitration of the bus mastership but is caused by an I/O execution process, thereby executing the transfer of the port address of said failed controller.

21. A host system and an external storage connected by an interface cable in a configuration including a host system, an external storage including a plurality of controllers respectively having therein ports possessing identifiers as individual port addresses and a group of storages controlled by and shared between said plural controllers, and an interface cable connecting in a daisy chain said host system to said plural controllers having the ports therein, said plural controllers and said storages being accessible from said host system, said external storage having a function that at occurrence of a failure in a controller excepting at least one controller, said normal controller detects the failure, references the port address of the failed controller, receives control information of said failed controller, and adds the control information to the port address thereof, said host system having a function that in a state in which a controller having received an I/O request issued from the host system cannot respond thereto due to occurrence of a failure in the controller, said host system monitors an I/O completion report from the controller, issues again the I/O request to said failed controller after lapse of the predetermined monitoring period, executes a recovery process including a resetting operation, recognizes a permanent error when the controller does not respond to the recovery process, and notifies the error to the application, and said normal controller completing an operation including the reference, transfer, and additional port address processes before the permanent error is recognized, thereby preventing a report of the permanent error to an application of said host system.
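
The race described in claim 21, between the host's error handling and the normal controller's takeover, can be sketched as a comparison of two points in time. The function name `host_outcome`, its parameters, and the second values are hypothetical. The sketch also assumes that the host declares a permanent error exactly when its monitoring period plus its recovery process have elapsed without a response.

```python
def host_outcome(takeover_done_at: float,
                 monitor_period: float,
                 recovery_period: float) -> str:
    """Report what the host application observes for one outstanding I/O request.

    All times are in seconds, measured from the moment the request is issued.
    """
    # The host waits for a completion report, then reissues the request and
    # runs its recovery process (including a reset) before giving up.
    permanent_error_at = monitor_period + recovery_period
    if takeover_done_at < permanent_error_at:
        # The normal controller already answers to the failed port address,
        # so the reissued request is served and no error is reported.
        return "request served by normal controller"
    return "permanent error reported to application"
```

With an assumed 5 s monitoring period and 10 s recovery process, a takeover finished after 3 s gives `"request served by normal controller"`, whereas one finished after 20 s would come too late.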

Current Assignee: Akira Murotani, Hidehiko Iwasaki, Kenji Muraoka, Toshio Nakano
Original Assignee: Akira Murotani, Hidehiko Iwasaki, Kenji Muraoka, Toshio Nakano
Inventors: Akira Murotani, Kenji Muraoka, Hidehiko Iwasaki, Toshio Nakano
Granted Patent: US 6,412,078 B2
US Class Current: 714/9

CPC Class Codes
G06F 11/1456   Hardware arrangements for b...
G06F 11/1466   to make the backup process ...
G06F 11/1469   Backup restoration techniques
G06F 11/2005   using redundant communicati...
G06F 11/2007   using redundant communicati...
G06F 11/2017   where memory access, memory...
G06F 11/2092   Techniques of failing over ...
