Method and means for utilizing device long busy response for resolving detected anomalies at the lowest level in a hierarchical, demand/response storage management subsystem
Abstract
A method and means within a hierarchical, demand/response DASD subsystem of the passive fault management type in which, upon the occurrence of fault, error, or erasure, a long device busy signal of finite duration is provided to a host CPU. Any DASD storage device subject to the anomaly is isolated from any host inquiry during this interval. These measures permit retry or other recovery procedures to be implemented transparent to the host and the executing application. This avoids premature declarations of faults, errors, or erasures and consequent host application aborts and other catastrophic measures. If the detected anomaly is not resolved within the allotted time, then other data recovery procedures can be invoked including device reset, the status reported to the host, and the next request processed.
6 Claims
1. A method for detecting and correcting a defective operating state or condition of a hierarchical demand/responsive storage subsystem of the passive fault management type attaching a host CPU, said subsystem including a plurality of cyclic, tracked storage devices, an interrupt-driven, task-switched control logic, and means responsive to the control logic for forming at least one path of a set of paths coupling the host to at least one device, said host enqueuing one or more read and write requests to said subsystem, said subsystem control logic responsively interpreting each request and establishing a path to an addressed storage device, comprising the steps at the subsystem of:
(a) detecting an anomaly in the read back or staging of data from the device and executing a retry of the counterpart request by active or passive querying of said addressed device;
(b) in the event that the detected anomaly persists, presenting a long busy status signal to the host CPU by the control logic, said long busy signal being an indication that the counterpart request has yet to be completed by the subsystem;
(c) inhibiting host access to the device by the control logic for no more than a predetermined time interval;
(d) ascertaining whether the inhibited device has returned to an operational state, and (1) in the event the anomaly is resolved, setting an attention interrupt in the control logic by the device and terminating the device long busy signal in the host CPU by the control logic, and (2) in the event that the time interval has been exceeded and the anomaly is not resolved, invoking one or more data recovery procedures including resetting the device by the control logic; and
(e) reporting status to the host CPU.
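Steps (a) through (e) of claim 1 can be read as a small control-flow sequence. The following Python sketch is illustrative only and is not taken from the patent. `SimulatedDevice`, `handle_request`, `Outcome`, and the event names written to `log` are hypothetical. The "predetermined time interval" is modelled as a maximum number of status polls (`max_busy_polls`), which is a simplifying assumption, and a single retry is assumed in step (a).

```python
from dataclasses import dataclass
from enum import Enum
from typing import List


class Outcome(Enum):
    OK = "ok"
    RECOVERED_ON_RETRY = "recovered_on_retry"
    RECOVERED_DURING_LONG_BUSY = "recovered_during_long_busy"
    RESET_AFTER_TIMEOUT = "reset_after_timeout"


@dataclass
class SimulatedDevice:
    """Hypothetical device that reports an anomaly for its first `faulty_polls` checks."""
    faulty_polls: int
    resets: int = 0

    def has_anomaly(self) -> bool:
        if self.faulty_polls > 0:
            self.faulty_polls -= 1
            return True
        return False

    def reset(self) -> None:
        self.resets += 1
        self.faulty_polls = 0


def _long_busy(device: SimulatedDevice, max_busy_polls: int, log: List[str]) -> Outcome:
    # (b) Present long busy to the host: the request is not yet complete.
    log.append("long_busy_on")
    # (c) Host access stays inhibited for at most `max_busy_polls` polls
    #     (assumption: the time interval is modelled as a poll budget).
    for _ in range(max_busy_polls):
        # (d)(1) Anomaly resolved: attention interrupt, then long busy is ended.
        if not device.has_anomaly():
            log.append("attention_interrupt")
            log.append("long_busy_off")
            return Outcome.RECOVERED_DURING_LONG_BUSY
    # (d)(2) Interval exceeded and anomaly unresolved: recovery including reset.
    device.reset()
    log.append("device_reset")
    return Outcome.RESET_AFTER_TIMEOUT


def handle_request(device: SimulatedDevice, max_busy_polls: int, log: List[str]) -> Outcome:
    # (a) Detect an anomaly and retry the request once (single retry is an assumption).
    if not device.has_anomaly():
        outcome = Outcome.OK
    else:
        log.append("retry")
        if not device.has_anomaly():
            outcome = Outcome.RECOVERED_ON_RETRY
        else:
            outcome = _long_busy(device, max_busy_polls, log)
    # (e) Report status to the host in every case.
    log.append("status:" + outcome.value)
    return outcome
```

Calling `handle_request(SimulatedDevice(faulty_polls=3), 5, log)` in this model walks the recovery branch (d)(1), while a device that never clears within the poll budget takes the reset branch (d)(2). In both cases the last entry appended to `log` is the status report of step (e).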
5. In a hierarchical demand/response storage subsystem of the passive fault management type, said subsystem being responsive to read and write requests from a host CPU for establishing access to at least one of a plurality of cyclic, multitracked storage devices over one path selected from a set of at least two failure-independent paths terminating in said device, said subsystem including means for detecting and correcting a defective operating state or condition in the subsystem or attached devices, whereby said detecting and correcting means further comprise:
means for detecting an anomaly in the read back or staging of a binary data stream from a device and for retrying said read back or staging;
means for ascertaining whether only one path to the device is operable, whether the anomaly persists after retry and, if so, for presenting a long busy status to the host CPU;
means for inhibiting host access to the device for up to a predetermined time interval;
means for terminating the long busy status in the host CPU responsive to an attention interrupt from the device indicative that the inhibited device has returned to an operational state and the anomaly has been resolved;
means responsive to the time interval having been exceeded and the nonresolution of the anomaly for invoking one or more data recovery procedures including resetting the device; and
means for reporting the current status of the device to the host.
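Claim 5 adds a path check ahead of the long busy presentation: long busy is presented when only one path to the device is operable and the anomaly persists after retry. The sketch below illustrates that decision only. The names `Action` and `next_action` are hypothetical. The claim does not say what happens when more than one path is operable, so retrying over an alternate path is an assumption here, and the case of no operable path is treated as outside the claim's scope.

```python
from enum import Enum
from typing import Sequence


class Action(Enum):
    COMPLETE = "complete"
    TRY_ALTERNATE_PATH = "try_alternate_path"  # assumption, not stated in the claim
    PRESENT_LONG_BUSY = "present_long_busy"


def next_action(path_operable: Sequence[bool], anomaly_after_retry: bool) -> Action:
    """Decide the next step after the retry of a read back or staging operation."""
    if not anomaly_after_retry:
        return Action.COMPLETE
    operable = sum(1 for ok in path_operable if ok)
    if operable == 0:
        # Not addressed by the claim; flagged rather than guessed at.
        raise ValueError("no operable path to the device")
    if operable == 1:
        # Only one operable path and the anomaly persists: present long busy.
        return Action.PRESENT_LONG_BUSY
    # More than one failure-independent path remains (assumed behaviour).
    return Action.TRY_ALTERNATE_PATH
```

With two paths of which one has failed, `next_action([True, False], True)` selects `Action.PRESENT_LONG_BUSY`. With both paths operable, the same anomaly yields `Action.TRY_ALTERNATE_PATH` under the stated assumption.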
6. An article of manufacture comprising a machine-readable memory having stored therein indicia of a plurality of processor-executable control program steps for detecting and correcting a defective operating state or condition of a hierarchical demand/responsive storage subsystem of the passive fault management type attaching a host CPU, said subsystem including a plurality of cyclic, tracked storage devices, an interrupt-driven, task-switched control logic, and means responsive to the control logic for forming at least one path of a set of paths coupling the host to at least one device, said host enqueuing one or more read and write requests to said subsystem, said subsystem control logic responsively interpreting each request and establishing a path to an addressed storage device, said plurality indicia of control program steps executable at the subsystem include:
(a) indicia of a control program step for detecting an anomaly in the read back or staging of data from the device and executing a retry of the counterpart request by active or passive querying of said addressed device;
(b) indicia of a control program step in the event that the detected anomaly persists for presenting a long busy status signal to the host CPU by the control logic, said long busy signal being an indication that the counterpart request has yet to be completed by the subsystem;
(c) indicia of a control program step for inhibiting host access to the device by the control logic for no more than a predetermined time interval;
(d) indicia of a control program step for ascertaining whether the inhibited device has returned to an operational state, and (1) in the event the anomaly is resolved, for setting an attention interrupt in the control logic by the device and for terminating the device long busy signal in the host CPU by the control logic, and (2) in the event that the time interval has been exceeded and the anomaly is not resolved, for invoking one or more data recovery procedures including resetting the device by the control logic; and
(e) indicia of a control program step for reporting status to the host CPU.
Specification