System and method for fail-over data transport
Abstract
A system and method for maintaining communications within a computer system after a data transport failure across a first link. Fail-over capability is attained by re-establishing communications across a secondary link using different transport mechanisms. Between two Input/Output Processors (IOPs) within a computer system, such as a server, a series of data transactions therebetween are queued until transaction completion. Upon detection of a failure condition between the IOPs across the first link, the IOPs engage fail-over mechanisms to preserve uncompleted data transactions until communications are re-established across the secondary link.
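The fail-over behavior summarized in the abstract can be illustrated with a minimal Python sketch. It is not the patented implementation: the class names `Transport` and `PeerTransportAgent`, the `LinkDown` exception, and the replay-on-fail-over policy are all assumptions made for illustration. The sketch only shows the general idea of keeping each transaction queued until it completes, so that uncompleted transactions survive a link failure and can be sent again across a secondary link.

```python
from collections import OrderedDict


class LinkDown(Exception):
    """Hypothetical signal that a transport's link has failed."""


class Transport:
    """Hypothetical transport bound to one communications link."""

    def __init__(self, name, healthy=True):
        self.name = name
        self.healthy = healthy
        self.delivered = []  # (txn_id, payload) pairs that reached the peer

    def send(self, txn_id, payload):
        if not self.healthy:
            raise LinkDown(self.name)
        self.delivered.append((txn_id, payload))


class PeerTransportAgent:
    """Illustrative agent that owns several transports to the same peer."""

    def __init__(self, transports):
        self.transports = list(transports)
        self.active = self.transports[0]  # first link is used initially
        self.pending = OrderedDict()      # txn_id -> payload, kept until completion

    def submit(self, txn_id, payload):
        # Queue first, so the transaction is preserved if the send fails.
        self.pending[txn_id] = payload
        try:
            self.active.send(txn_id, payload)
        except LinkDown:
            self.fail_over()

    def complete(self, txn_id):
        # A completed transaction no longer needs to be preserved.
        self.pending.pop(txn_id, None)

    def fail_over(self):
        failed = self.active
        candidates = [t for t in self.transports
                      if t is not failed and t.healthy]
        if not candidates:
            raise LinkDown("no secondary link available")
        self.active = candidates[0]
        # Assumed policy: replay every uncompleted transaction, in order.
        for txn_id, payload in self.pending.items():
            self.active.send(txn_id, payload)
```

In this sketch the failure is detected only when a send raises `LinkDown`; a real system might also detect failures through timeouts or hardware status, which the abstract does not specify.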
16 Claims
1. A method for fail-over connection between a first and second processing unit within a computer system, said method comprising the steps of:
detecting a failure condition across a first communications link connecting said first and second processing units;
selecting, upon said failure condition detection, a secondary communications link between said first and second processing units within said computer system; and
establishing communications between said first and second processing units across said secondary communications link, wherein said first and second processing units each respectively comprise therein:
a device driver module;
a peer transport agent connected to said device driver module; and
a plurality of transports each connected to a respective peer transport agent, said first communications link being between a first transport within said first processing unit and another first transport within said second processing unit, and said secondary communications link being between a second transport within said first processing unit and another second transport within said second processing unit.
releasing, upon communications establishment across said secondary communications link, said first and said another first transports associated with said first communications link.
4. A method for fail-over connection between a first and second processing unit within a computer system, said method comprising the steps of:
detecting a failure condition across a first communications link connecting said first and second processing units;
selecting, upon said failure condition detection, a secondary communications link between said first and second processing units within said computer system; and
establishing communications between said first and second processing units across said secondary communications link, wherein, prior to said failure condition detection, said first processing unit initiates a remote memory allocation in said second processing unit across said first communications link, and wherein, after said communications establishment, said first processing unit accesses said remote memory allocation across said secondary communications link.
6. A method for fail-over connection between a first and second processing unit within a computer system, said method comprising the steps of:
detecting a failure condition across a first communications link connecting said first and second processing units;
selecting, upon said failure condition detection, a secondary communications link between said first and second processing units within said computer system; and
establishing communications between said first and second processing units across said secondary communications link, wherein, upon said failure condition detection, said first processing unit marks a local memory allocation therein as suspended, said local memory allocation being initiated by a remote memory allocation from said second processing unit, and wherein, after said communications establishment, said first processing unit marks said local memory allocation as active.
7. A computer system comprising:
a first processing unit;
a second processing unit, said first and second processing units communicating across a first communications link therebetween; and
fail-over means for maintaining communications between said first and second processing units after detecting a failure condition across said first communication link, said fail-over means connecting said first and second processing units across a secondary communications link, establishing communications therebetween;
wherein said first and second processing units each respectively comprise therein:
a device driver module;
a peer transport agent connected to said device driver module; and
a plurality of transports each connected to a respective peer transport agent, said first communications link being between a first transport within said first processing unit and another first transport within said second processing unit, and said secondary communications link being between a second transport within said first processing unit and another second transport within said second processing unit.
11. A computer system comprising:
a first processing unit;
a second processing unit, said first and second processing units communicating across a first communications link therebetween; and
fail-over means for maintaining communications between said first and second processing units after detecting a failure condition across said first communication link, said fail-over means connecting said first and second processing units across a secondary communications link, establishing communications therebetween;
wherein, prior to said failure condition detection, said first processing unit initiates a remote memory allocation in said second processing unit across said first communications link, and wherein, after said communications establishment, said first processing unit accesses said remote memory allocation across said secondary communications link.
13. A computer system comprising:
a first processing unit;
a second processing unit, said first and second processing units communicating across a first communications link therebetween; and
fail-over means for maintaining communications between said first and second processing units after detecting a failure condition across said first communication link, said fail-over means connecting said first and second processing units across a secondary communications link, establishing communications therebetween;
wherein, upon said failure condition detection, said first processing unit marks a local memory allocation therein as suspended, said local memory allocation being initiated by a remote memory allocation from said second processing unit, and wherein, after said communications establishment, said first processing unit marks said local memory allocation as active.
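The suspended and active markings recited in claims 6 and 13 can be pictured with a short Python sketch. This is an illustration under stated assumptions, not the claimed apparatus: `ProcessingUnit`, `LocalAllocation`, `AllocState`, and the method names are hypothetical, and rejecting peer writes while an allocation is suspended is an assumed policy that the claims do not spell out.

```python
from enum import Enum


class AllocState(Enum):
    ACTIVE = "active"
    SUSPENDED = "suspended"


class LocalAllocation:
    """Local memory set aside at the request of a remote processing unit."""

    def __init__(self, handle, size):
        self.handle = handle
        self.buffer = bytearray(size)
        self.state = AllocState.ACTIVE


class ProcessingUnit:
    """Hypothetical unit that hosts allocations initiated by its peer."""

    def __init__(self):
        self.allocations = {}

    def allocate_for_peer(self, handle, size):
        self.allocations[handle] = LocalAllocation(handle, size)

    def on_link_failure(self):
        # Keep the memory and its contents, but mark it as suspended.
        for alloc in self.allocations.values():
            alloc.state = AllocState.SUSPENDED

    def on_link_restored(self):
        # Communications are back (for example over a secondary link).
        for alloc in self.allocations.values():
            alloc.state = AllocState.ACTIVE

    def peer_write(self, handle, offset, data):
        alloc = self.allocations[handle]
        if alloc.state is not AllocState.ACTIVE:
            raise RuntimeError("allocation is suspended")  # assumed policy
        if offset < 0 or offset + len(data) > len(alloc.buffer):
            raise ValueError("write outside allocation")
        alloc.buffer[offset:offset + len(data)] = data
```

The point of the sketch is that the allocation itself is never freed during the outage: its contents are intact when the peer resumes access after communications are re-established.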
Specification