Fault tolerant hypercube computer system architecture
First Claim
1. A fault-tolerant multi-processor computer system of the hypercube type comprising:
(a) a plurality of first computing nodes;
(b) a first network of message conducting path means for interconnecting said first computing nodes as a hypercube, said first network providing a path for message transfer between said first computing nodes;
(c) a first watch dog node; and
(d) a second network of message conducting path means for directly connecting each of said first computing nodes to said first watch dog node independent from said first network, said second network providing an independent path for test message and reconfiguration affecting transfers between respective ones of said first computing nodes and said first watch dog node.
Abstract
A fault-tolerant multi-processor computer system of the hypercube type comprising a hierarchy of computers of like kind which can be functionally substituted for one another as necessary. Communication between the working nodes is via one communications network, while communication between the working nodes and the watch dog nodes and load balancing nodes higher in the structure is via another communications network separate from the first. A typical branch of the hierarchy reporting to a master node or host computer (50) comprises a plurality of first computing nodes (22); a first network of message conducting paths (30) for interconnecting the first computing nodes (22) as a hypercube (28'), the first network (30) providing a path for message transfer between the first computing nodes (22); a first watch dog node (40); and a second network of message conducting paths (34) for connecting the first computing nodes (22) to the first watch dog node (40) independent from the first network (30), the second network (34) providing an independent path for test message and reconfiguration affecting transfers between the first computing nodes (22) and the first watch dog node (40).
There is additionally a plurality of second computing nodes (22); a third network of message conducting paths (30) for interconnecting the second computing nodes (22) as a hypercube (28'), the third network (30) providing a path for message transfer between the second computing nodes (22); a fourth network of message conducting paths (34) for connecting the second computing nodes (22) to the first watch dog node (40) independent from the third network (30), the fourth network (34) providing an independent path for test message and reconfiguration affecting transfers between the second computing nodes (22) and the first watch dog node (40); and a first multiplexer disposed between the first watch dog node (40) and the second and fourth networks (34) for allowing the first watch dog node (40) to selectively communicate with individual ones of the computing nodes (22) through the second and fourth networks (34); as well as a second watch dog node (40) operably connected to the first multiplexer whereby the second watch dog node (40) can selectively communicate with individual ones of the computing nodes (22) through the second and fourth networks (34). The branch is completed by a first load balancing node (
The invention described herein was made in the performance of work under a NASA contract and is subject to the provisions of Public Law 96-517 (35 USC 202) in which the Contractor has elected not to retain title.
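The two-network arrangement described in the abstract can be sketched in a few lines of Python. This is only an illustration, not the patent's implementation: it assumes the standard hypercube definition (nodes are adjacent when their binary addresses differ in exactly one bit), and the function names and the `"WD"` label are hypothetical.

```python
def hypercube_neighbors(node: int, dim: int) -> list[int]:
    """Return the nodes adjacent to `node` in a dim-dimensional hypercube.

    Two nodes are adjacent when their binary addresses differ in one bit.
    """
    return [node ^ (1 << bit) for bit in range(dim)]


def build_networks(dim: int, watchdog: str = "WD") -> tuple[dict, dict]:
    """Build the two independent networks for one hypercube of 2**dim nodes."""
    nodes = range(2 ** dim)
    # First network (30): hypercube links used for node-to-node messages.
    first_network = {n: hypercube_neighbors(n, dim) for n in nodes}
    # Second network (34): a separate link from every node to the watch dog.
    # "WD" is a hypothetical label for the watch dog node (40).
    second_network = {n: watchdog for n in nodes}
    return first_network, second_network


first, second = build_networks(dim=3)
print(first[5])   # [4, 7, 1]
print(second[5])  # WD
```

Because the second mapping shares no links with the first, a failed hypercube link in this sketch does not cut a node off from the watch dog, which is the independence the abstract describes.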
147 Citations
42 Claims
1. A fault-tolerant multi-processor computer system of the hypercube type comprising:
(a) a plurality of first computing nodes;
(b) a first network of message conducting path means for interconnecting said first computing nodes as a hypercube, said first network providing a path for message transfer between said first computing nodes;
(c) a first watch dog node; and
(d) a second network of message conducting path means for directly connecting each of said first computing nodes to said first watch dog node independent from said first network, said second network providing an independent path for test message and reconfiguration affecting transfers between respective ones of said first computing nodes and said first watch dog node.
Dependent claims: 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15, 16
17. A fault-tolerant multi-processor computer system of the hypercube type comprising:
(a) a plurality of first computing nodes;
(b) a plurality of second computing nodes;
(c) a plurality of third computing nodes;
(d) a plurality of fourth computing nodes;
(e) a first network of message conducting path means for interconnecting said first computing nodes as a hypercube, said first network providing a path for message transfer between said first computing nodes;
(f) a second network of message conducting path means for interconnecting said first computing nodes as a hypercube, said second network providing a path for message transfer between said second computing nodes;
(g) a third network of message conducting path means for interconnecting said third computing nodes as a hypercube, said third network providing a path for message transfer between said third computing nodes;
(h) a fourth network of message conducting path means for interconnecting said fourth computing nodes as a hypercube, said fourth network providing a path for message transfer between said fourth computing nodes;
(i) a first watch dog node;
(j) a second watch dog node;
(k) a third watch dog node;
(l) a fourth watch dog node;
(m) a fifth network of message conducting path means for directly connecting each of said first computing nodes to said first watch dog node independent from said first network, said fifth network providing an independent path for test message and reconfiguration affecting transfers between respective ones of said first computing nodes and said first watch dog node;
(n) a sixth network of message conducting path means for directly connecting each of said second computing nodes to said second watch dog node independent from said second network, said sixth network providing an independent path for test message and reconfiguration affecting transfers between respective ones of said second computing nodes and said second watch dog node;
(o) a seventh network of message conducting path means for directly connecting each of said third computing nodes to said third watch dog node independent from said third network, said seventh network providing an independent path for test message and reconfiguration affecting transfers between respective ones of said third computing nodes and said third watch dog node;
(p) an eighth network of message conducting path means for directly connecting each of said fourth computing nodes to said fourth watch dog node independent from said fourth network, said eighth network providing an independent path for test message and reconfiguration affecting transfers between respective ones of said fourth computing nodes and said fourth watch dog node;
(q) first multiplexer means disposed between said first and second watch dog nodes and said fifth and sixth networks for allowing said first and second watch dog nodes to selectively communicate directly with individual ones of said first and second computing nodes through said fifth and sixth networks;
(r) second multiplexer means disposed between third and fourth watch dog nodes and said seventh and eighth networks for allowing said third and fourth watch dog nodes to selectively communicate directly with individual ones of said third and fourth computing nodes through said seventh and eighth networks;
(s) a first load balancing node;
(t) a second load balancing node;
(u) third multiplexer means connected between said first load balancing node and said first and second watch dog nodes for allowing said first load balancing node to selectively communicate directly with individual ones of said first and second watch dog nodes;
(v) fourth multiplexer means connected between said second load balancing node and said third and fourth watch dog nodes for allowing said second load balancing node to selectively communicate directly with individual ones of said third and fourth watch dog nodes;
(w) a host computer; and
(x) a ninth network of message conducting path means connecting said host computer to said first and second load balancing nodes for providing an independent path for message transfer between said host computer, said first load balancing node and said second load balancing node.
Dependent claims: 18, 19, 20, 21, 22, 23, 24, 25, 26, 27, 28, 29
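The hierarchy recited in claim 17 (host, two load balancing nodes, four watch dog nodes, four hypercubes) can be pictured as a small data structure. The sketch below is an illustration under stated assumptions: the labels (`LB1`, `WD1`, `cube1`, and so on) are hypothetical stand-ins for the claim's "first", "second", etc., and the fallback rule in `pick_watchdog` is an assumed policy, since the claim only says that the multiplexer lets both watch dog nodes of a pair reach both sets of computing nodes.

```python
# Hypothetical labels; claim 17 names the parts only as "first", "second", ...
BRANCHES = {
    "LB1": {"watchdogs": ["WD1", "WD2"], "cubes": ["cube1", "cube2"]},
    "LB2": {"watchdogs": ["WD3", "WD4"], "cubes": ["cube3", "cube4"]},
}
HOST_LINKS = ["LB1", "LB2"]  # ninth network: host <-> load balancing nodes


def watchdogs_reaching(cube: str) -> list[str]:
    """Watch dog nodes that can reach `cube` through their shared multiplexer."""
    for branch in BRANCHES.values():
        if cube in branch["cubes"]:
            return list(branch["watchdogs"])
    raise KeyError(f"unknown cube: {cube}")


def pick_watchdog(cube: str, failed: set[str]) -> str:
    """Pick the first non-failed watch dog for `cube` (assumed policy)."""
    for wd in watchdogs_reaching(cube):
        if wd not in failed:
            return wd
    raise RuntimeError(f"no working watch dog can reach {cube}")


print(pick_watchdog("cube1", failed=set()))    # WD1
print(pick_watchdog("cube1", failed={"WD1"}))  # WD2
```

The second call shows why the multiplexer matters in this model: with `WD1` marked as failed, `cube1` is still reachable by `WD2`.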
30. In a fault-tolerant multi-processor computer system of the hypercube type comprising a plurality of computing nodes and a watch dog node, the improved method of operation comprising the steps of:
(a) connecting a first network of message conducting paths to interconnect the computing nodes as a hypercube and provide a path for message transfer between said computing nodes;
(b) connecting a second network of message conducting paths to directly connect each of the computing nodes to the watch dog node to provide an independent path for test message and reconfiguration affecting transfers between respective ones of the computing nodes and the watch dog node;
(c) employing the first network for all message transfers between the computing nodes; and
(d) employing the second network for all test message and reconfiguration affecting transfers between respective ones of the computing nodes and the watch dog node.
Dependent claims: 31, 32, 33, 34, 35, 36, 37, 38, 39, 40, 41, 42
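Steps (c) and (d) of claim 30 amount to a rule for choosing a network by message type. A minimal sketch of that rule follows, assuming a hypothetical `Kind` enumeration, a hypothetical `"WD"` identifier for the watch dog node, and integer addresses for computing nodes; none of these names come from the patent.

```python
from enum import Enum


class Kind(Enum):
    DATA = "data"               # ordinary message between computing nodes
    TEST = "test"               # test message
    RECONFIG = "reconfiguration"


WATCHDOG = "WD"  # hypothetical identifier for the watch dog node


def select_network(kind: Kind, src, dst) -> str:
    """Return which network carries a message, following steps (c) and (d)."""
    endpoints = {src, dst}
    if kind is Kind.DATA:
        # Step (c): node-to-node traffic uses only the first (hypercube) network.
        if WATCHDOG in endpoints:
            raise ValueError("data messages run between computing nodes only")
        return "first"
    # Step (d): test and reconfiguration traffic uses only the second network,
    # which links each computing node directly to the watch dog node.
    if WATCHDOG not in endpoints:
        raise ValueError("test/reconfiguration messages involve the watch dog")
    return "second"


print(select_network(Kind.DATA, 0, 3))         # first
print(select_network(Kind.TEST, WATCHDOG, 3))  # second
```

Keeping the choice in one function mirrors the claim's separation: no message type in this sketch can be routed over both networks.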
Specification