Reliable map-reduce communications in a decentralized, self-organizing communication orbit of a distributed network
First Claim
1. A method of providing message communications with failure detection and recovery in a linear communication orbit formed by a non-static collection of machines, the method comprising:
at a respective machine of the non-static collection of machines forming the linear communication orbit:
identifying, from among the non-static collection of machines, a respective set of forward contacts that comprises a set of machines distributed in a forward direction along the linear communication orbit;
monitoring a respective propagation state of a first query that has departed from the respective machine to travel in the forward direction along the linear communication orbit, wherein the monitoring includes updating the respective propagation state of the first query based on a predetermined timeout for the respective propagation state; and
upon detecting a respective propagation failure of the first query based on the monitoring, sending the first query directly to a first forward contact among the set of forward contacts to initiate a respective failure recovery process within at least part of a respective segment of the linear communication orbit between the respective machine and the first forward contact of the respective machine.
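The three steps of the claim (identify forward contacts, monitor the propagation state against a timeout, and resend directly to a forward contact on failure) can be illustrated with a minimal Python sketch. Everything below is an assumption made for illustration and is not taken from the patent: the names `forward_contacts`, `QueryMonitor`, and `recover`, the hop distances used to choose contacts, the three-value state model, and the rule of picking the nearest reachable contact.

```python
from dataclasses import dataclass
from enum import Enum


class PropagationState(Enum):
    IN_FLIGHT = "in_flight"  # query has departed, no acknowledgement yet
    DELIVERED = "delivered"  # acknowledgement arrived before the timeout
    FAILED = "failed"        # timeout elapsed without acknowledgement


def forward_contacts(orbit, index, hops=(1, 2, 4)):
    """Return machines located ahead of orbit[index] along the orbit.

    The hop distances are a hypothetical choice; the claim only requires
    a set of machines distributed in the forward direction.
    """
    return [orbit[index + h] for h in hops if index + h < len(orbit)]


@dataclass
class QueryMonitor:
    """Tracks one departed query against a predetermined timeout."""
    timeout: float
    sent_at: float
    state: PropagationState = PropagationState.IN_FLIGHT

    def acknowledge(self):
        # Only a query that is still in flight can become delivered.
        if self.state is PropagationState.IN_FLIGHT:
            self.state = PropagationState.DELIVERED

    def update(self, now):
        # Mark the query failed once the timeout has elapsed unacknowledged.
        expired = now - self.sent_at >= self.timeout
        if self.state is PropagationState.IN_FLIGHT and expired:
            self.state = PropagationState.FAILED
        return self.state


def recover(query, contacts, is_reachable, send):
    """Send the query directly to the nearest reachable forward contact.

    `is_reachable` and `send` are hypothetical callables supplied by the
    caller. Returns the chosen contact, or None if none is reachable.
    """
    for contact in contacts:
        if is_reachable(contact):
            send(contact, query)
            return contact
    return None


if __name__ == "__main__":
    orbit = ["A", "B", "C", "D", "E", "F"]
    contacts = forward_contacts(orbit, 0)
    monitor = QueryMonitor(timeout=5.0, sent_at=0.0)
    if monitor.update(now=6.0) is PropagationState.FAILED:
        # Pretend machine "B" has left the orbit, so the query skips past it.
        target = recover("q1", contacts, lambda m: m != "B",
                         lambda m, q: print("resend", q, "to", m))
```

In this sketch, skipping directly to a forward contact bypasses the segment in which the query stalled; how that segment is then repaired is outside what the sketch covers.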
Abstract
Method and system for providing message communications with failure detection and recovery are disclosed. At a respective node of a non-static collection of nodes forming a linear communication orbit: the node identifies, from among the non-static collection of nodes, a set of forward contacts distributed in a forward direction along the linear communication orbit; the node monitors a propagation state of a first query that has departed from the respective node to travel in the forward direction along the linear communication orbit; and upon detecting a propagation failure of the first query based on the monitoring, the node sends the first query directly to a first forward contact among the set of forward contacts to initiate a failure recovery process within at least part of a segment of the linear communication orbit between the respective node and the first forward contact of the respective node.
61 Citations
23 Claims
1. A method of providing message communications with failure detection and recovery in a linear communication orbit formed by a non-static collection of machines, the method comprising:
at a respective machine of the non-static collection of machines forming the linear communication orbit:
identifying, from among the non-static collection of machines, a respective set of forward contacts that comprises a set of machines distributed in a forward direction along the linear communication orbit;
monitoring a respective propagation state of a first query that has departed from the respective machine to travel in the forward direction along the linear communication orbit, wherein the monitoring includes updating the respective propagation state of the first query based on a predetermined timeout for the respective propagation state; and
upon detecting a respective propagation failure of the first query based on the monitoring, sending the first query directly to a first forward contact among the set of forward contacts to initiate a respective failure recovery process within at least part of a respective segment of the linear communication orbit between the respective machine and the first forward contact of the respective machine.
Dependent claims: 2-16, 23
17. A non-transitory computer-readable medium, having instructions stored thereon, which when executed by one or more processors cause the processors to perform operations comprising:
at a respective machine of a non-static collection of machines forming a linear communication orbit:
identifying, from among the non-static collection of machines, a respective set of forward contacts that comprises a set of machines distributed in a forward direction along the linear communication orbit;
monitoring a respective propagation state of a first query that has departed from the respective machine to travel in the forward direction along the linear communication orbit, wherein the monitoring includes updating the respective propagation state of the first query based on a predetermined timeout for the respective propagation state; and
upon detecting a respective propagation failure of the first query based on the monitoring, sending the first query directly to a first forward contact among the set of forward contacts to initiate a respective failure recovery process within at least part of a respective segment of the linear communication orbit between the respective machine and the first forward contact of the respective machine.
Dependent claims: 18, 19
20. A system, comprising:
one or more processors; and
memory having instructions stored thereon, which when executed by the one or more processors cause the processors to perform operations comprising:
at a respective machine of a non-static collection of machines forming a linear communication orbit:
identifying, from among the non-static collection of machines, a respective set of forward contacts that comprises a set of machines distributed in a forward direction along the linear communication orbit;
monitoring a respective propagation state of a first query that has departed from the respective machine to travel in the forward direction along the linear communication orbit, wherein the monitoring includes updating the respective propagation state of the first query based on a predetermined timeout for the respective propagation state; and
upon detecting a respective propagation failure of the first query based on the monitoring, sending the first query directly to a first forward contact among the set of forward contacts to initiate a respective failure recovery process within at least part of a respective segment of the linear communication orbit between the respective machine and the first forward contact of the respective machine.
Dependent claims: 21, 22
Specification