Recovery in a distributed stateful publish-subscribe system
First Claim
1. A method for fault recovery in a stateful publish-subscribe system, the method comprising:
- providing the stateful publish-subscribe system, the system including an overlay network, wherein the overlay network comprises a publisher that first transmits a plurality of structured messages through the overlay network, an object downstream from the publisher that receives the plurality of structured messages as a plurality of input messages from the publisher and applies at least one transform to the plurality of input messages to form at least one output message that is transmitted to a subscriber downstream from the object, and the subscriber that receives the at least one output message from the object;
storing a history of the plurality of structured messages in a stable storage at the publisher;
detecting missing information with respect to a message transmitted downstream through the overlay network from the plurality of input messages to the at least one output message, wherein view objects using protocols detect the missing information, wherein the view objects are represented by a set of values in a monotonic domain, wherein the monotonic domain is the set of values in a partial order, wherein detecting the missing information with respect to a message includes detecting a gap in the set of values, and wherein the gap is an indication that the message is lost or not arrived at all;
transmitting an inquiry message requesting the missing information upstream through the overlay network to the object;
determining whether the missing information is stored in the object using the inquiry message;
responsive to a determination that the missing information is stored in the object, responding to the inquiry message by reapplying the at least one transform to the missing information and transmitting the missing information downstream through the overlay network;
responsive to a determination that the missing information is not stored in the object, transmitting the inquiry message from the object to the publisher;
responding to the inquiry message by retrieving the missing information from the plurality of structured messages in the stable storage at the publisher and transmitting the missing information from the publisher downstream through the overlay network.
1 Assignment
0 Petitions
Accused Products
Abstract
Method, apparatus and computer program product for fault recovery in a distributed stateful publish-subscribe system. The system includes the capability of recovering from failures that may occur when a stateful publish-subscribe service is implemented on an overlay network. Such failures may include, for example, temporary crashes of broker machines, and network errors causing messages to possibly be lost, duplicated or delivered out of order. The system requires stable storage logging only when a published event enters the system, and requires that logged messages be retrieved from stable storage only in the event all brokers between a failed link or broker and the publishing sites have failed. The publish-subscribe system of the present invention does not require that broker-to-broker connections use reliable FIFO protocols, such as TCP/IP, but may advantageously use faster, less reliable protocols.
61 Citations
16 Claims
-
1. A method for fault recovery in a stateful publish-subscribe system, the method comprising:
-
providing the stateful publish-subscribe system, the system including an overlay network, wherein the overlay network comprises a publisher that first transmits a plurality of structured messages through the overlay network, an object downstream from the publisher that receives the plurality of structured messages as a plurality of input messages from the publisher and applies at least one transform to the plurality of input messages to form at least one output message that is transmitted to a subscriber downstream from the object, and the subscriber that receives the at least one output message from the object; storing a history of the plurality of structured messages in a stable storage at the publisher; detecting missing information with respect to a message transmitted downstream through the overlay network from the plurality of input messages to the at least one output message, wherein view objects using protocols detect the missing information, wherein the view objects are represented by a set of values in a monotonic domain, wherein the monotonic domain is the set of values in a partial order, wherein detecting the missing information with respect to a message includes detecting a gap in the set of values, and wherein the gap is an indication that the message is lost or not arrived at all; transmitting an inquiry message requesting the missing information upstream through the overlay network to the object; determining whether the missing information is stored in the object using the inquiry message; responsive to a determination that the missing information is stored in the object, responding to the inquiry message by reapplying the at least one transform to the missing information and transmitting the missing information downstream through the overlay network; responsive to a determination that the missing information is not stored in the object, transmitting the inquiry message from the object to the publisher; responding to the inquiry message by retrieving the missing information from the plurality of structured messages in the stable storage at the publisher and transmitting the missing information from the publisher downstream through the overlay network. - View Dependent Claims (2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14)
-
-
15. A stateful publish-subscribe system comprising:
-
an overlay network having a plurality of broker machines; at least one publishing client that publishes a plurality of messages to a plurality of published message streams; at least one intermediate client that receives the plurality of messages and applies at least one transform to the plurality of messages to form at least one output message that is transmitted to at least one subscribing client downstream from the intermediate client; at least one subscribing client that requests a view of the at least one output message, wherein at least one of the at least one subscribing client requests a stateful view in which at least one update to the stateful view depends upon more than one of the at least one output message; a hypergraph defining transform objects and view objects deployed over the overlay network; a stable storage for storing the plurality of messages at the publishing client; a first protocol for detecting missing information using the view objects with respect to a message transmitted downstream through the overlay network, wherein the view objects are represented by a set of values in a monotonic domain, wherein the monotonic domain is the set of values in a partial order, wherein detecting the missing information with respect to a message includes detecting a gap in the set of values, and wherein the gap is an indication that the message is lost or not arrived at all; a second protocol for transmitting an inquiry message requesting the missing information upstream through the overlay network to the at least one intermediate client; a third protocol for determining whether the missing information is stored in the at least one intermediate client using the inquiry message; a fourth protocol for, responsive to the third protocol determining that the missing information is stored in the at least one intermediate client, receiving and responding to the inquiry message by reprocessing the missing information using the transform objects, and transmitting the missing information downstream through the overlay network; a fifth protocol for, responsive to the third protocol determining that the missing information is not stored in the at least one intermediate client, transmitting the inquiry message from the at least one intermediate client to the at least one publishing client; a sixth protocol for responding to the inquiry message by retrieving the missing information from the plurality of structured messages in the stable storage at the publishing client and transmitting the missing information from the publishing client downstream through the overlay network.
-
-
16. A computer program product comprising:
-
a computer recordable-type medium including computer usable program code for fault recovery in a stateful publish-subscribe system, the computer program product comprising; computer usable program code for providing the stateful publish-subscribe system, wherein the overlay network comprises a publisher that first transmits a plurality of structured messages through the overlay network, an object downstream from the publisher that receives the plurality of structured messages as a plurality of input messages from the publisher and applies at least one transform to the plurality of input messages to form at least one output message that is transmitted to a subscriber downstream from the object, and the subscriber that receives the at least one output message from the object; computer usable program code for storing a history of the plurality of structured messages in a stable storage at the publisher; computer usable program code for detecting missing information with respect to a message transmitted downstream through the overlay network from the plurality of input messages to the at least one output message, wherein view objects using protocols detect the missing information, wherein the view objects are represented by a set of values in a monotonic domain, wherein the monotonic domain is the set of values in a partial order, wherein detecting the missing information with respect to a message includes detecting a gap in the set of values, and wherein the gap is an indication that the message is lost or not arrived at all; computer usable program code for transmitting an inquiry message requesting the missing information upstream through the overlay network to the object; computer usable program code for determining whether the missing information is stored in the object using the inquiry message; computer usable program code for responding to the inquiry message by reapplying the at least one transform to the determined missing information and transmitting the missing information downstream through the overlay network in response to a determination that the missing information is stored in the object; computer usable program code for transmitting the inquiry message from the object to the publisher in response to a determination that the missing information is not stored in the object; computer usable program code for responding to the inquiry message by retrieving the missing information from the plurality of structured messages in the stable storage at the publisher and the missing information from the publisher downstream through the overlay network.
-
Specification