Virtual message persistence service
First Claim
1. An apparatus, comprising:
an interface to communicate with client nodes and at least one distributed data repository node over a computer network;
a memory that stores a content map and at least one index map, the content map comprising one or more content map entries, each content map entry comprising a unique identifier and one or more record chunks associated with the unique identifier, each of the one or more record chunks comprising a binary data object, the at least one index map comprising one or more index map entries, each index map entry comprising a unique identifier corresponding to the one or more record chunks maintained in each content map entry of the content map and one or more record attribute values associated with corresponding ones of the binary data objects of the one or more record chunks maintained in each content map entry;
a mapping module that:
receives a request to insert a record from a first client node;
generates a unique identifier in response to the received request;
transmits the generated unique identifier to the first client node;
receives an insertion message including the transmitted unique identifier and at least one record attribute value of the record;
stores the at least one record attribute value in the received insertion message in the stored at least one index map at an index map entry in association with the unique identifier in the received insertion message;
receives record chunks of a data stream corresponding to the unique identifier in the received insertion message from the first client node;
stores the received record chunks in the stored content map at a content map entry in association with the unique identifier in the received insertion message;
receives a query from a second client node;
accesses one or more of the at least one index map to identify a unique identifier corresponding to one or more record chunks that satisfies the query;
provides, to the second client node, the record chunks associated with the identified unique identifier; and
streams additional record chunks associated with the identified unique identifier to the second client node as they are received from a third client node;
synchronizes record attribute values in the at least one index map with record attribute values of at least one index map maintained by the at least one distributed data repository node.
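The insert, query and streaming steps recited for the mapping module can be illustrated with a small in-memory sketch. It is only one possible reading of the claim language, not the patented implementation: the class name `MappingModule`, its method names, the use of `uuid4` for identifiers, and the callback-based streaming are all assumptions introduced for illustration, and synchronization with other repository nodes is left out.

```python
import uuid
from collections import defaultdict


class MappingModule:
    """Toy in-memory stand-in for the claimed mapping module (hypothetical API)."""

    def __init__(self):
        self.content_map = defaultdict(list)  # unique identifier -> record chunks (bytes)
        self.index_map = {}                   # unique identifier -> record attribute values
        self.subscribers = defaultdict(list)  # unique identifier -> callbacks for later chunks

    def request_insert(self):
        # Generate the unique identifier that is sent back to the first client node.
        return uuid.uuid4().hex

    def insert(self, uid, attributes):
        # Store the attribute values carried by the insertion message under the identifier.
        self.index_map[uid] = dict(attributes)

    def append_chunk(self, uid, chunk):
        # Store a record chunk, then forward it to any client already streaming this record.
        self.content_map[uid].append(chunk)
        for callback in self.subscribers[uid]:
            callback(chunk)

    def query(self, **criteria):
        # Return the identifiers whose attribute values match every criterion.
        return [uid for uid, attrs in self.index_map.items()
                if all(attrs.get(k) == v for k, v in criteria.items())]

    def stream(self, uid, callback):
        # Deliver the chunks stored so far, then register for chunks that arrive later.
        for chunk in list(self.content_map[uid]):
            callback(chunk)
        self.subscribers[uid].append(callback)


# First client node: obtain an identifier, send attributes, start sending chunks.
module = MappingModule()
uid = module.request_insert()
module.insert(uid, {"topic": "quotes"})
module.append_chunk(uid, b"chunk-1")

# Second client node: query by attribute and start receiving the matching record.
received = []
for match in module.query(topic="quotes"):
    module.stream(match, received.append)

# A later chunk (for example from a third client node) is pushed to the second client.
module.append_chunk(uid, b"chunk-2")
```

In this sketch the index map is consulted only to resolve a query to an identifier, while the content map holds the chunks themselves, mirroring the separation the claim draws between the two maps.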
Abstract
Methods, apparatuses and systems directed to a distributed data repository system including a plurality of symmetric data repository nodes. In certain embodiments of the present invention, the distributed data repository system is message-centric, operative to store message payloads transmitted from client nodes. In certain embodiments, the distributed data repository system is BLOB-centric, maintaining binary data objects and indexes of attribute values that map to the binary data objects. Of course, the present invention can be utilized to store a great variety of digital data contained in message payloads. According to certain embodiments of the present invention, the attribute indexes are fully replicated across all data repository nodes, while the message payloads (e.g., data objects or other content) are exchanged across data repository nodes as needed to fulfill client queries. In this manner, each data repository node in the distributed system can fulfill any client request, while reducing the storage and memory requirements for each data repository node. The reduced storage and computational requirements enable each distributed data repository node to be hosted by an inexpensive hardware platform and, therefore, allow for the deployment of large numbers of distributed data repository nodes to achieve a distributed data repository system featuring high availability and reliability. In certain embodiments, each distributed data repository node is further equipped to act as an instant messaging (or other one-way messaging) server to allow client nodes to establish instant messaging connections with the data repository nodes in the distributed system.
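The replication scheme described in the abstract (attribute indexes copied to every node, payloads exchanged only when a query needs them) might look roughly like the following sketch. The class `RepositoryNode`, its methods, and the choice to record the owning node's name inside each index entry are assumptions made for illustration; the source does not say how a node locates a payload held elsewhere.

```python
class RepositoryNode:
    """Hypothetical node: index replicated everywhere, payloads kept where written."""

    def __init__(self, name):
        self.name = name
        self.index_map = {}    # payload identifier -> (attribute values, owning node name)
        self.content_map = {}  # payload identifier -> payload bytes held locally
        self.peers = {}        # node name -> RepositoryNode

    def connect(self, other):
        # Register the two nodes as peers of each other.
        self.peers[other.name] = other
        other.peers[self.name] = self

    def store(self, payload_id, attributes, payload):
        # Keep the payload locally and record its attributes in the local index.
        self.content_map[payload_id] = payload
        entry = (dict(attributes), self.name)
        self.index_map[payload_id] = entry
        # Replicate only the index entry to every peer, not the payload itself.
        for peer in self.peers.values():
            peer.index_map[payload_id] = entry

    def query(self, **criteria):
        # Any node can answer a query, because every node holds the full index.
        return [pid for pid, (attrs, _) in self.index_map.items()
                if all(attrs.get(k) == v for k, v in criteria.items())]

    def fetch(self, payload_id):
        # Serve locally if possible; otherwise pull the payload from the owning node.
        if payload_id in self.content_map:
            return self.content_map[payload_id]
        _, owner = self.index_map[payload_id]
        return self.peers[owner].content_map[payload_id]


a, b = RepositoryNode("a"), RepositoryNode("b")
a.connect(b)
a.store("m1", {"sender": "alice"}, b"hello")

hits = b.query(sender="alice")  # answered from node b's replicated index
payload = b.fetch(hits[0])      # payload pulled from node a on demand
```

Node `b` never stores the payload in this sketch, which is the storage saving the abstract points to; whether a real node would cache fetched payloads is not addressed here.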
10 Claims
6. A distributed data repository system, comprising:
at least two distributed repository nodes, each distributed repository node comprising:
a memory that stores a content map and at least one index map, the content map containing at least one message payload stored in association with a message payload identifier, wherein the at least one message payload comprises a binary data object, and the at least one index map containing at least one content attribute value associated with a corresponding binary data object and a corresponding message payload identifier;
wherein each distributed repository node further comprises a mapping module that:
receives a request to insert a record from a first client node;
generates a first unique message payload identifier in response to the received request;
transmits the generated first unique message payload identifier to the first client node;
receives, from the first client node, an insertion message including the transmitted first unique message payload identifier and at least one content attribute value of the record;
stores the at least one record attribute value in the received insertion message in the at least one index map at an entry in association with the first unique message payload identifier in the received insertion message;
receives, from the first client node, message payloads of a data stream corresponding to the first unique message payload identifier in the received insertion message;
stores the received message payloads in the content map at an entry in association with the first unique message payload identifier in the received insertion message;
receives a query from a second client node;
accesses one or more of the stored at least one index map to identify a unique message payload identifier corresponding to one or more message payloads that satisfies the query;
provides, to the second client node, the one or more message payloads associated with the identified unique message payload identifier;
streams additional message payloads associated with the identified unique message payload identifier to the second client node as they are received from a third client node;
synchronizes content attribute values in the at least one index map with content attribute values of at least one index map maintained by the at least one other distributed data repository node.
Specification