Stream-based data deduplication with peer node prediction
Abstract
Stream-based data deduplication is provided in a multi-tenant shared infrastructure but without requiring “paired” endpoints having synchronized data dictionaries. Data objects processed by the dedupe functionality are treated as objects that can be fetched as needed. As such, a decoding peer does not need to maintain a symmetric library for the origin. Rather, if the peer does not have the chunks in cache that it needs, it follows a conventional content delivery network procedure to retrieve them. In this way, if dictionaries between pairs of sending and receiving peers are out-of-sync, relevant sections are then re-synchronized on-demand. The approach does not require that libraries maintained at a particular pair of sender and receiving peers are the same. Rather, the technique enables a peer, in effect, to “backfill” its dictionary on-the-fly. On-the-wire compression techniques are provided to reduce the amount of data transmitted between the peers.
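The on-demand "backfill" behavior described in the abstract can be illustrated with a minimal sketch. It is an illustration under stated assumptions, not the patented implementation: the `Peer` class, the SHA-256 fingerprints, the `("raw", ...)` / `("ref", ...)` token format, and the `fetch_chunk` callback (standing in for the conventional content delivery network fetch) are all hypothetical names introduced here.

```python
import hashlib
from typing import Callable, Dict, List, Optional, Tuple

# Hypothetical wire token: ("raw", chunk_bytes) or ("ref", fingerprint_bytes).
Token = Tuple[str, bytes]


def fingerprint(chunk: bytes) -> bytes:
    """Assumed fingerprint function; the source does not name one."""
    return hashlib.sha256(chunk).digest()


class Peer:
    """Illustrative peer holding a dictionary of chunks keyed by fingerprint."""

    def __init__(self, fetch_chunk: Optional[Callable[[bytes], bytes]] = None):
        self.dictionary: Dict[bytes, bytes] = {}
        # Stand-in for the ordinary CDN fetch used when a chunk is missing.
        self.fetch_chunk = fetch_chunk
        self.backfills = 0

    def encode(self, chunks: List[bytes]) -> List[Token]:
        """Replace chunks already seen by this peer with references."""
        tokens: List[Token] = []
        for chunk in chunks:
            fp = fingerprint(chunk)
            if fp in self.dictionary:
                tokens.append(("ref", fp))
            else:
                self.dictionary[fp] = chunk
                tokens.append(("raw", chunk))
        return tokens

    def decode(self, tokens: List[Token]) -> bytes:
        """Rebuild the stream, fetching any referenced chunk not held locally."""
        out: List[bytes] = []
        for kind, payload in tokens:
            if kind == "raw":
                self.dictionary[fingerprint(payload)] = payload
                out.append(payload)
                continue
            chunk = self.dictionary.get(payload)
            if chunk is None:
                # Dictionaries are out of sync: backfill this entry on demand.
                if self.fetch_chunk is None:
                    raise KeyError("missing chunk and no fetch path configured")
                chunk = self.fetch_chunk(payload)
                self.dictionary[payload] = chunk
                self.backfills += 1
            out.append(chunk)
        return b"".join(out)
```

In this sketch a receiving peer with an empty dictionary can still decode a stream made entirely of references, because each missing chunk is fetched once and then kept, so a second decode of the same stream needs no further fetches.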
13 Claims
1. A method operative in an overlay network comprising a sending peer and a receiving peer, wherein the sending and receiving peers provide stream-based data deduplication by examining data that flows through the sending peer and receiving peer and replacing blocks of the data with references that point into data dictionaries associated with each of the peers, the method comprising:
maintaining a directed cyclic graph in association with the sending peer;
maintaining a directed cyclic graph in association with the receiving peer;
wherein each directed cyclic graph represents temporal and ordered relationships among blocks of data that have been seen in the data stream by the respective peer, the directed cyclic graph being annotated with information from which the respective peer can generate a prediction about blocks of data that are subject to the stream-based data deduplication;
in response to receipt at the receiving peer of a request for a page, the receiving peer generating a hinting request that predicts what blocks of data the sending peer is expected to utilize during stream-based data deduplication of the page;
upon receipt of the hinting request at the sending peer, the sending peer generating a hinting response that predicts what blocks of data are expected to compose the page; and
returning the hinting response to the receiving peer to facilitate a pre-warming operation at the receiving peer during the stream-based data deduplication of the page;
wherein the hinting request and the hinting response are generated in software executing in a hardware element.
Dependent claims: 2, 3, 4, 5, 6, 7, 8
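The annotated graph and the hinting exchange recited in claim 1 can be sketched roughly as follows. This is a simplified illustration, not the claimed method itself. It assumes that the annotation is a per-edge transition count and that a prediction is a greedy walk along the most frequent successors. `BlockGraph`, `hinting_request`, `hinting_response`, and `blocks_to_prewarm` are hypothetical names, and the page key is used as the entry node of the graph.

```python
from collections import defaultdict
from typing import Dict, List, Set


class BlockGraph:
    """Directed graph (cycles allowed) of block-to-block transitions."""

    def __init__(self) -> None:
        # edges[a][b] = how many times block b was seen right after a.
        self.edges: Dict[str, Dict[str, int]] = defaultdict(
            lambda: defaultdict(int)
        )

    def observe(self, page: str, block_ids: List[str]) -> None:
        """Record the ordered blocks seen for a page in the data stream."""
        prev = page
        for block_id in block_ids:
            self.edges[prev][block_id] += 1
            prev = block_id

    def predict(self, page: str, limit: int = 16) -> List[str]:
        """Follow the most frequent successor from the page node onward."""
        predicted: List[str] = []
        seen: Set[str] = {page}
        node = page
        while len(predicted) < limit:
            successors = self.edges.get(node)
            if not successors:
                break
            # Skip nodes already visited so a cycle cannot loop forever.
            candidates = {b: n for b, n in successors.items() if b not in seen}
            if not candidates:
                break
            node = max(candidates, key=candidates.get)
            seen.add(node)
            predicted.append(node)
        return predicted


def hinting_request(receiver_graph: BlockGraph, page: str) -> dict:
    """Receiving peer: predict the blocks the sender is expected to use."""
    return {"page": page, "expected": receiver_graph.predict(page)}


def hinting_response(sender_graph: BlockGraph, request: dict) -> dict:
    """Sending peer: predict the blocks expected to compose the page."""
    page = request["page"]
    return {"page": page, "expected": sender_graph.predict(page)}


def blocks_to_prewarm(response: dict, cached: Set[str]) -> List[str]:
    """Receiving peer: predicted blocks that are not yet held locally."""
    return [b for b in response["expected"] if b not in cached]
```

A receiving peer would call `hinting_request` when a page request arrives, pass the result to the sending peer, and use `blocks_to_prewarm` on the response to decide which blocks to load before the deduplicated stream arrives. A real system would likely weigh more than raw counts, which the source leaves unspecified here.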
9. A method operative in an overlay network comprising a sending peer and a receiving peer, wherein the sending and receiving peers provide stream-based data deduplication by examining data that flows through the sending peer and receiving peer and replacing blocks of the data with references that point into data dictionaries associated with each of the peers, the sending peer associated with an origin, and the receiving peer associated with an overlay network edge, the method comprising:
maintaining a directed cyclic graph in association with the sending peer;
maintaining a directed cyclic graph in association with the receiving peer;
wherein each directed cyclic graph represents temporal and ordered relationships among blocks of data that have been seen in the data stream by the respective peer, the directed cyclic graph being annotated with information from which the respective peer can generate a prediction about blocks of data that are subject to the stream-based data deduplication;
using the annotated directed cyclic graphs to enforce a compression protocol across the sending and receiving peers wherein, in response to receipt at the receiving peer of a request for a page hosted at the origin, the page and the embedded objects of the page are pre-warmed into the receiving peer and delivered to a requested client in one round trip as measured from the requesting client to the origin;
wherein the compression protocol is carried out in software executing in one or more hardware elements.
Dependent claims: 10, 11, 12, 13
Specification