Optimizing Data Transmission Bandwidth Consumption Over a Wide Area Network
First Claim
1. A method for optimizing data transmission bandwidth consumption over a wide area network, comprising:
partitioning a data message to be communicated from a first data site to a second data site into a plurality of data chunks;
generating a data chunk identifier for each of the plurality of data chunks;
determining whether the plurality of data chunks are stored at the second data site;
when at least one data chunk is not stored at the second data site, adding the data chunk identifier for each data chunk not stored at the second data site to a data structure at the first data site; and
sending a transformed data message from the first data site to the second data site, wherein the transformed data message comprises:
when at least one of the plurality of data chunks is stored at the second data site, at least one tuple, wherein the at least one tuple is to be used to reconstruct the data message, and
when at least one data chunk is not stored at the second data site, the at least one data chunk not stored at the second site.
Abstract
An exemplary embodiment includes partitioning a data message to be communicated from a first data site to a second data site into data chunks; generating a data chunk identifier for each data chunk; determining whether the data chunks are stored at the second data site; when at least one data chunk is not stored at the second data site, adding the data chunk identifier for each data chunk not stored at the second data site to a data structure at the first data site; and sending a transformed data message from the first data site to the second data site. When at least one data chunk is already stored at the second data site, the transformed data message includes, in place of that data chunk, at least one tuple that enables the data message to be reconstructed at the second data site without resending the previously stored data chunk. The transformed data message also includes each data chunk not stored at the second data site.
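As a rough illustration of why replacing already-stored chunks with tuples reduces bandwidth, the following sketch estimates the payload size of a transformed data message. The function name, the 4096-byte chunk size, and the 32-byte identifier size are assumptions made for this example only; the abstract does not specify them, and the estimate ignores any per-tuple framing overhead.

```python
def wire_bytes(total_chunks: int, stored_chunks: int,
               chunk_size: int = 4096, id_size: int = 32) -> int:
    """Estimate payload bytes for a transformed data message.

    Hypothetical model: each chunk already stored at the second data site
    costs one identifier-sized tuple, and each missing chunk is sent whole.
    """
    if not 0 <= stored_chunks <= total_chunks:
        raise ValueError("stored_chunks must be between 0 and total_chunks")
    new_chunks = total_chunks - stored_chunks
    return stored_chunks * id_size + new_chunks * chunk_size


# A 100-chunk message where 90 chunks are already at the second data site.
original = 100 * 4096               # 409600 bytes if sent untransformed
transformed = wire_bytes(100, 90)   # 90 * 32 + 10 * 4096 = 43840 bytes
```

Under these assumed sizes, the saving grows with the fraction of chunks the second data site already holds; when no chunk is stored there, the transformed message is as large as the original.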
24 Claims
1. A method for optimizing data transmission bandwidth consumption over a wide area network, comprising:
partitioning a data message to be communicated from a first data site to a second data site into a plurality of data chunks;
generating a data chunk identifier for each of the plurality of data chunks;
determining whether the plurality of data chunks are stored at the second data site;
when at least one data chunk is not stored at the second data site, adding the data chunk identifier for each data chunk not stored at the second data site to a data structure at the first data site; and
sending a transformed data message from the first data site to the second data site, wherein the transformed data message comprises:
when at least one of the plurality of data chunks is stored at the second data site, at least one tuple, wherein the at least one tuple is to be used to reconstruct the data message, and
when at least one data chunk is not stored at the second data site, the at least one data chunk not stored at the second site.
Dependent claims: 2, 3, 4, 5, 6, 7, 8, 9
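The sender-side steps of claim 1 can be sketched as follows. This is a minimal illustration under stated assumptions, not the patented implementation: fixed-size chunking, SHA-256 as the data chunk identifier, a Python set as the data structure at the first data site, and the `("ref", identifier)` tuple layout are all hypothetical choices that the claim does not specify.

```python
from __future__ import annotations

import hashlib

CHUNK_SIZE = 4  # hypothetical fixed chunk size, kept tiny for illustration


def partition(message: bytes, size: int = CHUNK_SIZE) -> list[bytes]:
    """Split the data message into fixed-size chunks (assumed scheme)."""
    return [message[i:i + size] for i in range(0, len(message), size)]


def chunk_id(chunk: bytes) -> str:
    """Data chunk identifier; SHA-256 is an assumption, not from the claim."""
    return hashlib.sha256(chunk).hexdigest()


def transform(message: bytes, remote_ids: set[str]) -> list[tuple]:
    """Build a transformed data message at the first data site.

    remote_ids is the data structure at the first data site that records
    identifiers of chunks believed to be stored at the second data site.
    """
    transformed = []
    for chunk in partition(message):
        cid = chunk_id(chunk)
        if cid in remote_ids:
            # Chunk already at the second site: send a tuple, not the data.
            transformed.append(("ref", cid))
        else:
            # Chunk not at the second site: record its identifier, send it.
            remote_ids.add(cid)
            transformed.append(("data", chunk))
    return transformed


index: set[str] = set()
first = transform(b"abcdabcdwxyz", index)   # data, ref, data
second = transform(b"abcdwxyz", index)      # both entries are refs
```

In this sketch the determining step is a lookup in the local set, so no round trip to the second data site is needed before sending; other ways of making that determination are equally consistent with the claim language.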
10. A method for optimizing data transmission bandwidth consumption over a network, comprising:
receiving a transformed data message at a second data site; and
when the transformed data message comprises at least one data chunk, generating a data chunk identifier for each data chunk in the transformed data message, adding the data chunk identifier for each data chunk in the transformed data message to a data structure at the second data site, and storing each data chunk in the transformed message in a storage repository at the second data site.
Dependent claims: 11, 12, 13
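The receiver-side steps of claim 10 can be sketched in the same spirit. The class name, the SHA-256 identifier, the set and dict used for the data structure and storage repository, and the `("data", chunk)` / `("ref", identifier)` entry layout are assumptions for illustration; the reconstruction step shown here is also an assumption, since claim 10 itself recites only the identifier generation and storage.

```python
from __future__ import annotations

import hashlib


class SecondSite:
    """Hypothetical receiver at the second data site."""

    def __init__(self) -> None:
        self.known_ids: set[str] = set()         # data structure of identifiers
        self.repository: dict[str, bytes] = {}   # storage repository of chunks

    def receive(self, transformed: list[tuple]) -> bytes:
        """Store new chunks and rebuild the original data message."""
        parts = []
        for kind, payload in transformed:
            if kind == "data":
                # New chunk: generate its identifier, index it, store it.
                cid = hashlib.sha256(payload).hexdigest()
                self.known_ids.add(cid)
                self.repository[cid] = payload
                parts.append(payload)
            else:
                # Tuple referencing a chunk that is already stored here.
                parts.append(self.repository[payload])
        return b"".join(parts)


site = SecondSite()
ref = hashlib.sha256(b"abcd").hexdigest()
message = site.receive([("data", b"abcd"), ("ref", ref), ("data", b"wxyz")])
# message == b"abcdabcdwxyz"
```

Because entries are processed in order, a reference may point to a chunk that arrived earlier in the same transformed message as well as to one stored from a previous message.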
14. A system for optimizing data transmission bandwidth consumption over a wide area network, comprising:
a data structure at a first data site configured to store a plurality of data chunk identifiers; and
a data deduplication node at the first data site, the deduplication node comprises:
a data partition module configured to partition a data message to be communicated from the first data site to a second data site into a plurality of data chunks,
a data chunk identifier generation module coupled to the data partition module and configured to generate a data chunk identifier for each of the plurality of data chunks,
a determination module coupled to the data chunk identifier generation module and configured to determine whether the plurality of data chunks are stored at the second data site,
a data structure management module coupled to the determination module and configured to add the data chunk identifier for each data chunk not stored at the second data site to the data structure at the first data site when at least one of the plurality of data chunks is not stored at the second data site, and
a transmission module coupled to the data structure management module and configured to send a transformed data message from the first data site to the second data site, wherein the transformed data message comprises:
when at least one data chunk is stored at the second data site, at least one tuple, wherein the at least one tuple is to be used to reconstruct the data message, and
when at least one data chunk is not stored at the second data site, the at least one data chunk not stored at the second site.
Dependent claims: 15, 16, 17, 18, 19, 20, 21
22. A computer program product for optimizing data transmission bandwidth consumption over a wide area network, comprising:
a computer readable storage medium having computer readable program code embodied therewith, the computer readable program code comprising:
computer readable program code configured to partition a data message to be communicated from a first data site to a second data site into a plurality of data chunks,
computer readable program code configured to generate a data chunk identifier for each of the plurality of data chunks,
computer readable program code configured to determine whether the plurality of data chunks are stored at the second data site,
computer readable program code configured to add the data chunk identifier for each data chunk not stored at the second data site to a data structure at the first data site, when at least one data chunk is not stored at the second data site; and
computer readable program code configured to send a transformed data message from the first data site to the second data site, wherein the transformed data message comprises:
when at least one data chunk is stored at the second data site, at least one tuple, wherein the at least one tuple is to be used to reconstruct the data message, and
when at least one data chunk is not stored at the second data site, the at least one data chunk not stored at the second site.
Dependent claims: 23, 24
Specification