Systems and methods for data processing
Abstract
Systems and methods are provided for data processing. For example, first digest values associated with first contents of a plurality of first data points are calculated, the plurality of first data points including a second data point and one or more third data points; a second digest value associated with a second content of the second data point is compared with one or more third digest values associated with third contents of the third data points, the third data points preceding the second data point; in response to the second digest value being the same as a fourth digest value associated with a fourth content of a fourth data point, the second content of the second data point is deleted, the fourth data point being within the one or more third data points; and a mapping between the second digest value and the fourth content is established.
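The flow described in the abstract can be illustrated with a minimal sketch. It is not the patented implementation: the `DataPoint` container, the function names, and the choice of SHA-256 as the digest are all assumptions made for illustration, since the text only speaks of "digest values". Each data point's digest is compared against the digests of the points that precede it; on a match, the later point's content is deleted and the digest is mapped to the retained content.

```python
import hashlib
from dataclasses import dataclass
from typing import Optional


@dataclass
class DataPoint:
    """Hypothetical container: a named data point whose content may be deleted."""
    name: str
    content: Optional[bytes]


def digest_of(content: bytes) -> str:
    # SHA-256 is an assumption; the source only says "digest value".
    return hashlib.sha256(content).hexdigest()


def deduplicate(points: list[DataPoint]) -> dict[str, bytes]:
    """Delete content that repeats a preceding point's content.

    Returns a mapping from each duplicated digest to the retained content.
    """
    seen: dict[str, DataPoint] = {}   # digest -> earliest point with that digest
    mapping: dict[str, bytes] = {}    # digest -> content kept for deleted copies
    for point in points:
        if point.content is None:
            continue
        d = digest_of(point.content)
        earlier = seen.get(d)
        if earlier is not None:
            # Same digest as a preceding point: drop this copy and
            # record where the content can still be found.
            point.content = None
            mapping[d] = earlier.content
        else:
            seen[d] = point
    return mapping


points = [
    DataPoint("a", b"hello"),
    DataPoint("b", b"world"),
    DataPoint("c", b"hello"),
]
mapping = deduplicate(points)
```

In this example the content of point `c` is deleted because its digest matches that of the preceding point `a`, and `mapping` holds a single entry from that digest to `b"hello"`.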
12 Claims
1. A method for data processing comprising:
calculating first digest values associated with first contents of a plurality of first data points, the plurality of first data points including a second data point and one or more third data points;
comparing a second digest value associated with a second content of the second data point with one or more third digest values associated with third contents of the one or more third data points, the one or more third data points preceding the second data point;
in response to the second digest value being the same as a fourth digest value associated with a fourth content of a fourth data point, deleting the second content of the second data point, the fourth data point being within the one or more third data points; and
establishing a mapping between the second digest value and the fourth content,
wherein the method further comprises:
classifying data points into a plurality of data-point groups according to different content sizes of the data points, the classified data points for each data-point group having a same content size; and
assigning a plurality of fifth data points into a given data-point group, and
wherein the comparing is performed between digest values of contents of the plurality of fifth data points that belong to the given data-point group, and not performed between a digest value of a content of a fifth data point and a digest value of a content of a data point that belongs to another data-point group.
Dependent claims: 2, 3, 4, 5.
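The size-based grouping in claim 1 can be sketched as follows. This is an illustrative reading only: the function names, the dictionary-based representation of data points, and the use of SHA-256 are assumptions, not details given in the claim. Data points are first classified by content size, and digests are then compared only among points in the same group, never across groups.

```python
import hashlib
from collections import defaultdict


def group_by_size(contents: dict[str, bytes]) -> dict[int, list[str]]:
    """Classify data-point names into groups keyed by content size in bytes."""
    groups: dict[int, list[str]] = defaultdict(list)
    for name, content in contents.items():
        groups[len(content)].append(name)
    return dict(groups)


def deduplicate_within_groups(
    contents: dict[str, bytes],
) -> tuple[dict[str, bytes], dict[str, bytes]]:
    """Compare digests only inside each size group.

    Returns (kept, mapping): the contents that are retained, and a mapping
    from each duplicated digest to the retained content.
    """
    kept: dict[str, bytes] = {}
    mapping: dict[str, bytes] = {}
    for names in group_by_size(contents).values():
        seen: dict[str, str] = {}  # digest -> name of earliest point in this group
        for name in names:
            # SHA-256 is an assumption; the claim only says "digest value".
            d = hashlib.sha256(contents[name]).hexdigest()
            if d in seen:
                # Duplicate of a preceding point in the same group:
                # its content is not kept, only the mapping is recorded.
                mapping[d] = contents[seen[d]]
            else:
                seen[d] = name
                kept[name] = contents[name]
    return kept, mapping


contents = {"a": b"abc", "b": b"abcd", "c": b"abc", "d": b"xyz"}
kept, mapping = deduplicate_within_groups(contents)
```

Here `a`, `c`, and `d` fall into the 3-byte group and `b` into the 4-byte group, so the digest of `b` is never compared with the others. Contents of different sizes cannot be identical, which is why restricting comparisons to a group loses no duplicates.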
6. A data-processing device comprising:
a calculation unit configured to calculate first digest values associated with first contents of a plurality of first data points, the plurality of first data points including a second data point and one or more third data points;
a comparison unit configured to compare a second digest value associated with a second content of the second data point with one or more third digest values associated with third contents of the one or more third data points, the one or more third data points preceding the second data point;
a deletion unit configured to, in response to the second digest value being the same as a fourth digest value associated with a fourth content of a fourth data point, delete the second content of the second data point, the fourth data point being within the one or more third data points; and
a mapping unit configured to establish a mapping between the second digest value and the fourth content,
wherein the device further comprises a classification unit configured to classify data points into a plurality of data-point groups according to different content sizes of the data points, the classified data points for each data-point group having a same content size, and the classification unit further configured to assign a plurality of fifth data points into a given data-point group, and
wherein the comparison unit is configured to compare between digest values of contents of the plurality of fifth data points that belong to the given data-point group, and not compare between a digest value of a content of a fifth data point and a digest value of a content of a data point that belongs to another data-point group.
Dependent claims: 7, 8, 9, 10, 11.
12. A non-transitory computer readable storage medium comprising programming instructions for data processing, the programming instructions configured to cause one or more data processors to execute operations comprising:
calculating first digest values associated with first contents of a plurality of first data points, the plurality of first data points including a second data point and one or more third data points;
comparing a second digest value associated with a second content of the second data point with one or more third digest values associated with third contents of the one or more third data points, the one or more third data points preceding the second data point;
in response to the second digest value being the same as a fourth digest value associated with a fourth content of a fourth data point, deleting the second content of the second data point, the fourth data point being within the one or more third data points; and
establishing a mapping between the second digest value and the fourth content,
wherein the programming instructions are further configured to cause the one or more data processors to:
classify data points into a plurality of data-point groups according to different content sizes of the data points, the classified data points for each data-point group having a same content size; and
assign a plurality of fifth data points into a given data-point group, and
wherein the comparing is performed between digest values of contents of the plurality of fifth data points that belong to the given data-point group, and not performed between a digest value of a content of a fifth data point and a digest value of a content of a data point that belongs to another data-point group.
Specification