METHODS AND SYSTEMS FOR QUICK AND EFFICIENT DATA MANAGEMENT AND/OR PROCESSING
Abstract
Systems and methods are provided for data management and data processing. For example, various embodiments may include systems and methods in which relatively larger groups of data are selected while achieving comparable or better selection results (e.g., high data redundancy elimination and/or large average chunk size). In various embodiments, the systems and methods may include, for example, a data group, block, or chunk combining technique and/or a data group, block, or chunk splitting technique. Various embodiments may combine a first standard or typical data grouping, blocking, or chunking technique with such a combining and/or splitting technique. Exemplary systems and methods may relate to data hashing and/or data elimination. Embodiments may include a look-ahead buffer and determine whether to emit small chunks or large chunks based on characteristics of the underlying data and/or the particular application (e.g., backup).
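The content-defined chunking the abstract alludes to can be sketched as follows. The rolling hash, window size, and mask below are illustrative assumptions, not the patent's actual procedure: a boundary is declared wherever a running hash of recent input matches a bit mask, so the mask width sets the expected average chunk size (a wider mask makes matches rarer and chunks larger).

```python
# Hypothetical sketch of content-defined chunking. A toy polynomial
# hash is accumulated over each chunk; a chunk boundary is emitted
# when the low bits selected by `mask` are all zero and a minimum
# window of bytes has been consumed.

WINDOW = 16      # minimum bytes before a boundary may be declared
PRIME = 1000003  # multiplier for the toy polynomial hash

def rolling_chunks(data: bytes, mask: int):
    """Yield (start, end) offsets of content-defined chunks."""
    start, h = 0, 0
    for i, byte in enumerate(data):
        h = (h * PRIME + byte) & 0xFFFFFFFF
        if i - start + 1 >= WINDOW and (h & mask) == 0:
            yield (start, i + 1)
            start, h = i + 1, 0   # next chunk starts fresh
    if start < len(data):
        yield (start, len(data))  # trailing partial chunk
```

Because boundaries depend only on the bytes themselves, an insertion early in the stream shifts boundaries only locally; later chunks realign, which is what makes this style of chunking attractive for duplicate elimination.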
35 Claims
1. A method of data management, comprising:
breaking a data stream into a plurality of data groups using a combination of a first data segmentation procedure and a second data segmentation procedure, wherein the expected average data group sizes of the first data segmentation procedure and the second data segmentation procedure are different. (Dependent claims 2–23)
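One hypothetical way to realize claim 1's combination of two segmentation procedures with different expected average group sizes: run the same running hash against a narrow mask and a wide mask. Since the wide mask's bits contain the narrow mask's, every coarse boundary is also a fine boundary, yielding two nested segmentations of one stream. The hash, masks, and function names are illustrative assumptions.

```python
PRIME = 1000003  # multiplier for the toy running hash

def boundaries(data: bytes, mask: int):
    """Offsets where the running hash of the prefix matches `mask`,
    always ending with len(data)."""
    cuts, h = [], 0
    for i, byte in enumerate(data):
        h = (h * PRIME + byte) & 0xFFFFFFFF
        if (h & mask) == 0:
            cuts.append(i + 1)
    if not cuts or cuts[-1] != len(data):
        cuts.append(len(data))
    return cuts

def combined_segmentation(data: bytes, small_mask=0x3F, big_mask=0x3FF):
    """First procedure: expected ~64-byte groups (6 mask bits).
    Second procedure: expected ~1024-byte groups (10 mask bits)."""
    return boundaries(data, small_mask), boundaries(data, big_mask)
```

A consumer could then segment most of the stream at the coarse boundaries and fall back to the fine boundaries where smaller groups pay off, e.g., near previously seen data.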
24. A method of data management, comprising:
applying a first content-defined data chunking procedure to obtain one or more initial chunking points; and
applying a second content-defined data chunking procedure, based on a predetermined set of criteria, so as to modify the initial chunking points to different chunking points, thereby increasing the average size of data chunks and the average amount of duplicate data identified. (Dependent claims 25, 26)
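One hypothetical reading of claim 24's second procedure: given the initial chunking points, drop interior points until every chunk meets a minimum size, which raises the average chunk size. A real system would also consult duplicate-detection results; the criterion here is size only, as a simplifying assumption.

```python
# Second-pass sketch: `cuts` holds ascending end-offsets of chunks,
# the last entry being the stream length. Interior cut points are
# kept only when the chunk they close is at least `min_size` bytes;
# dropped points merge neighboring chunks into larger ones.

def enlarge_chunks(cuts, min_size):
    out, start = [], 0
    for cut in cuts[:-1]:
        if cut - start >= min_size:
            out.append(cut)
            start = cut
    out.append(cuts[-1])   # the final point is always retained
    return out
```

For example, `enlarge_chunks([5, 8, 20, 22, 30], 10)` keeps only the points 20 and 30, merging five small chunks into two larger ones.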
27. A method of content-defined chunking, comprising the steps of:
amalgamating small chunks into large chunks within long stretches of data that have been determined to be non-duplicate;
bordering the edges of those non-duplicate stretches that are adjacent to regions determined to be duplicate data with small chunks, by not amalgamating the small chunks found near the edges; and
re-emitting large chunks that are found to be duplicate data.
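Claim 27's policy can be sketched as below. Small chunks are flagged duplicate or non-duplicate in advance (in practice by a fingerprint-index lookup; here the flags are supplied directly as an assumption). Runs of non-duplicate chunks are merged into one large chunk, except that `edge` small chunks bordering a duplicate region stay small, so a later pass can resynchronize at the border; duplicate chunks are re-emitted unchanged.

```python
def amalgamate(chunks, is_dup, edge=1):
    """chunks: list of bytes; is_dup: parallel list of booleans."""
    out, i, n = [], 0, len(chunks)
    while i < n:
        if is_dup[i]:
            out.append(chunks[i])          # duplicate re-emitted as-is
            i += 1
            continue
        j = i                              # find the non-duplicate run
        while j < n and not is_dup[j]:
            j += 1
        run = chunks[i:j]
        lo = edge if i > 0 else 0          # duplicate region on the left?
        hi = len(run) - (edge if j < n else 0)  # ... on the right?
        if lo < hi:
            out.extend(run[:lo])           # small chunks kept at left edge
            out.append(b"".join(run[lo:hi]))  # amalgamated large chunk
            out.extend(run[hi:])           # small chunks kept at right edge
        else:
            out.extend(run)                # run too short to merge
        i = j
    return out
```

With no duplicate neighbors, a whole run merges into one large chunk; when the run borders duplicate data on both sides, one small chunk survives at each edge.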
28. A system of data management, comprising:
a data identification system; and
a data manipulation system, wherein the data manipulation system, based on a predetermined set of criteria, selectively modifies one or more initial data break points so as to increase the average size of data groups. (Dependent claims 29, 30, 31)
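A minimal composition in the spirit of claim 28's two components, under assumed names: a data identification system (here just a set of previously seen chunks) feeds a data manipulation system that merges runs of new chunks into larger groups, capped at `max_group` chunks per group.

```python
class DataIdentification:
    """Identification component: records chunks and reports duplicates."""
    def __init__(self):
        self.seen = set()
    def is_duplicate(self, chunk: bytes) -> bool:
        dup = chunk in self.seen
        self.seen.add(chunk)
        return dup

class DataManipulation:
    """Manipulation component: merges runs of new chunks into groups."""
    def __init__(self, ident, max_group=4):
        self.ident, self.max_group = ident, max_group
    def regroup(self, chunks):
        out, run = [], []
        for c in chunks:
            if self.ident.is_duplicate(c):
                if run:
                    out.append(b"".join(run)); run = []
                out.append(c)              # duplicate emitted unchanged
            else:
                run.append(c)              # accumulate new data
                if len(run) == self.max_group:
                    out.append(b"".join(run)); run = []
        if run:
            out.append(b"".join(run))
        return out
```

Dropping a break point between two new chunks is exactly the "selective modification" the claim describes; here the predetermined criterion is simply whether the data between the points is new.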
32. A system of data management, comprising:
means for performing data identification; and
means for manipulating data, wherein the means for manipulating data, based on a predetermined set of criteria, selectively modifies one or more initial data break points so as to increase the average size of data groups. (Dependent claims 33, 34, 35)