Data processing apparatus and method of processing data
Abstract
Data processing apparatus comprising: a chunk store containing specimen data chunks, a manifest store containing a plurality of manifests, each of which represents at least a part of a data set and each of which comprises at least one reference to at least one of said specimen data chunks, a sparse chunk index containing information on only some specimen data chunks, the processor being operable to: process input data into input data chunks; identify manifests having at least one reference to one of said specimen data chunks that corresponds to one of said input data chunks and on which there is information contained in the sparse chunk index; and prioritize the identified manifests for subsequent operation.
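The arrangement summarized in the abstract can be illustrated with a minimal Python sketch. It is not the patented implementation. Several details are assumptions made only for illustration: fixed-size chunking (`CHUNK_SIZE`), SHA-1 chunk identifiers, the sampling rule `default_sampler` that decides which chunks the sparse index covers, and the `Deduplicator` class and its method names.

```python
import hashlib
from collections import defaultdict

CHUNK_SIZE = 4  # hypothetical fixed chunk size, kept tiny for demonstration


def chunk_id(chunk: bytes) -> str:
    """Identify a chunk by its SHA-1 digest (an assumed choice of hash)."""
    return hashlib.sha1(chunk).hexdigest()


def split_into_chunks(data: bytes, size: int = CHUNK_SIZE) -> list:
    """Process input data into input data chunks (fixed-size, by assumption)."""
    return [data[i:i + size] for i in range(0, len(data), size)]


def default_sampler(cid: str) -> bool:
    """Assumed sampling rule: index only ids whose last hex digit is even."""
    return int(cid[-1], 16) % 2 == 0


class Deduplicator:
    def __init__(self, sampler=default_sampler):
        self.sampler = sampler
        self.chunk_store = {}                  # chunk id -> specimen data chunk
        self.manifest_store = {}               # manifest name -> list of chunk ids
        self.sparse_index = defaultdict(list)  # sampled chunk id -> manifest names

    def add_data_set(self, name: str, data: bytes) -> None:
        """Store a data set as a manifest of references to specimen chunks."""
        ids = []
        for chunk in split_into_chunks(data):
            cid = chunk_id(chunk)
            self.chunk_store.setdefault(cid, chunk)  # keep one specimen per id
            ids.append(cid)
            # Only sampled chunks get an entry, so the index stays sparse.
            if self.sampler(cid) and name not in self.sparse_index[cid]:
                self.sparse_index[cid].append(name)
        self.manifest_store[name] = ids

    def identify_manifests(self, data: bytes) -> set:
        """Return manifests that the sparse index links to any input chunk."""
        found = set()
        for chunk in split_into_chunks(data):
            cid = chunk_id(chunk)
            if cid in self.sparse_index:
                found.update(self.sparse_index[cid])
        return found
```

With a sampler that indexes every chunk, `identify_manifests` returns every manifest that shares a chunk with the input. With a sparser sampler it returns only those reachable through indexed chunks, which is the trade-off the sparse chunk index makes.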
23 Claims
1. A data processing apparatus comprising:
  a chunk store containing specimen data chunks,
  a manifest store containing a plurality of manifests, each of which represents at least a part of a data set and each of which comprises at least one reference to at least one of said specimen data chunks,
  a sparse chunk index containing information on only some specimen data chunks,
  at least one processor to:
    process input data into input data chunks;
    identify manifests having at least one reference to one of said specimen data chunks that corresponds to one of said input data chunks and on which there is information contained in the sparse chunk index, wherein for a particular specimen data chunk indexed by the sparse chunk index, the sparse chunk index contains information on the most recent predetermined number of manifests added to the manifest store that comprise a reference to said particular specimen data chunk, and wherein the sparse chunk index further contains information for manifests that reference other specimen data chunks; and
    prioritize the identified manifests for subsequent operation.
Dependent claims: 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 18
12. A data processing apparatus comprising:
  a chunk store containing specimen data chunks;
  a manifest store containing a plurality of manifests, each of which represents at least a part of a data set and each of which comprises at least one reference to at least one of said specimen data chunks;
  a sparse chunk index containing information on only some specimen data chunks;
  at least one processor to:
    identify manifests in the sparse chunk index, wherein each of the identified manifests includes at least one reference to a specimen data chunk in a chunk store corresponding to at least one input data chunk in an input data set, wherein for a particular specimen data chunk indexed by the sparse chunk index, the sparse chunk index contains information on the most recent predetermined number of manifests added to the manifest store that comprise a reference to said particular specimen data chunk, and wherein the sparse chunk index further contains information for manifests that reference other specimen data chunks;
    prioritize the identified manifests for subsequent operation according to a number of said corresponding specimen data chunks referenced by each of the identified manifests.
Dependent claims: 13, 19, 20
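The prioritization step recited in claim 12 can be sketched as a ranking by match count. This is an illustrative reading, not the claimed implementation. The function name `prioritize_by_matches`, the dictionary-shaped index, and the alphabetical tie-break are assumptions. The sketch counts only matches visible through the sparse index, so the counts may undercount the chunks a manifest actually shares with the input.

```python
from collections import Counter


def prioritize_by_matches(input_chunk_ids, sparse_index):
    """Rank manifests by how many distinct input chunks the index links to them.

    input_chunk_ids: iterable of chunk ids derived from the input data set.
    sparse_index:    dict mapping a chunk id to an iterable of manifest ids
                     (hypothetical shape, chosen for this sketch).
    """
    counts = Counter()
    for cid in set(input_chunk_ids):            # count each input chunk once
        for manifest in set(sparse_index.get(cid, ())):
            counts[manifest] += 1
    # Highest count first; ties broken by name (an arbitrary, assumed rule).
    return sorted(counts, key=lambda m: (-counts[m], m))


index = {"c1": ["m1", "m2"], "c2": ["m2"], "c3": ["m3"]}
order = prioritize_by_matches(["c1", "c2", "c9", "c1"], index)
print(order)  # ['m2', 'm1']
```

Here `m2` is tried first because two indexed input chunks point to it, while `m3` is not identified at all because none of its indexed chunks occur in the input.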
14. A data processing apparatus, comprising:
  a chunk store containing specimen data chunks;
  a manifest store containing manifests, each of which represents at least a part of a data set and comprise at least one reference to at least one of said specimen data chunks;
  a sparse chunk index containing information on only some specimen data chunks; and
  at least one processor to:
    process input data into input data chunks;
    identify manifests each having at least one reference to one of said specimen data chunks that corresponds to one of said input data chunks and on which there is information contained in the sparse chunk index, wherein for a particular specimen data chunk indexed by the sparse chunk index, the sparse chunk index contains information on the most recent predetermined number of manifests added to the manifest store that comprise a reference to said particular specimen data chunk, and wherein the sparse chunk index further contains information for manifests that reference other specimen data chunks;
    prioritize the identified manifests for subsequent operation according to a prioritization criterion;
    in the subsequent operation, use the prioritized identified manifests in an order according to the prioritizing; and
    subsequently manage the information contained in the sparse chunk index.
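One way to picture an index that holds only "the most recent predetermined number of manifests" per chunk is a bounded queue per entry. The sketch below is an assumption-laden illustration, not the method the patent discloses. The class `SparseChunkIndex`, its methods, and the limit `MAX_MANIFESTS_PER_CHUNK` are hypothetical names, and evicting the oldest manifest is only one possible management policy.

```python
from collections import deque

MAX_MANIFESTS_PER_CHUNK = 2  # hypothetical "predetermined number"


class SparseChunkIndex:
    def __init__(self, max_manifests: int = MAX_MANIFESTS_PER_CHUNK):
        self.max_manifests = max_manifests
        self.entries = {}  # chunk id -> deque of manifest ids, oldest first

    def record(self, chunk_id: str, manifest_id: str) -> None:
        """Note that a newly added manifest references an indexed chunk."""
        if chunk_id not in self.entries:
            # A deque with maxlen silently drops the oldest item when full.
            self.entries[chunk_id] = deque(maxlen=self.max_manifests)
        recent = self.entries[chunk_id]
        if manifest_id in recent:
            recent.remove(manifest_id)  # re-adding moves it to the newest slot
        recent.append(manifest_id)

    def manifests_for(self, chunk_id: str) -> list:
        """Return the retained manifests for a chunk, newest first."""
        return list(reversed(self.entries.get(chunk_id, ())))


idx = SparseChunkIndex(max_manifests=2)
for manifest in ("m1", "m2", "m3"):
    idx.record("c1", manifest)
print(idx.manifests_for("c1"))  # ['m3', 'm2']
```

Bounding each entry keeps the index size proportional to the number of sampled chunks rather than to the number of manifests stored, at the cost of forgetting older manifests that reference the same chunk.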
15. A method of processing data, comprising:
  providing a chunk store containing specimen data chunks;
  providing a manifest store containing manifests representing at least a part of a data set, each of the manifests comprising at least one reference to at least one of said specimen data chunks; and
  providing a sparse chunk index containing information on only some specimen data chunks, wherein for a particular specimen data chunk indexed by the sparse chunk index, the sparse chunk index contains information on the most recent predetermined number of manifests added to the manifest store that comprise a reference to said particular specimen data chunk, and wherein the sparse chunk index further contains information for manifests that reference other specimen data chunks;
  processing input data into input data chunks;
  identifying manifests having at least one reference to one of said specimen data chunks that corresponds to one of said input data chunks and on which there is information contained in the sparse chunk index; and
  prioritizing the identified manifests according to a prioritization criterion.
Dependent claims: 16, 21, 22
17. A method of data processing, comprising:
  providing a chunk store containing specimen data chunks;
  providing a manifest store containing manifests representing at least a part of a data set, each of the manifests comprising at least one reference to at least one of said specimen data chunks; and
  providing a sparse chunk index containing information on only some specimen data chunks;
  identifying manifests in the sparse chunk index, wherein each of the identified manifests includes at least one reference to a specimen data chunk in a chunk store corresponding to at least one input data chunk in an input data set, wherein for a particular specimen data chunk indexed by the sparse chunk index, the sparse chunk index contains information on the most recent predetermined number of manifests added to the manifest store that comprise a reference to said particular specimen data chunk, and wherein the sparse chunk index further contains information for manifests that reference other specimen data chunks; and
  prioritizing the identified manifests for subsequent operation according to a number of corresponding specimen data chunks referenced by each of the identified manifests.
Dependent claims: 23
Specification