Efficient deduplicated data storage with tiered indexing
Abstract
A deduplicated data storage system provides high performance storage to heterogeneous clients that connect to it via a communications network. The deduplicated data storage system provides fast access to deduplication data by caching the most frequently accessed deduplication data in a hyperindex. Updates to the non-cached deduplication data are serialized by use of a store queue and hold queue.
23 Claims
1. A computer-implemented method comprising:
- accessing, at a server, a dedupe entry in a dedupe database, the dedupe database stored in a first storage, the dedupe entry comprising a reference count and a first checksum, the first checksum computed from a block data entry;
- determining if the dedupe entry satisfies an indexing condition, the indexing condition comprising a comparison of the reference count against a watermark cutoff counter;
- responsive to the dedupe entry satisfying the indexing condition, creating a dedupe index entry, the dedupe index entry comprising a copy of the first checksum, and storing the dedupe index entry in a hyperindex, the hyperindex stored in a second storage;
- receiving a request from a user client to store user data, the request comprising a second checksum computed from at least a portion of the user data, the second checksum equal to the first checksum; and
- responsive to receiving the request, locating the dedupe index entry by matching the second checksum to the copy of the first checksum in the dedupe index entry.
Dependent claims: 2, 3, 4, 5, 6, 7, 8, 9, 10
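The steps of claim 1 can be illustrated with a minimal Python sketch. It is not the patented implementation: the names `DedupeEntry`, `checksum_of`, `build_hyperindex`, and `locate` are hypothetical, SHA-256 stands in for the unspecified checksum, in-memory dictionaries stand in for the first and second storage, and the indexing condition is assumed to be "reference count at or above the watermark" (the claim only says the two values are compared).

```python
import hashlib
from dataclasses import dataclass


@dataclass
class DedupeEntry:
    checksum: str   # first checksum, computed from the block data entry
    ref_count: int  # reference count for the block


def checksum_of(block: bytes) -> str:
    # Assumption: SHA-256; the claim does not name a checksum algorithm.
    return hashlib.sha256(block).hexdigest()


def build_hyperindex(dedupe_db: dict, watermark: int) -> dict:
    """Create a dedupe index entry for each entry meeting the indexing condition."""
    hyperindex = {}
    for entry in dedupe_db.values():
        # Assumed direction of the comparison: at or above the watermark.
        if entry.ref_count >= watermark:
            # The index entry holds a copy of the first checksum.
            hyperindex[entry.checksum] = {"checksum": entry.checksum}
    return hyperindex


def locate(hyperindex: dict, request_checksum: str):
    """Match the request's checksum against the copies held in the hyperindex."""
    return hyperindex.get(request_checksum)


# Usage: one frequently referenced block and one rarely referenced block.
hot = checksum_of(b"popular block")
cold = checksum_of(b"rare block")
dedupe_db = {hot: DedupeEntry(hot, 12), cold: DedupeEntry(cold, 1)}
hyperindex = build_hyperindex(dedupe_db, watermark=10)

hit = locate(hyperindex, checksum_of(b"popular block"))
miss = locate(hyperindex, cold)
```

Here `hit` is the index entry for the popular block, while `miss` is `None` because the rare block's reference count falls below the assumed watermark and was never copied into the hyperindex.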
11. A computer for data storage, the computer comprising:
- a non-transitory computer-readable storage medium storing executable computer program instructions for:
  - accessing a dedupe entry in a dedupe database, the dedupe database stored in a first storage, the dedupe entry comprising a reference count and a first checksum, the first checksum computed from a block data entry;
  - determining if the dedupe entry satisfies an indexing condition, the indexing condition comprising a comparison of the reference count against a watermark cutoff counter;
  - responsive to the dedupe entry satisfying the indexing condition, creating a dedupe index entry, the dedupe index entry comprising a copy of the first checksum, and storing the dedupe index entry in a hyperindex, the hyperindex stored in a second storage;
  - receiving a request from a user client to store user data, the request comprising a second checksum computed from at least a portion of the user data, the second checksum equal to the first checksum; and
  - responsive to receiving the request, locating the dedupe index entry by matching the second checksum to the copy of the first checksum in the dedupe index entry; and
- a processor for executing the computer program instructions.
Dependent claims: 12, 13, 14, 15, 16, 17, 18, 19, 20
21. A computer-implemented method comprising:
- receiving, at a server, a request from a client device to store data, the request comprising a request checksum computed from the data;
- accessing a hyperindex stored in a first storage, the hyperindex having a plurality of dedupe index entries and each dedupe index entry comprising a stored checksum;
- searching the hyperindex for a matching dedupe index entry with a stored checksum equal to the request checksum; and
- responsive to not finding a matching dedupe index entry in the hyperindex:
  - accessing a dedupe database stored in a second storage, the dedupe database comprising a plurality of dedupe entries, and each dedupe entry comprising a stored checksum;
  - searching the dedupe database for a matching dedupe entry with a stored checksum equal to the request checksum; and
  - responsive to not finding a matching dedupe entry in the dedupe database, storing the data in a block data store and adding a store queue entry to a store queue, the store queue entry comprising the request checksum.
Dependent claims: 22, 23
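The two-tier lookup recited in claim 21 can be sketched as follows. This is an illustration under stated assumptions, not the patented implementation: `handle_store_request` and the returned status strings are hypothetical, the hyperindex and dedupe database are modeled as in-memory sets of checksums, the block data store is a dictionary keyed by checksum, and the store queue is a `collections.deque`. How the queue is later drained into the dedupe database is outside the claim and is not shown.

```python
import hashlib
from collections import deque


def handle_store_request(request_checksum, data, hyperindex, dedupe_db,
                         block_store, store_queue):
    """Search the hyperindex first, then the dedupe database, then store."""
    # Tier 1: the hyperindex (first storage).
    if request_checksum in hyperindex:
        return "hyperindex-hit"
    # Tier 2: the full dedupe database (second storage), only on a miss.
    if request_checksum in dedupe_db:
        return "dedupe-db-hit"
    # No match in either tier: store the data and queue the checksum.
    # Keying the block store by checksum is an assumption of this sketch.
    block_store[request_checksum] = data
    store_queue.append({"checksum": request_checksum})
    return "stored"


def checksum_of(data: bytes) -> str:
    # Assumption: SHA-256; the claim does not name a checksum algorithm.
    return hashlib.sha256(data).hexdigest()


# Usage: one checksum known to each tier, plus one new block.
hot, warm, new = (checksum_of(b) for b in (b"hot", b"warm", b"new"))
hyperindex = {hot}
dedupe_db = {hot, warm}
block_store = {}
store_queue = deque()

r_hot = handle_store_request(hot, b"hot", hyperindex, dedupe_db,
                             block_store, store_queue)
r_warm = handle_store_request(warm, b"warm", hyperindex, dedupe_db,
                              block_store, store_queue)
r_new = handle_store_request(new, b"new", hyperindex, dedupe_db,
                             block_store, store_queue)
```

Only the third request writes to the block data store and adds a store queue entry; the first two are resolved by a checksum match in the hyperindex and the dedupe database respectively.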
Specification