Efficient data storage system
Abstract
A system and method are disclosed for providing efficient data storage. A data stream comprising a plurality of data segments is received. The system determines whether one of the plurality of data segments has been stored previously using a summary in a low latency memory and, in the event that the data segment is determined not to have been stored previously, assigns an identifier to the data segment.
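The abstract does not say how the summary is built; claims 3 and 4 describe it only as a space efficient, probabilistic summary of segment information. A Bloom filter is one well-known structure of that kind, and the sketch below uses it purely as an assumption. The class name `BloomSummary`, the bit-array size, the number of hash functions, and the use of SHA-256 are all hypothetical choices made for illustration.

```python
import hashlib


class BloomSummary:
    """Hypothetical Bloom-filter stand-in for the summary held in low latency memory."""

    def __init__(self, num_bits=1024, num_hashes=3):
        self.num_bits = num_bits
        self.num_hashes = num_hashes
        self.bits = bytearray((num_bits + 7) // 8)  # compact bit array

    def _positions(self, segment_info: bytes):
        # Derive num_hashes bit positions by salting SHA-256 with an index byte.
        for i in range(self.num_hashes):
            digest = hashlib.sha256(bytes([i]) + segment_info).digest()
            yield int.from_bytes(digest[:8], "big") % self.num_bits

    def add(self, segment_info: bytes) -> None:
        # Record a stored segment by setting each of its bit positions.
        for pos in self._positions(segment_info):
            self.bits[pos // 8] |= 1 << (pos % 8)

    def might_contain(self, segment_info: bytes) -> bool:
        # False means "definitely not stored"; True means "possibly stored".
        return all(
            self.bits[pos // 8] & (1 << (pos % 8))
            for pos in self._positions(segment_info)
        )


summary = BloomSummary()
summary.add(b"segment-1")
```

With a structure like this, a negative answer is definitive, so a new segment can be given an identifier without any further lookup. A positive answer may be a false positive, which is why it would still need to be confirmed against an authoritative record.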
4 Claims
1. A method for storing data comprising:
receiving a data stream comprising a plurality of data segments; and
determining whether one of the plurality of data segments has been stored previously using a summary in a low latency memory;
in the event that the data segment is determined not to have been stored previously, assigning an identifier to the data segment; and
confirming whether the data segment has been stored previously, wherein confirming whether the data segment has been stored previously includes:
checking a cache; and
in the event that checking a cache results in a cache miss, confirming whether the data segment has been stored previously further includes checking in a segment database.

2. A method for storing data comprising:
receiving a data stream comprising a plurality of data segments; and
determining whether one of the plurality of data segments has been stored previously using a summary in a low latency memory;
in the event that the data segment is determined not to have been stored previously, assigning an identifier to the data segment;
confirming whether the data segment has been stored previously in the event the data segment is determined to have been stored;
wherein confirming whether the data segment has been stored previously includes checking a cache; and
in the event that checking a cache results in a cache miss, confirming whether the data segment has been stored previously further includes checking in a segment database;
wherein the segment database is stored in relatively high latency memory.

3. A data storage device comprising:
an input interface adapted to receive a data stream comprising a plurality of data segments; and
a segment redundancy check engine configured to:
determine whether one of the plurality of data segments has been stored previously using a summary in a low latency memory;
in the event that the data segment is determined not to have been stored previously, to assign an identifier to the data segment;
generate segment information for each of the plurality of data segments;
wherein the summary is a space efficient, probabilistic summary of segment information; and
confirm whether the data segment has been stored previously, wherein confirming whether the data segment has been stored previously includes:
checking a cache; and
in the event that checking a cache results in a cache miss, confirming whether the data segment has been stored previously further includes checking in a segment database.

4. A data storage device comprising:
an input interface adapted to receive a data stream comprising a plurality of data segments; and
a segment redundancy check engine configured to:
determine whether one of the plurality of data segments has been stored previously using a summary in a low latency memory;
in the event that the data segment is determined not to have been stored previously, to assign an identifier to the data segment;
generate segment information for each of the plurality of data segments;
wherein the summary is a space efficient, probabilistic summary of segment information; and
confirm whether the data segment has been stored previously in the event the data segment is determined to have been stored, wherein confirming whether the data segment has been stored previously includes:
checking a cache; and
in the event that checking a cache results in a cache miss, confirming whether the data segment has been stored previously further includes checking in a segment database, wherein the segment database is stored in relatively high latency memory.
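
The order of checks recited in the claims (summary first, then cache, then segment database) can be sketched as follows. This is a minimal illustration under stated assumptions, not the patented implementation: the class `SegmentStore`, the SHA-256 fingerprint used as segment information, the integer identifiers, and the choice to fill the cache only after a database lookup are all hypothetical. A plain Python set stands in for the summary, so unlike a probabilistic summary it never reports a false positive; the fallback branch marks where one would be handled.

```python
import hashlib


class SegmentStore:
    """Hypothetical sketch of the summary -> cache -> segment database lookup order."""

    def __init__(self):
        self.summary = set()   # stand-in for the summary in low latency memory
        self.cache = {}        # fingerprint -> identifier, filled on database hits
        self.segment_db = {}   # stand-in for the relatively high latency segment database
        self.db_lookups = 0    # counts how often the database had to be consulted

    def _assign(self, fp):
        # Assign the next identifier and record the segment as stored.
        ident = len(self.segment_db)
        self.segment_db[fp] = ident
        self.summary.add(fp)
        return ident

    def store(self, segment: bytes):
        """Return (identifier, is_new) for one data segment."""
        fp = hashlib.sha256(segment).hexdigest()
        if fp not in self.summary:
            # Summary says "not stored previously": assign an identifier.
            return self._assign(fp), True
        if fp in self.cache:
            # Summary says "possibly stored" and the cache confirms it.
            return self.cache[fp], False
        # Cache miss: fall back to the segment database.
        self.db_lookups += 1
        ident = self.segment_db.get(fp)
        if ident is None:
            # Would be reached on a false positive from a probabilistic summary.
            return self._assign(fp), True
        self.cache[fp] = ident
        return ident, False


store = SegmentStore()
results = [store.store(s) for s in (b"alpha", b"beta", b"alpha", b"alpha")]
```

In this run the first repeat of `b"alpha"` misses the cache and consults the segment database once; the second repeat is answered from the cache, so `store.db_lookups` stays at 1.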
Specification