De-duplication in a virtualized storage environment
First Claim
1. A host system configured to write data to, and read data from, a portion of a pooled storage capacity, and the host system comprising:
a data de-duplication application operable to de-duplicate data in the portion of the pooled storage capacity, wherein a data de-duplication operation performed by the data de-duplication application comprises:
identifying first data in the pooled storage capacity that is identical to second data in the pooled storage capacity;
deleting the second data from the pooled storage capacity; and
replacing the second data with the pointer pointing to the identical first data; and
a virtualization agent that is operable to:
pool storage capacity by virtualizing a plurality of storage devices; and
present a representation of the portion of the pooled storage capacity to the data de-duplication application.
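The three steps of the de-duplication operation recited above (identify identical data, delete the second copy, replace it with a pointer to the first) can be illustrated with a minimal sketch. This is not the patented implementation: the fixed-size block model, the use of SHA-256 digests to find candidate matches, and the names `Pointer`, `deduplicate`, and `read_block` are all assumptions made for illustration.

```python
import hashlib
from dataclasses import dataclass


@dataclass(frozen=True)
class Pointer:
    """Hypothetical stand-in for a pointer to the retained (first) copy."""
    target: int  # address of the block that still holds the data


def deduplicate(blocks):
    """Return a new address -> content map with duplicates replaced by pointers.

    `blocks` maps a block address (int) to its bytes. Addresses are visited
    in ascending order, so the lowest address holding given data is kept.
    """
    first_seen = {}  # digest -> address of the first block with that digest
    result = {}
    for addr in sorted(blocks):
        data = blocks[addr]
        digest = hashlib.sha256(data).digest()
        kept = first_seen.get(digest)
        # Compare the bytes as well, so a digest collision is never
        # mistaken for identical data.
        if kept is not None and blocks[kept] == data:
            # The second copy is dropped and replaced with a pointer.
            result[addr] = Pointer(kept)
        else:
            first_seen.setdefault(digest, addr)
            result[addr] = data
    return result


def read_block(store, addr):
    """Read a block, following a pointer if the block was de-duplicated."""
    entry = store[addr]
    return store[entry.target] if isinstance(entry, Pointer) else entry
```

In this sketch a read of a de-duplicated address is resolved through the pointer, so the host sees the same data before and after the operation while only one instance is stored.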
Abstract
A data de-duplication application de-duplicates redundant data in the pooled storage capacity of a virtualized storage environment. The virtualized storage environment includes a plurality of storage devices and a virtualization or abstraction layer that aggregates all or a portion of the storage capacity of each storage device into a single pool of storage capacity, all or portions of which can be allocated to one or more host systems. For each host system, the virtualization layer presents a representation of at least a portion of the pooled storage capacity wherein the corresponding host system can read and write data. The data de-duplication application identifies redundant data in the pooled storage capacity and replaces it with one or more pointers pointing to a single instance of the data. The de-duplication application can operate on fixed or variable size blocks of data and can de-duplicate data either post-process or in-line.
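The pooling described in the abstract, where a virtualization layer aggregates the capacity of several storage devices and presents a portion of the pool to each host, can be sketched roughly as follows. The class names `Device`, `VirtualizationLayer`, and `Volume`, the block-granular mapping, and the simple first-fit allocation are hypothetical choices for illustration and are not taken from the patent.

```python
from dataclasses import dataclass


@dataclass
class Device:
    """Hypothetical physical storage device with a fixed number of blocks."""
    name: str
    blocks: list  # backing storage, one entry per block


class Volume:
    """Representation of a portion of the pool, as seen by one host."""

    def __init__(self, extent):
        # extent[i] is the (device, device block index) behind logical block i
        self._extent = extent

    def __len__(self):
        return len(self._extent)

    def write(self, lba, data):
        dev, idx = self._extent[lba]
        dev.blocks[idx] = data

    def read(self, lba):
        dev, idx = self._extent[lba]
        return dev.blocks[idx]


class VirtualizationLayer:
    """Aggregates device capacity into one pool and hands out portions."""

    def __init__(self, devices):
        # Pooled block number -> (device, block index on that device)
        self.pool = [(d, i) for d in devices for i in range(len(d.blocks))]
        self.next_free = 0

    def allocate(self, n_blocks):
        """Allocate the next n_blocks of pooled capacity to a host."""
        if self.next_free + n_blocks > len(self.pool):
            raise ValueError("not enough pooled capacity")
        extent = self.pool[self.next_free:self.next_free + n_blocks]
        self.next_free += n_blocks
        return Volume(extent)
```

A host that receives a `Volume` addresses it by logical block number only. Under these assumptions, a de-duplication application running on that host would operate on the presented representation without knowing which device holds a given block.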
12 Citations
17 Claims
1. A host system configured to write data to, and read data from, a portion of a pooled storage capacity, and the host system comprising:
a data de-duplication application operable to de-duplicate data in the portion of the pooled storage capacity, wherein a data de-duplication operation performed by the data de-duplication application comprises:
identifying first data in the pooled storage capacity that is identical to second data in the pooled storage capacity;
deleting the second data from the pooled storage capacity; and
replacing the second data with the pointer pointing to the identical first data; and
a virtualization agent that is operable to:
pool storage capacity by virtualizing a plurality of storage devices; and
present a representation of the portion of the pooled storage capacity to the data de-duplication application.
Dependent claims: 2, 3, 4, 5, 6, 7
8. A computer architecture, comprising:
a host system configured to write data to, and read data from, a portion of a pooled storage capacity, and the host system comprising:
a data de-duplication application operable to de-duplicate data in the portion of the pooled storage capacity, wherein a data de-duplication operation performed by the data de-duplication application comprises:
identifying first data in the pooled storage capacity that is identical to second data in the pooled storage capacity;
deleting the second data from the pooled storage capacity; and
replacing the second data with the pointer pointing to the identical first data;
a virtualization layer that is operable to:
pool storage capacity by virtualizing a plurality of storage devices; and
present a representation of the portion of the pooled storage capacity to the data de-duplication application; and
one of:
a storage platform configured to communicate with a plurality of host systems including the host system, wherein the storage platform provides the virtualization layer; or
a switch configured to communicate with a plurality of host systems including the host system, wherein the switch provides the virtualization layer.
Dependent claims: 9, 10, 11, 12, 13, 14
15. A computer architecture, comprising:
a host system configured to write data to, and read data from, a portion of a pooled storage capacity, and the host system comprising:
a data de-duplication application operable to de-duplicate data in the portion of the pooled storage capacity, wherein a data de-duplication performed by operation of the data de-duplication application comprises:
identifying first data in the pooled storage capacity that is identical to second data in the pooled storage capacity;
deleting the second data from the pooled storage capacity; and
replacing the second data with the pointer pointing to the identical first data;
a thin virtualization driver that is operable to:
pool storage capacity by virtualizing a plurality of storage devices; and
present a representation of the portion of the pooled storage capacity to the data de-duplication application; and
an out-of-band appliance operable to provide storage mappings to the thin virtualization driver.
Dependent claims: 16, 17
Specification