STORAGE APPARATUS WHICH ELIMINATES DUPLICATED DATA IN COOPERATION WITH HOST APPARATUS, STORAGE SYSTEM WITH THE STORAGE APPARATUS, AND DEDUPLICATION METHOD FOR THE SYSTEM
First Claim
1. A storage apparatus comprising:
a first storage unit configured to store block data items and block identifiers unique to the block data items such that the block data items and the block identifiers are associated with each other;
a second storage unit configured to store addresses of block data items and block identifiers unique to the block data items such that the addresses and the block identifiers are associated with each other;
a control module configured to process requests from a host apparatus, the host apparatus comprising a cache;
a block identifier generation module configured to generate a block identifier unique to a block data item specified by the control module; and
a comparison module configured to compare the block data item specified by the control module with block data items stored in the first storage unit, wherein:
the control module is configured to specify a first block data item for the comparison module when a write request to specify the writing of data into the storage apparatus has been generated at the host apparatus and when a first-type write request including the first block data item and a first address of the first block data item has been transmitted from the host apparatus to the storage apparatus because the first block data item has coincided with none of the block data items stored in the cache of the host apparatus, the data to be written into the storage unit being processed in units of block data items and including the first block data item;
the control module is further configured to (a1) cause the block identifier generation module to generate a first block identifier unique to the first block data item, (a2) store the first block identifier and the first block data item in the first storage unit such that the first block identifier and the first block data item are associated with each other, (a3) store the first address and the first block identifier in the second storage unit such that the first address and the first block identifier are associated with each other and (a4) transmit the first block identifier to the host apparatus in order to cause the host apparatus to store the first block identifier and the first block data item in the cache of the host apparatus such that the first block identifier and the first block data item are associated with each other and to store the first address and the first block identifier in the cache such that the first address and the first block identifier are associated with each other when the result of comparison by the comparison module based on the specification in the first block data item has shown that the first block data item has coincided with none of the block data items stored in the first storage unit; and
the control module is still further configured to store a second address of a second block data item and a second block identifier unique to the second block data item in the second storage unit such that the second address and the second block identifier are associated with each other when the host apparatus has transmitted a second-type write request including the second block identifier and the second address to the storage apparatus because the second block data item has coincided with any one of the block data items stored in the cache of the host apparatus, the second block data item being included in the data to be written.
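The storage-side behaviour recited in claim 1 can be sketched roughly as follows. This is only an illustrative model, not the patented implementation: the class and method names (`StorageApparatus`, `first_type_write`, `second_type_write`, `read`) are hypothetical, a sequential counter stands in for the block identifier generation module, and a reverse dictionary stands in for the comparison module. Claim 1 does not state what happens when the incoming block already exists in the first storage unit; reusing the existing identifier is an assumption made here.

```python
import itertools


class StorageApparatus:
    """Hypothetical model of the storage apparatus of claim 1."""

    def __init__(self):
        self.block_table = {}    # first storage unit: block identifier -> block data
        self.address_table = {}  # second storage unit: address -> block identifier
        self._id_by_data = {}    # reverse index standing in for the comparison module
        self._next_id = itertools.count(1)  # stand-in block identifier generator

    def first_type_write(self, address, data):
        """Handle a request carrying the block data (bytes) and its address.

        Returns the block identifier, which the host is expected to cache.
        """
        block_id = self._id_by_data.get(data)
        if block_id is None:
            # (a1) generate an identifier, (a2) store identifier and data.
            block_id = next(self._next_id)
            self.block_table[block_id] = data
            self._id_by_data[data] = block_id
        # Assumption: on a storage-side match the existing identifier is reused.
        # (a3) associate the address with the identifier.
        self.address_table[address] = block_id
        # (a4) the identifier is returned to the host.
        return block_id

    def second_type_write(self, address, block_id):
        """Handle a request carrying only a block identifier and an address."""
        if block_id not in self.block_table:
            raise KeyError(f"unknown block identifier: {block_id}")
        self.address_table[address] = block_id

    def read(self, address):
        """Helper for inspection only; reads are not part of claim 1."""
        return self.block_table[self.address_table[address]]
```

In this sketch a second-type write adds one entry to the address table and leaves the block table untouched, so a duplicated block is stored once however many addresses refer to it.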
Abstract
According to one embodiment, a storage apparatus includes a first storage unit, a second storage unit and a control module. The control module stores the address of a block data item and a block identifier unique to the block data item, included in a write request, in the second storage unit such that the address and the block identifier are associated with each other when a request to specify the writing of data including the block data item into the storage apparatus has been generated at a host apparatus and when the host apparatus has transmitted the write request because the data item has coincided with any one of the block data items stored in the cache of the host apparatus.
11 Claims
8. A storage system comprising:
a storage apparatus comprising a first storage unit, a second storage unit, a first control module, a block identifier generation module, and a first comparison module; and
a host apparatus comprising a first cache, a second cache, a second control module, and a second comparison module, wherein:
the first storage unit is configured to store block data items and block identifiers unique to the block data items such that the block data items and the block identifiers are associated with each other;
the second storage unit is configured to store addresses of block data items and block identifiers unique to the block data items such that the addresses of the block data items and the block identifiers are associated with each other;
the first control module is configured to process requests from the host apparatus;
the block identifier generation module is configured to generate a block identifier unique to a block data item specified by the first control module,
the first comparison module is configured to compare the block data item specified by the first control module with block data items stored in the first storage unit;
the first cache is configured to store block data items and block identifiers unique to the block data items such that the block data items and the block identifiers are associated with each other;
the second cache is configured to store addresses of block data items and block identifiers unique to the block data items such that the addresses of the block data items and the block identifiers are associated with each other,
the second control module is configured to process requests generated at the host apparatus, and
the second comparison module is configured to compare the block data item specified by the second control module with the block data items stored in the first cache,
wherein the second control module is configured to
(e1) specify for the second comparison module each of the block data items constituting data to be written into the storage apparatus when a write request for the writing has been generated at the host apparatus, the data to be written being processed in block data items;
(e2) transmit a first-type write request including a first block data item and a first address of the first block data item to the storage apparatus when the result of comparison by the second comparison module based on the specification by the second control module has shown that the first block data item has coincided with none of the block data items stored in the first cache, the first block data item being included in the data to be written;
(e3) transmit a second-type write request including a second block identifier, stored in the first cache such that the second block identifier is associated with any one of the block data items stored in the first cache, and a second address of the second block data item to the storage apparatus when the result of comparison by the second comparison module has shown that the second block data item has coincided with any one of the block data items stored in the first cache, the second block data item being included in the data to be written;
(e4) store a first block identifier unique to the first block data item and the first block data item in the first cache such that the first block identifier and the first block data item are associated with each other and further store the first address and the first block identifier in the second cache such that the first address and the first block identifier are associated with each other when the storage apparatus has returned the first block identifier as a completion response to the first-type write request; and
(e5) store the second address and the second block identifier in the second cache such that the second address and the second block identifier are associated with each other according to a completion response from the storage apparatus to the second-type write request, and
wherein the control module is configured to
(f1) specify the first block data item for the first comparison module when the first-type write request has been received by the storage apparatus;
(f2) cause the block identifier generation module to generate the first block identifier unique to the first block data item when the result of comparison by the first comparison module based on the specification by the second control module has shown that the first block data item has coincided with none of the block data items stored in the first storage unit;
(f3) store the first block identifier and the first block data item in the first storage unit such that the first block identifier and the first block data item are associated with each other;
(f4) store the first address and the first block identifier in the second storage unit such that the first address and the first block identifier are associated with each other;
(f5) transmit the first block identifier as a completion response to the first-type write request to the host apparatus after having stored the first block identifier and first block data item and the first address and first block identifier;
(f6) store the second address and the second block identifier in the second storage unit such that the second address and the second block identifier are associated with each other when the storage apparatus has received the second-type write request; and
(f7) transmit a completion response to the second-type write request to the host apparatus after having stored the second address and the second block identifier.
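The host-side steps (e1) to (e5) of claim 8 might look something like the sketch below. All names here are hypothetical (`HostApparatus`, `StubStorage`, `write`, the `block_size` parameter), the fixed block size of 4 bytes is chosen purely for illustration, and `StubStorage` is only a minimal stand-in for the storage apparatus so that the example runs on its own. Block addresses are assumed to be byte offsets from a base address, which the claim does not specify.

```python
class StubStorage:
    """Minimal stand-in for the storage apparatus (hypothetical)."""

    def __init__(self):
        self.blocks = {}     # first storage unit: block identifier -> block data
        self.addresses = {}  # second storage unit: address -> block identifier
        self.log = []        # record of received request types, for inspection

    def first_type_write(self, address, data):
        block_id = next((i for i, d in self.blocks.items() if d == data), None)
        if block_id is None:
            block_id = len(self.blocks) + 1  # stand-in identifier generator
            self.blocks[block_id] = data
        self.addresses[address] = block_id
        self.log.append(("first", address))
        return block_id  # completion response carrying the identifier

    def second_type_write(self, address, block_id):
        self.addresses[address] = block_id
        self.log.append(("second", address))
        return True  # completion response


class HostApparatus:
    """Hypothetical model of the host apparatus of claim 8."""

    def __init__(self, storage, block_size=4):
        self.storage = storage
        self.block_size = block_size
        self.first_cache = {}   # block identifier -> block data
        self.second_cache = {}  # address -> block identifier

    def _lookup(self, block):
        # Stand-in for the second comparison module: compare block contents.
        for block_id, cached in self.first_cache.items():
            if cached == block:
                return block_id
        return None

    def write(self, base_address, payload):
        # (e1) process the data to be written block by block.
        for offset in range(0, len(payload), self.block_size):
            block = payload[offset:offset + self.block_size]
            address = base_address + offset
            block_id = self._lookup(block)
            if block_id is None:
                # (e2) cache miss: send the block data and its address.
                block_id = self.storage.first_type_write(address, block)
                # (e4) cache the returned identifier with the data.
                self.first_cache[block_id] = block
            else:
                # (e3) cache hit: send only the identifier and the address.
                self.storage.second_type_write(address, block_id)
            # (e4)/(e5) record the address-to-identifier association.
            self.second_cache[address] = block_id
```

With this sketch, `HostApparatus(StubStorage()).write(0, b"AAAABBBBAAAA")` would issue two first-type requests (for the blocks at offsets 0 and 4) and one second-type request (for the repeated block at offset 8).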
9. A method of eliminating duplicated data by a storage apparatus in cooperation with a host apparatus in a storage system comprising the storage apparatus and the host apparatus, the storage apparatus comprising a first storage unit, a second storage unit, a first comparison module, a block identifier generation module, and a first control module, and the host apparatus comprising a first cache, a second cache, a second comparison module, and a second control module, the method comprising:
causing the second comparison module to compare each of the block data items constituting data to be written into the storage apparatus with block data items stored in the first cache when a write request for the writing has been generated at the host apparatus, the data to be written into the storage apparatus being processed in block data items;
causing the second control module to transmit a first-type write request including a first block data item and a first address of the first block data item to the storage apparatus when the result of comparison by the second comparison module has shown that the first block data item has coincided with none of the block data items stored in the first cache, the first block data item being included in the data to be written,
causing the first comparison module to compare the first block data item with the block data items stored in the first storage unit when the storage apparatus has received the first-type write request;
causing the block identifier generation module to generate a first block identifier unique to the first block data item when the result of comparison by the first comparison module has shown that the first block data item has coincided with none of the block data items stored in the first storage unit;
causing the first control module to store the first block identifier and the first block data item in the first storage unit such that the first block identifier and the first block data item are associated with each other;
causing the first control module to store the first address and the first block identifier in the second storage unit such that the first address and the first block identifier are associated with each other,
causing the first control module to transmit the first block identifier as a completion response to the first-type write request to the host apparatus after the first control module has stored the first block identifier and first block data item and the first address and first block identifier;
causing the second control module to store the first block identifier and the first block data item in the first cache such that the first block identifier and the first block data item are associated with each other when the host apparatus has received the first block identifier as the completion response to the first-type write request;
causing the second control module to store the first address and the first block identifier in the second cache such that the first address and the first block identifier are associated with each other when the host apparatus has received the first block identifier as the completion response to the first-type write request;
causing the second control module to transmit a second-type write request including a second block identifier stored in the first cache in such a manner that the second block identifier corresponds to any one of the block data items stored in the first cache and a second address of the second block data item to the storage apparatus when the result of comparison by the second comparison module has shown that the second block data item has coincided with any one of the block data items stored in the first cache, the second block data item being included in the data to be written;
causing the first control module to store the second address and the second block identifier in the second storage unit such that the second address and the second block identifier are associated with each other when the storage apparatus has received the second-type write request;
causing the first control module to transmit a completion response to the second-type write request to the host apparatus after the first control module has stored the second address and second block identifier; and
causing the second control module to store the second address and the second block identifier in the second cache in such a manner that the former corresponds to the latter according to a completion response to the second-type write request.
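To make the contents of the two request types and the ordering of the method steps in claim 9 concrete, the exchange can be modelled at the message level. The following is a sketch under stated assumptions: the message classes (`FirstTypeWrite`, `SecondTypeWrite`, `Completion`) and the functions `storage_handle` and `host_write_block` are hypothetical names, identifiers are small integers, and a storage-side match is assumed to reuse the existing identifier (the claim recites only the no-match case).

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class FirstTypeWrite:
    address: int
    data: bytes  # the block data item itself is transmitted


@dataclass(frozen=True)
class SecondTypeWrite:
    address: int
    block_id: int  # only the identifier is transmitted


@dataclass(frozen=True)
class Completion:
    block_id: Optional[int] = None  # set only for first-type requests


def storage_handle(request, block_table, address_table):
    """Storage side: store first, then return the completion response."""
    if isinstance(request, FirstTypeWrite):
        block_id = next(
            (i for i, d in block_table.items() if d == request.data), None
        )
        if block_id is None:
            block_id = len(block_table) + 1  # stand-in identifier generator
            block_table[block_id] = request.data
        address_table[request.address] = block_id
        return Completion(block_id)
    address_table[request.address] = request.block_id
    return Completion()


def host_write_block(address, data, first_cache, second_cache, send):
    """Host side: compare with the first cache, send, then update the caches."""
    cached_id = next((i for i, d in first_cache.items() if d == data), None)
    if cached_id is None:
        request = FirstTypeWrite(address, data)
        response = send(request)
        # Caches are updated only after the completion response arrives.
        first_cache[response.block_id] = data
        second_cache[address] = response.block_id
    else:
        request = SecondTypeWrite(address, cached_id)
        send(request)
        second_cache[address] = cached_id
    return request
```

Here `send` is any callable that delivers a request to the storage side and returns its completion response, for example `lambda req: storage_handle(req, block_table, address_table)` with two plain dictionaries.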
Specification