Data storage management using a distributed cache scheme
First Claim
1. A computer-implemented method for accessing data stored in a distributed storage system, the method comprising:
maintaining a distributed storage system comprising a plurality of virtual machines executed on a plurality of computing systems connected over a network, wherein a portion of a respective data storage volume on each of the plurality of computing systems is allocated to one of the plurality of virtual machines executed on the respective computing system as a respective virtual memory, wherein upon addition of each of the plurality of computing systems to the network, determining whether there is sufficient free storage space on the respective computing system to be allocated to a distributed cache system implemented over the distributed storage system, the plurality of computing systems sharing free space in the virtual memories allocated to the plurality of virtual machines based on cache metadata identifying the amount of free storage space available on one or more of said plurality of computing systems;
determining, based on metadata associated with first data, whether a copy of the first data stored in one or more data storage volumes in a distributed storage system is stored in the distributed cache system implemented utilizing free storage space in said distributed storage system, in response to a first computing system receiving a request to access the first data stored in the one or more data storage volumes in the distributed storage system;
instead of accessing the first data stored in the one or more data storage volumes in the distributed storage system, accessing the copy of the first data from the distributed cache system, in response to determining that the copy of the first data is stored in a first data storage medium in a first cache locally associated with the first computing system;
instead of accessing the first data stored in the one or more data storage volumes in the distributed storage system, requesting a second computing system, other than the first computing system, in the network to access the copy of the first data from the distributed cache system, in response to determining that the copy of the first data is stored not in the first data storage medium in the first cache, but in a second data storage medium in a second cache locally associated with the second computing system; and
accessing the first data from the one or more data storage volumes in the distributed storage system, in response to determining that the copy of the first data is not stored in the distributed cache system, wherein the distributed cache system comprises portions of the first data storage medium in the first cache and portions of the second data storage medium in the second cache utilized for caching data stored in the distributed storage system, wherein the cache metadata further comprises a mapping between the copy of the first data stored in the distributed cache system and the first data stored in the one or more data storage volumes in the distributed storage system, wherein the metadata is propagated among one or more computing systems supporting the distributed cache system to enable the first computing system to determine storage location of the first data in both the distributed cache system and the one or more data storage volumes in the distributed storage system, and to provide for the first data stored in the one or more data storage volumes in the distributed storage system to be updated when the copy of the first data stored in the distributed cache system is updated.
Abstract
A method for accessing data stored in a distributed storage system is provided. The method comprises determining whether a copy of first data is stored in a distributed cache system, where data in the distributed cache system is stored in free storage space of the distributed storage system; accessing the copy of the first data from the distributed cache system if the copy of the first data is stored in a first data storage medium at a first computing system in a network; and requesting a second computing system in the network to access the copy of the first data from the distributed cache system if the copy of the first data is stored in a second data storage medium at the second computing system. If the copy of the first data is not stored in the distributed cache system, the first data is accessed from the distributed storage system.
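The lookup order summarized in the abstract (local cache, then a peer's cache, then the underlying storage) can be illustrated with a minimal Python sketch. The `Node` class, the `read` function, and the dictionary-based metadata are hypothetical names chosen for illustration; the patent does not prescribe any particular data structure or API.

```python
from dataclasses import dataclass, field


@dataclass
class Node:
    """One computing system contributing free space to the shared cache (hypothetical model)."""
    name: str
    cache: dict = field(default_factory=dict)  # local cache: key -> cached copy


def read(key, local, peers, metadata, backing_store):
    """Return (value, source) following the lookup order described in the abstract.

    metadata maps a key to the name of the node holding its cached copy;
    a key absent from metadata is treated as not cached anywhere.
    """
    holder = metadata.get(key)
    if holder == local.name:
        # The copy is in the first (local) data storage medium.
        return local.cache[key], "local-cache"
    if holder in peers:
        # The copy is held by a second computing system: ask that system for it.
        return peers[holder].cache[key], "remote-cache"
    # No cached copy exists: fall back to the distributed storage system itself.
    return backing_store[key], "storage"


# Illustrative data only.
node_a = Node("A", {"blk1": b"alpha"})
node_b = Node("B", {"blk2": b"beta"})
metadata = {"blk1": "A", "blk2": "B"}
store = {"blk1": b"alpha", "blk2": b"beta", "blk3": b"gamma"}
peers = {"B": node_b}

for key in ("blk1", "blk2", "blk3"):
    print(key, read(key, node_a, peers, metadata, store))
```

In this sketch the remote access is a direct dictionary read; in a real deployment it would be a network request to the second computing system.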
24 Citations
24 Claims
1. A computer-implemented method for accessing data stored in a distributed storage system, the method comprising:
maintaining a distributed storage system comprising a plurality of virtual machines executed on a plurality of computing systems connected over a network, wherein a portion of a respective data storage volume on each of the plurality of computing systems is allocated to one of the plurality of virtual machines executed on the respective computing system as a respective virtual memory, wherein upon addition of each of the plurality of computing systems to the network, determining whether there is sufficient free storage space on the respective computing system to be allocated to a distributed cache system implemented over the distributed storage system, the plurality of computing systems sharing free space in the virtual memories allocated to the plurality of virtual machines based on cache metadata identifying the amount of free storage space available on one or more of said plurality of computing systems;
determining, based on metadata associated with first data, whether a copy of the first data stored in one or more data storage volumes in a distributed storage system is stored in the distributed cache system implemented utilizing free storage space in said distributed storage system, in response to a first computing system receiving a request to access the first data stored in the one or more data storage volumes in the distributed storage system;
instead of accessing the first data stored in the one or more data storage volumes in the distributed storage system, accessing the copy of the first data from the distributed cache system, in response to determining that the copy of the first data is stored in a first data storage medium in a first cache locally associated with the first computing system;
instead of accessing the first data stored in the one or more data storage volumes in the distributed storage system, requesting a second computing system, other than the first computing system, in the network to access the copy of the first data from the distributed cache system, in response to determining that the copy of the first data is stored not in the first data storage medium in the first cache, but in a second data storage medium in a second cache locally associated with the second computing system; and
accessing the first data from the one or more data storage volumes in the distributed storage system, in response to determining that the copy of the first data is not stored in the distributed cache system, wherein the distributed cache system comprises portions of the first data storage medium in the first cache and portions of the second data storage medium in the second cache utilized for caching data stored in the distributed storage system, wherein the cache metadata further comprises a mapping between the copy of the first data stored in the distributed cache system and the first data stored in the one or more data storage volumes in the distributed storage system, wherein the metadata is propagated among one or more computing systems supporting the distributed cache system to enable the first computing system to determine storage location of the first data in both the distributed cache system and the one or more data storage volumes in the distributed storage system, and to provide for the first data stored in the one or more data storage volumes in the distributed storage system to be updated when the copy of the first data stored in the distributed cache system is updated.
Dependent claims: 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12
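The "maintaining" step of claim 1 checks, when a computing system joins the network, whether it has enough free storage to contribute to the distributed cache, and records the available space in shared cache metadata. A minimal Python sketch of that idea follows. The threshold `MIN_CACHE_BYTES`, the function names, and the "most free space first" placement rule are assumptions made for illustration; the claim states none of them.

```python
MIN_CACHE_BYTES = 64 * 2**20  # assumed threshold; the claim gives no figure


def admit_node(cache_metadata, name, volume_bytes, used_bytes, reserve_bytes=0):
    """Record a joining node's free space if it is enough to contribute to the cache.

    cache_metadata maps node name -> free bytes offered to the distributed cache.
    Returns True when the node was admitted as a cache contributor.
    """
    free = volume_bytes - used_bytes - reserve_bytes
    if free < MIN_CACHE_BYTES:
        return False
    cache_metadata[name] = free
    return True


def pick_node_for(cache_metadata, size):
    """Choose the node with the most free space that can hold `size` bytes, or None.

    The placement policy is an assumption; the claim only says free space is
    shared based on the cache metadata.
    """
    candidates = [(free, name) for name, free in cache_metadata.items() if free >= size]
    if not candidates:
        return None
    free, name = max(candidates)
    cache_metadata[name] = free - size  # shared metadata now reflects the placement
    return name


# Illustrative usage with made-up sizes.
MiB = 2**20
cache_metadata = {}
print(admit_node(cache_metadata, "host-1", 500 * MiB, 100 * MiB))
print(admit_node(cache_metadata, "host-2", 100 * MiB, 90 * MiB))
print(pick_node_for(cache_metadata, 10 * MiB), cache_metadata)
```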
13. A computer-implemented method for accessing data stored in a distributed storage system, the method comprising:
maintaining a distributed storage system comprising a plurality of virtual machines executed on a plurality of computing systems connected over a network, wherein a portion of a respective data storage volume on each of the plurality of computing systems is allocated to one of the plurality of virtual machines executed on the respective computing system as a respective virtual memory, wherein upon addition of each of the plurality of computing systems to the network, determining whether there is sufficient free storage space on the respective computing system to be allocated to a distributed cache system implemented over the distributed storage system, the plurality of computing systems sharing free space in the virtual memories allocated to the plurality of virtual machines based on cache metadata identifying the amount of free storage space available on one or more of said plurality of computing systems;
determining, based on metadata associated with first data, whether a copy of first data in one or more data storage volumes in a distributed storage system is stored in a distributed cache system implemented utilizing free storage space in said distributed storage system, in response to a first virtual machine (VM) receiving a request to access the first data stored in the one or more data storage volumes in the distributed storage system;
instead of accessing the first data in the one or more data storage volumes in the distributed storage system, accessing the copy of the first data from the distributed cache system, in response to determining that the copy of the first data is stored in a first virtual memory associated with the first VM in the network;
instead of accessing the first data in the one or more data storage volumes in the distributed storage system, requesting a second VM in the network to access the copy of the first data from the distributed cache system, in response to determining that the copy of the first data is stored not in the first virtual memory, but in a second virtual memory associated with the second VM; and
accessing the first data from the one or more data storage volumes in the distributed storage system, in response to determining that the copy of the first data is not stored in the distributed cache system, wherein the distributed cache system comprises portions of the first virtual memory and portions of the second virtual memory utilized for caching data stored in the distributed storage system, wherein the cache metadata further comprises a mapping between the copy of the first data stored in the distributed cache system and the first data stored in the one or more data storage volumes in the distributed storage system, wherein the metadata is propagated among one or more VMs supporting the distributed cache system to enable the first VM to determine storage location of the first data in both the distributed cache system and the one or more data storage volumes in the distributed storage system, and to provide for the first data stored in the one or more data storage volumes in the distributed storage system to be updated when the copy of the first data stored in the distributed cache system is updated.
Dependent claims: 14, 15, 16, 17, 18, 19, 20, 21, 22, 23
24. A computer program product comprising a non-transitory data storage medium having logic code stored thereon, wherein the logic code when executed on a computer causes the computer to:
maintain a distributed storage system comprising a plurality of virtual machines executed on a plurality of computing systems connected over a network, wherein a portion of a respective data storage volume on each of the plurality of computing systems is allocated to one of the plurality of virtual machines executed on the respective computing system as a respective virtual memory, wherein upon addition of each of the plurality of computing systems to the network, it is determined whether there is sufficient free storage space on the respective computing system to be allocated to a distributed cache system implemented over the distributed storage system, the plurality of computing systems sharing free space in the virtual memories allocated to the plurality of virtual machines based on cache metadata identifying the amount of free storage space available on one or more of said plurality of computing systems;
determine, based on metadata associated with first data, whether a copy of first data in one or more data storage volumes in a distributed storage system is stored in a distributed cache system implemented utilizing free storage space in said distributed storage system, in response to a first computing system receiving a request to access the first data stored in the distributed storage system;
instead of accessing the first data in the one or more data storage volumes in the distributed storage system, access the copy of the first data from the distributed cache system, in response to determining that the copy of the first data is stored in a first data storage medium designated for or locally connected to the first computing system;
instead of accessing the first data in the one or more data storage volumes in the distributed storage system, request a second computing system in the network to access the copy of the first data from the distributed cache system, in response to determining that the copy of the first data is stored not in the first data storage medium, but in a second data storage medium designated for or locally connected to the second computing system; and
access the first data from the one or more data storage volumes in the distributed storage system, in response to determining that the copy of the first data is not stored in the distributed cache system, wherein the distributed cache system comprises portions of the first data storage medium and portions of the second data storage medium utilized for caching data stored in the distributed storage system, wherein the distributed cache system further comprises metadata providing a one-to-one mapping between the copy of the first data stored in the distributed cache system and the first data stored in the one or more data storage volumes in the distributed storage system to enable the first data stored in the distributed storage system to be updated when the copy of the first data stored in the distributed cache system is updated, wherein the cache metadata further comprises a mapping between the copy of the first data stored in the distributed cache system and the first data stored in the one or more data storage volumes in the distributed storage system, wherein the metadata is propagated among one or more computing systems supporting the distributed cache system to enable the first computing system to determine storage location of the first data in both the distributed cache system and the one or more data storage volumes in the distributed storage system, and to provide for the first data stored in the one or more data storage volumes in the distributed storage system to be updated when the copy of the first data stored in the distributed cache system is updated.
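The independent claims end with metadata that maps each cached copy to its original in the storage volumes, so that an update to the copy can also be applied to the original, and with that metadata being propagated among the participating systems. The Python sketch below shows one possible reading of this. The names `write_through` and `propagate`, the tuple layout of the mapping, and the choice to update the volume synchronously are assumptions for illustration; the claims do not say when or how the original is updated.

```python
def propagate(mapping, node_views):
    """Give every participating node an identical copy of the mapping metadata.

    node_views maps node name -> that node's private view of the mapping.
    This stands in for whatever propagation protocol a real system would use.
    """
    for view in node_views.values():
        view.clear()
        view.update(mapping)


def write_through(key, value, mapping, caches, volumes):
    """Update the cached copy, then the original it maps to in the storage volumes.

    mapping[key] is (cache_node, volume_id, block): where the copy lives and
    where the original lives. Synchronous write-through is an assumed policy.
    """
    cache_node, volume_id, block = mapping[key]
    caches[cache_node][key] = value
    volumes[volume_id][block] = value


# Illustrative data only.
mapping = {"blk1": ("B", "vol0", 7)}
node_views = {"A": {}, "B": {}}
propagate(mapping, node_views)

caches = {"A": {}, "B": {"blk1": b"old"}}
volumes = {"vol0": {7: b"old"}}

# Node A uses its own view of the metadata to locate both the copy and the original.
write_through("blk1", b"new", node_views["A"], caches, volumes)
print(caches["B"]["blk1"], volumes["vol0"][7])
```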
Specification