Determining file allocation based on file operations
Abstract
A storage system may store files on multiple nodes. One or more logs may indicate operations performed on the files stored in the storage system and may identify the nodes that requested the operations. A new allocation or file placement scheme may be determined to reduce network traffic.
17 Claims
1. A system, comprising:
a communication interface to receive logs from a plurality of nodes that store files in a cloud storage system, each of the logs indicating a file operation history for a respective file stored on a respective node of the plurality of nodes, the file operation history indicating operations performed on the respective file and which nodes requested the operations; and
an optimization engine including at least one processor to determine, based on the file operation histories indicated by the logs, a new allocation of the files across the nodes of the cloud storage system to reduce network traffic caused by operations performed on the files, wherein the determining of the new allocation of the files comprises, for each given file of the files:
assigning first weights to respective local operations in which the given file requested by a requesting node is stored at the requesting node;
assigning variable weights to respective network operations in which the given file requested by the requesting node is stored at another node;
aggregating the first weights and the variable weights to produce a total operation cost for operations on the given file; and
use the total operation cost to determine the new allocation of the files.
Dependent claims: 2, 3, 4, 6, 7

5. A system comprising:
a communication interface to receive logs from a plurality of nodes that store files in a cloud storage system, each of the logs indicating a file operation history for a respective file stored on a respective node of the plurality of nodes, the file operation history indicating operations performed on the respective file and which nodes requested the operations, wherein for a given file, a local operation is an operation performed on the given file by a node storing the file, and a network operation is an operation performed on the given file by a node different from the node storing the file; and
an optimization engine including at least one processor to determine a new allocation of the files across the nodes of the cloud storage system to reduce network traffic caused by operations performed on the files, wherein the optimization engine is configured to determine the new allocation of the files by, for each file:
assigning a fixed weight to each of local operations;
assigning a variable weight to each of network operations based on distance traveled from a node requesting an operation to a node storing the file;
calculating a total file operation cost by a cost function based on the fixed weights and the variable weights;
determining whether storing the file on another node in the cloud storage system will reduce the total file operation cost given the file operation history of the file; and
selecting a node that would provide a lower total file operation cost.
Dependent claims: 8

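The steps recited in claim 5 can be illustrated with a minimal sketch. It is not the patented implementation: the fixed local weight of 1.0, the linear `network_weight` function, the node names, and the distance table are all assumptions made for illustration only.

```python
from collections import Counter

LOCAL_WEIGHT = 1.0  # assumed fixed weight for every local operation


def network_weight(distance: float) -> float:
    """Assumed variable weight: grows linearly with the distance traveled."""
    return LOCAL_WEIGHT + 2.0 * distance


def total_cost(requests: Counter, storing_node: str, distance: dict) -> float:
    """Total file operation cost if the file were stored on storing_node."""
    cost = 0.0
    for requester, count in requests.items():
        if requester == storing_node:
            cost += count * LOCAL_WEIGHT  # local operation
        else:
            d = distance[frozenset((requester, storing_node))]
            cost += count * network_weight(d)  # network operation
    return cost


def best_node(requests: Counter, current: str, nodes: list, distance: dict):
    """Return (node, cost); the file stays on current unless another node is strictly cheaper."""
    best, best_cost = current, total_cost(requests, current, distance)
    for node in nodes:
        cost = total_cost(requests, node, distance)
        if cost < best_cost:
            best, best_cost = node, cost
    return best, best_cost


# Hypothetical operation history for one file currently stored on node "A".
nodes = ["A", "B", "C"]
distance = {
    frozenset(("A", "B")): 1,
    frozenset(("A", "C")): 3,
    frozenset(("B", "C")): 2,
}
requests = Counter({"A": 2, "B": 10, "C": 1})

print(total_cost(requests, "A", distance))        # cost at the current node
print(best_node(requests, "A", nodes, distance))  # cheapest node and its cost
```

With these assumed numbers, most requests come from node B, so moving the file there lowers the total cost from 39.0 to 21.0.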
9. A method, comprising:
receiving, using a processor, multiple lists of file operations for files stored on nodes in a distributed file system;
assigning, using a processor, a cost to each file operation based on whether the file operation is a local operation or a network operation, wherein the local operation is an operation on a given file of the files in which the given file is stored on a node that requested the given file, and wherein the network operation is an operation on the given file in which the given file is stored on a node different from the node that requested the given file, and wherein the cost assigned to the local operation is a fixed cost, and the cost assigned to the network operation is a variable cost dependent on a distance between a node storing the given file and the node that requested the given file;
calculating, using a processor, a total operation cost for each respective file of the files by adding together the costs of local operations and the costs of network operations for the respective file; and
for each respective file of the files, determining, using a processor, whether the respective total operation cost can be reduced by storing the respective file on another node in the distributed file system.
Dependent claims: 10, 11, 12, 13, 14

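One way to write the total operation cost of claim 9 as a formula is shown here. The notation is an assumption introduced for illustration and does not appear in the claims: $O_f$ is the list of operations on file $f$, $r(o)$ is the node that requested operation $o$, $s$ is the node storing the file, $d$ is the distance between two nodes, $c_{\text{local}}$ is the fixed cost, and $g$ is some distance-dependent cost function.

```latex
C_f(s) = \sum_{o \in O_f} c(o, s),
\qquad
c(o, s) =
\begin{cases}
  c_{\text{local}}            & \text{if } r(o) = s \quad \text{(local operation)} \\
  g\bigl(d(r(o), s)\bigr)     & \text{if } r(o) \neq s \quad \text{(network operation)}
\end{cases}
```

Under this notation, the final step asks whether some other node $s'$ satisfies $C_f(s') < C_f(s_f)$, where $s_f$ is the node that currently stores file $f$.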
15. A non-transitory machine-readable storage medium encoded with instructions executable by at least one processor to:
read node operation logs associated with multiple nodes of a cloud storage system, each node operation log indicating a file operation history for each file stored on a node of the multiple nodes;
calculate an operation cost for a respective file of a plurality of files, wherein the calculating comprises:
aggregating costs of local file operations for the respective file and costs of network file operations for the respective file, wherein a local file operation is an operation requested by a node that stores the respective file, and a network file operation is an operation requested by a node different from the node that stores the respective file, and wherein each of the costs of local file operations is a fixed cost, and each of the costs of network file operations is a variable cost dependent on a distance between the node that stores the respective file and the node that requested the network file operation; and
determine a different file allocation in the cloud storage system to reduce an operation cost for at least one of the plurality of files.
Dependent claims: 16, 17

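The flow in claim 15, from node operation logs to a different file allocation, might be sketched as follows. The log format (one `file,requesting_node` line per operation, grouped by storing node), the hop-count table `HOPS`, and the cost values are hypothetical. The sketch also ignores node capacity and the cost of moving a file, which a real system would need to consider.

```python
from collections import Counter

# Hypothetical node operation logs: storing node -> "file,requesting_node" lines.
NODE_LOGS = {
    "n1": ["a.txt,n1", "a.txt,n2", "a.txt,n2", "a.txt,n2"],
    "n2": ["b.txt,n2", "b.txt,n2", "b.txt,n1"],
}
HOPS = {frozenset(("n1", "n2")): 1}  # assumed distance between nodes
FIXED_COST = 1.0                     # assumed cost of a local file operation


def read_histories(node_logs):
    """Build file -> (storing node, Counter of requesting nodes) from the logs."""
    histories = {}
    for storing_node, lines in node_logs.items():
        for line in lines:
            file_name, requester = line.split(",")
            entry = histories.setdefault(file_name, (storing_node, Counter()))
            entry[1][requester] += 1
    return histories


def operation_cost(requesters, storing_node):
    """Aggregate fixed local costs and distance-dependent network costs."""
    cost = 0.0
    for requester, count in requesters.items():
        if requester == storing_node:
            cost += count * FIXED_COST
        else:
            hops = HOPS[frozenset((requester, storing_node))]
            cost += count * (FIXED_COST + hops)
    return cost


def plan_allocation(node_logs):
    """Pick the cheapest node for each file; ties keep the file where it is."""
    nodes = sorted(node_logs)
    plan = {}
    for file_name, (current, requesters) in read_histories(node_logs).items():
        plan[file_name] = min(
            nodes, key=lambda n: (operation_cost(requesters, n), n != current)
        )
    return plan


print(plan_allocation(NODE_LOGS))
```

In this made-up data, `a.txt` is requested mostly by `n2`, so the plan moves it there, while `b.txt` stays on `n2`.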
Specification