Distributed network data storage system and method
First Claim
1. A system for distributed file storage comprising:
- a plurality of servers providing, to a plurality of clients, file access services for accessing files stored on the plurality of servers; and
a dynamic list of neighbor servers maintained by each server, wherein the neighbor servers are grouped as a subset of the plurality of servers, switching at least one server of the plurality of servers into a neighbor group of servers based on network distance,wherein each file is stored in the form of a plurality of N pieces on N servers, the N pieces being generated from the file,wherein the list is used to obtain information for reconstructing files stored on the neighbor servers, such that any K out of the N pieces can be used to reconstruct any file,wherein a server belonging to more than one group acts as a boundary server, andwherein boundary servers are used to transfer pieces of the file to servers of groups other than a group to which a client has connected.
10 Assignments
0 Petitions
Accused Products
Abstract
The present invention is a system and method for distributed, highly scalable, wide area peer-to-peer network data storage. The functionally equivalent servers in the system are divided into groups. Each server maintains a dynamic list which is polled to determine the availability of the closest neighbor servers. Each server is switched between the groups of servers to optimize network connectivity parameters. Data and directory files are divided into a plurality of pieces which are stored on different servers. Files are uniformly and independently named, utilizing a tree with a common root, logical pathways, and unique file identifiers. When a server receives a client request for file system access, the plurality of file pieces are collected and sent to the client server from the neighbor servers simultaneously in order to optimize bandwidth. Servers with maximum throughput capacity are utilized for highest transmission speed and reduced processing time.
85 Citations
44 Claims
-
1. A system for distributed file storage comprising:
-
a plurality of servers providing, to a plurality of clients, file access services for accessing files stored on the plurality of servers; and a dynamic list of neighbor servers maintained by each server, wherein the neighbor servers are grouped as a subset of the plurality of servers, switching at least one server of the plurality of servers into a neighbor group of servers based on network distance, wherein each file is stored in the form of a plurality of N pieces on N servers, the N pieces being generated from the file, wherein the list is used to obtain information for reconstructing files stored on the neighbor servers, such that any K out of the N pieces can be used to reconstruct any file, wherein a server belonging to more than one group acts as a boundary server, and wherein boundary servers are used to transfer pieces of the file to servers of groups other than a group to which a client has connected. - View Dependent Claims (2, 3, 4, 5, 6, 7, 8)
-
-
9. A method for distributed file storage comprising:
-
dividing a plurality of servers into a plurality of groups, with each server belonging to at least one group; on each server, maintaining a dynamic list of neighbor servers belonging to the same group, wherein each group is a subset of the plurality of servers;
switching at least one server of the plurality of servers into a neighbor group of servers based on network distance;supporting file access services on each of the servers; transforming a file into a plurality of N pieces that are derived from the file, such that any K out of the N pieces can be used to reconstruct any file; and storing each of the pieces on N servers selected from the list, wherein a server belonging to more than one group acts as a boundary server, and wherein boundary servers are used to transfer pieces of the file to servers of groups other than a group to which a client has connected. - View Dependent Claims (10, 11, 12, 13, 14, 15, 16, 17, 18, 19, 20)
-
-
21. A method of accessing files in a distributed file storage system comprising:
-
dividing a plurality of the servers into a plurality of groups, wherein each server belongs to at least one group, and wherein each group is a subset of the plurality of servers; supporting file access services on each of the servers for accessing a file stored on the servers; at each server, maintaining a dynamic list of neighbor servers that belong to the same group;
switching at least one server of the plurality of servers into a neighbor group of servers based on network distance;generating a plurality of N pieces from the file; and distributing the plurality of N pieces to the neighbor servers in the same group in order to achieve a desired fault tolerance level, wherein the fault tolerance level is defined by how many servers (K) out of the total number of N servers on which the N file pieces are stored, can fail, such that any K out of the N pieces can be used to reconstruct any file, wherein a server belonging to more than one group acts as a boundary server, and wherein boundary servers are used to transfer pieces of the file to servers of groups other than a group to which a client has connected. - View Dependent Claims (22, 23, 24, 25, 26, 27, 28, 29, 30)
-
-
31. A method of naming files in a distributed file storage system comprising:
-
dividing a plurality of servers into a plurality of groups such that each server belongs to at least one group, wherein each group is a subset of the plurality of servers; supporting file access services on each of the servers for accessing files stored on the servers;
at each server, maintaining a dynamic list of neighbors server that belong to the same group;
switching at least one server of the plurality of servers into a neighbor group of servers based on network distance;giving file names for the uniformly and independent of location of the files on the servers; storing the files on N of the servers using the names, wherein each file is transformed into a plurality of pieces that are generated from the file, such that any K out of the N pieces can be used to reconstruct any file, wherein a server belonging to more than one group acts as a boundary server, and wherein boundary servers are used to transfer pieces of the file to servers of groups other than a group to which a client has connected; and accessing the files using the file access services from any of servers. - View Dependent Claims (32, 33, 34, 35, 36, 37, 38)
-
-
39. A system for organizing distributed file storage comprising:
-
N functionally equivalent servers each providing file access services, for a plurality of clients, to files stored on the servers, such that when a file is divided into N pieces stored on the N servers, any K out of the N servers can be used to reconstruct the file; and each file being transformed into the N pieces that are generated from the file and stored on the N servers, at each server, maintaining a dynamic list of neighbors server that belong to the same group;
switching at least one server of the plurality of servers into a neighbor group of server based on network distance,wherein information for reconstructing the files is obtained from the servers, wherein the servers are organized into groups, each group being a subset of the plurality of N servers, and wherein a server belonging to more than one group acts as a boundary server, and wherein boundary servers are used to transfer pieces of the file to servers of groups other than a group to which a client has connected. - View Dependent Claims (40, 41)
-
-
42. A computer program product for distributed file storage, the computer program product comprising a computer useable medium having computer program logic recorded thereon for controlling a processor, the computer program logic comprising:
-
computer program code means for organizing a plurality of servers into a plurality of groups, with each server belonging to at least one group; on each server, computer program code means for maintaining a dynamic list of neighbor servers belonging to the same group;
switching at least one server of the plurality of servers into a neighbor group of servers based on network distance;computer program code means for supporting file access services on each of the servers; computer program code means for transforming a file into a plurality of pieces that are derived from the file, such that any K out of the N pieces can be used to reconstruct any file; and computer program code means for storing each of the pieces on N servers selected from the lists, wherein a server belonging to more than one group acts as a boundary server, and wherein boundary servers are used to transfer pieces of the file to servers of groups other than a group to which a client has connected.
-
-
43. A method of accessing files in a distributed file storage system comprising:
-
organizing a plurality of the servers into dynamically reconfigurable groups, wherein each server belongs to at least one group that is reconfigurable based on minimum network distance, wherein each group is a subset of the plurality of servers;
at each server, maintaining a dynamic list of neighbors server that belong to the same group;
switching at least one server of the plurality of servers into a neighbor group of servers based on network distance;supporting file access services on each of the servers for accessing a file distributed among the servers; generating a plurality of N pieces from the file; and distributing the plurality of N pieces to N of the neighbor servers in the same group in order to achieve a desired fault tolerance level based on how many servers (K) out of the plurality of servers are available, such that any K out of the N pieces can be used to reconstruct any file, wherein a server belonging to more than one group acts as a boundary server, and wherein boundary servers are used to transfer pieces of the file to servers of groups other than a group to which a client has connected.
-
-
44. A system for distributed file storage comprising:
-
a plurality of servers providing, to a plurality of clients, file access services for accessing files stored on the plurality of servers; wherein the servers are organized into a plurality of groups such that server-to-server response time between any two servers of the same group does not exceed a predetermined limit; wherein a server belonging to more than one group acts as a boundary server; wherein, when a new server is connected to one of the groups, the new server uses information of the boundary servers to connect to a group so as to have optimal server-to-server response time to its neighbors within its group; wherein, when one server is disconnected from the system, the remaining servers use information of the boundary servers to reconfigure their groups so as to have optimal server-to-server response times to their neighbors within their groups; and a dynamic list of neighbor servers of each server'"'"'s group maintained by each server, wherein the neighbor servers are a subset of the plurality of servers, switching at least one server of the plurality of servers into a neighbor group of servers based on network distance; wherein each file is stored in the form of a plurality of N pieces, the pieces being generated from the file and stored on N servers of the plurality of servers, wherein boundary servers are used to transfer pieces of the file to servers of groups other than a group to which a client has connected, wherein the list is used to obtain information for reconstructing files stored on the neighbor servers, and wherein any client can use any of the plurality of servers to access any of its files stored on the system.
-
Specification