Distributed indexing system for data storage
First Claim
1. A method of storing distributed index information describing secondary copies of data, the method comprising:
copying primary data stored in one or more primary storage devices to one or more secondary storage devices, wherein copying the primary data to the secondary storage device creates a secondary copy on the one or more secondary storage devices, the primary data generated by one or more software applications running on a client computer;
indexing a first portion of the secondary copy stored on the one or more secondary storage devices, wherein indexing the first portion of the secondary copy creates a first index of the first portion of secondary data;
storing the first index in association with a first index server;
indexing a second portion of the secondary copy stored on the one or more secondary storage devices, wherein indexing the second portion of secondary copy creates a second index of the second portion of secondary data;
storing the second index in association with a second index server;
wherein the first and second index servers are networked together and collectively provide a distributed index comprising the first and second indexes;
copying the first index associated with the first index server to the second index server so that the first index is available at the first and second index servers;
identifying that a plurality of requests for secondary data are associated with the first index;
distributing the plurality of requests for the secondary data associated with the first index among the first and second index servers to lessen a load on the first index server;
performing a storage operation that copies new primary data stored in the one or more primary storage devices to the one or more secondary storage devices;
generating an index update associated with copying the new primary data stored in the one or more primary storage devices to the one or more secondary storage devices;
determining that the first index server is not available; and
updating the distributed index by copying the index update to the second index server.
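The replication, request-distribution, and failover steps recited in claim 1 can be illustrated with a small in-memory sketch. It is not the patented implementation: the `IndexServer` class, the helper functions, the item names, and the storage locations are all hypothetical, and round-robin is assumed as the distribution policy only because the claim does not name one.

```python
from itertools import cycle


class IndexServer:
    """Hypothetical in-memory stand-in for an index server."""

    def __init__(self, name):
        self.name = name
        self.available = True
        self.indexes = {}  # index id -> {item name: secondary-copy location}
        self.served = 0    # number of lookups this server has answered

    def store(self, index_id, entries):
        # Create the index if needed, then merge the new entries into it.
        self.indexes.setdefault(index_id, {}).update(entries)

    def lookup(self, index_id, item):
        self.served += 1
        return self.indexes[index_id].get(item)


def replicate(index_id, source, target):
    """Copy an index so that it is available at both servers."""
    target.store(index_id, dict(source.indexes[index_id]))


def distribute(index_id, items, servers):
    """Spread lookups for one index across the servers that hold it.

    Round-robin is an assumed policy, chosen only for illustration.
    """
    holders = [s for s in servers if s.available and index_id in s.indexes]
    if not holders:
        raise LookupError(f"no available server holds {index_id}")
    rotation = cycle(holders)
    return [next(rotation).lookup(index_id, item) for item in items]


def apply_update(index_id, update, preferred, fallback):
    """Send an index update to the preferred server, or to the fallback
    when the preferred server is not available."""
    target = preferred if preferred.available else fallback
    target.store(index_id, update)
    return target


first, second = IndexServer("first"), IndexServer("second")
first.store("index-1", {"a.doc": "tape-01", "b.doc": "tape-02"})
second.store("index-2", {"c.doc": "disk-07"})

# Make the first index available at both servers, then share the load.
replicate("index-1", first, second)
results = distribute("index-1", ["a.doc", "b.doc", "a.doc", "b.doc"], [first, second])

# A later storage operation produces an index update while the first server is down.
first.available = False
updated = apply_update("index-1", {"d.doc": "tape-03"}, first, second)
```

In this sketch each server answers two of the four lookups, and the index update lands on the second server because the first is marked unavailable.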
Abstract
A distributed indexing system spreads out the load on an index of stored data in a data storage system. Rather than maintain a single index, the distributed indexing system maintains an index in each media agent of a federated data storage system and a master index that points to the index in each media agent. In some embodiments, the distributed indexing system includes an index server (or group of servers) that handles indexing requests and forwards the requests to the appropriate distributed systems. Thus, the distributed indexing system, among other things, increases the availability and fault tolerance of a data storage index.
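As a rough illustration of the arrangement the abstract describes, the sketch below keeps one index per media agent and a master index that points to the agent holding each entry. The class names, the per-item pointer granularity, and the sample file names and locations are assumptions made for the example; the abstract does not specify any of them.

```python
class MediaAgentIndex:
    """Hypothetical per-media-agent index: item name -> storage location."""

    def __init__(self):
        self.entries = {}

    def add(self, item, location):
        self.entries[item] = location

    def find(self, item):
        return self.entries.get(item)


class MasterIndex:
    """Hypothetical master index that records which media agent indexed each item."""

    def __init__(self):
        self.agents = {}    # agent name -> MediaAgentIndex
        self.pointers = {}  # item name -> agent name

    def register(self, agent_name):
        self.agents[agent_name] = MediaAgentIndex()

    def record(self, agent_name, item, location):
        # The detailed entry lives in the agent's own index;
        # the master index keeps only a pointer to that agent.
        self.agents[agent_name].add(item, location)
        self.pointers[item] = agent_name

    def resolve(self, item):
        # Follow the pointer, then ask the agent's index for the location.
        agent_name = self.pointers.get(item)
        if agent_name is None:
            return None
        return agent_name, self.agents[agent_name].find(item)


master = MasterIndex()
master.register("agent-a")
master.register("agent-b")
master.record("agent-a", "report.xlsx", "tape-01")
master.record("agent-b", "mail.pst", "disk-07")

found = master.resolve("report.xlsx")
missing = master.resolve("unknown.txt")
```

Because the master index holds only pointers, a lookup touches the master once and then a single media agent, which is one way the indexing load could be spread across agents.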
18 Claims
Claims 2 through 9 depend on claim 1.
10. A distributed index system for maintaining an index of data storage information related to secondary copies of data, the system comprising:
a storage manager comprising computer hardware, the storage manager configured to copy primary data stored in one or more primary storage devices to one or more secondary storage devices, wherein copying the primary data to the secondary storage device creates a secondary copy on the one or more secondary storage devices, the primary data generated by one or more software applications running on a client computer;
a first index server comprising computer hardware, the first index server configured to index a first portion of the secondary copy stored on the one or more secondary storage devices, wherein indexing the first portion of the secondary copy creates a first index of the first portion of secondary data, and wherein the first index is stored in association with a first index server;
a second index server comprising computer hardware having one or more computer processors, the second index server configured to index a second portion of the secondary copy stored on the one or more secondary storage devices, wherein indexing the second portion of secondary copy creates a second index of the second portion of secondary data, and wherein the second index is stored in association with a second index server;
wherein the first and second index servers are networked together and collectively provide a distributed index comprising the first and second indexes;
an index replication component comprising computer hardware, the index replication component configured to direct the copying of the first index associated with the first index server to the second index server so that the first index is available at the first and second index servers;
a third index server comprising computer hardware that is configured to identify that a plurality of requests for the secondary data are associated with the first index, the third index server further configured to distribute the plurality of requests for the secondary data associated with the first index among the first and second index servers to lessen a load on the first index server; and
wherein the storage manager is configured to initiate a storage operation that copies new primary data stored in the one or more primary storage devices to the one or more secondary storage devices;
a secondary storage computing device that is configured to generate an index update associated with copying the new primary data stored in the one or more primary storage devices to the one or more secondary storage devices, and determine that the first index server is not available, wherein the secondary storage computing device is further configured to update the distributed index by copying the index update to the second index server.
Claims 11 through 18 depend on claim 10.
Specification