DISTRIBUTED INDEXING SYSTEM FOR DATA STORAGE
Abstract
A distributed indexing system spreads out the load on an index of stored data in a data storage system. Rather than maintain a single index, the distributed indexing system maintains an index in each media agent of a federated data storage system and a master index that points to the index in each media agent. In some embodiments, the distributed indexing system includes an index server (or group of servers) that handles indexing requests and forwards the requests to the appropriate distributed systems. Thus, the distributed indexing system, among other things, increases the availability and fault tolerance of a data storage index.
20 Claims
1. A method of storing index information describing secondary copies of data, the method comprising:
receiving at a media agent data copied during a first data storage operation, wherein the media agent is configured to convey data between a computer and one or more data storage devices associated with the media agent;
indexing the copied data to determine content from the copied data, wherein indexing the copied data creates indexed data;
selecting both a primary index server and at least a secondary index server among multiple available index servers for storing the indexed data, wherein the multiple index servers are networked together and collectively provide a distributed index;
sending a reference to the indexed data to both the primary index server and to the secondary index server, wherein the first and second index servers use the reference to access the indexed data and update the distributed index maintained by both the primary and secondary index servers;
receiving at the primary index server a new update to the distributed index, wherein the new update is associated with a second data storage operation;
determining that the primary index server is not available; and
updating the distributed index via the secondary index server based on the new update associated with the second data storage operation.
Dependent claims: 2, 3, 4, 5, 6, 7, 8, 9, 10
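By way of illustration only, the selection, reference, and failover steps recited in claim 1 might be sketched in Python as follows. The class IndexServer, the functions select_servers, send_reference, and apply_update, and the server names are hypothetical and do not appear in the claims. Choosing the first two available servers is an assumption, since the claim does not state how the primary and secondary index servers are selected.

```python
from dataclasses import dataclass, field


@dataclass
class IndexServer:
    """Hypothetical stand-in for one index server in the distributed index."""
    name: str
    available: bool = True
    entries: dict = field(default_factory=dict)  # reference -> storage operation

    def update(self, reference: str, operation: str) -> None:
        if not self.available:
            raise ConnectionError(f"{self.name} is not available")
        self.entries[reference] = operation


def select_servers(servers):
    """Pick a primary and a secondary index server.

    Assumption: the first two available servers are chosen; the claim does
    not say how the selection is made.
    """
    candidates = [s for s in servers if s.available]
    if len(candidates) < 2:
        raise RuntimeError("need at least two available index servers")
    return candidates[0], candidates[1]


def send_reference(primary, secondary, reference, operation):
    """Send the reference to both servers so both copies of the index are updated."""
    primary.update(reference, operation)
    secondary.update(reference, operation)


def apply_update(primary, secondary, reference, operation):
    """Apply a new update, using the secondary if the primary is not available."""
    try:
        primary.update(reference, operation)
        return primary.name
    except ConnectionError:
        secondary.update(reference, operation)
        return secondary.name


servers = [IndexServer("idx-a"), IndexServer("idx-b"), IndexServer("idx-c")]
primary, secondary = select_servers(servers)
send_reference(primary, secondary, "ref-001", "backup-1")

primary.available = False  # simulate the primary index server going offline
handled_by = apply_update(primary, secondary, "ref-002", "backup-2")
print(handled_by)  # idx-b
```

In this sketch the unavailability of the primary is detected by a failed update call; an actual system could equally use heartbeats or timeouts, which the claim leaves open.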
11. A distributed index system for maintaining an index of data storage information, the system comprising:
one or more index server components configured to store index data for one or more index data sources, wherein the index server components form part of a federated data storage system having multiple, networked data storage components, wherein a storage manager controls at least portions of the federated data storage system, wherein each index server component creates or updates index data based on at least one scheduled storage policy provided by the storage manager, and wherein the storage policy is a set of preferences associated with performing a data storage operation;
a main index component communicatively coupled to the index server components within the federated data storage system, wherein the main index component is configured to maintain a list of index server components and respond to requests for index data; and
an index data replication component communicatively coupled to the index server components within the federated data storage system, wherein the index data replication component is configured to replicate data stored on each index server component to at least one other index server component.
Dependent claims: 12, 13, 14, 15
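As a non-limiting sketch, the index data replication component of claim 11 could be modeled in Python as shown below. The class IndexServerComponent, the function replicate_all, and the sample index entries are hypothetical. Ring placement (each component copies to the next one) is an assumption chosen because it is one simple way to place every component's data on at least one other component; the claim does not prescribe a placement scheme.

```python
class IndexServerComponent:
    """Hypothetical index server component holding its own index data."""

    def __init__(self, name):
        self.name = name
        self.index_data = {}  # data this component indexed itself
        self.replicas = {}    # source component name -> copy of its index data


def replicate_all(components):
    """Copy each component's index data to the next component in a ring.

    Assumption: ring placement; any scheme that reaches at least one other
    component would match the claim language equally well.
    """
    if len(components) < 2:
        raise ValueError("replication needs at least two components")
    for i, source in enumerate(components):
        target = components[(i + 1) % len(components)]
        target.replicas[source.name] = dict(source.index_data)


a, b, c = (IndexServerComponent(n) for n in ("idx-a", "idx-b", "idx-c"))
a.index_data["report.doc"] = "tape-07"
b.index_data["mail.pst"] = "disk-02"
replicate_all([a, b, c])
print(b.replicas["idx-a"])  # {'report.doc': 'tape-07'}
```

If idx-a later becomes unreachable, its index data remains readable from the copy held by idx-b.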
16. A tangible computer-readable storage medium storing instructions for controlling a computer system to search for data objects in a data storage system using a distributed index, by a method comprising:
from a client or requesting computer, sending to a main index server a search request, wherein the search request specifies one or more criteria for identifying a data object to be restored from one of multiple data storage devices within a networked data storage system;
receiving from the main index server a network address of a distributed index server that manages index data for the identified data object;
from the client or requesting computer, sending to the distributed index server the search request;
receiving from the distributed index server a response based on the index data managed by the distributed index server, wherein the response satisfies the search request, and wherein the response includes a network location of the one data storage device within the networked data storage system; and
restoring the data object from the one data storage device based on the response, wherein the restoring of the data object includes restoring the data object to a computer other than the client or requesting computer.
Dependent claims: 17, 18, 19, 20
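For illustration, the two-step lookup and restore of claim 16 might be sketched in Python as follows. The classes MainIndexServer and DistributedIndexServer, the function search_and_restore, and every address, device name, and file name are hypothetical. The search criteria are reduced to an exact object name for brevity, which is an assumption; the claim allows one or more arbitrary criteria.

```python
class DistributedIndexServer:
    """Hypothetical distributed index server managing part of the index data."""

    def __init__(self, address):
        self.address = address
        self.index = {}  # object name -> network location of its storage device

    def search(self, criteria):
        # The response includes the network location of the storage device.
        return {"object": criteria, "device": self.index[criteria]}


class MainIndexServer:
    """Hypothetical main index server that only knows which server to ask."""

    def __init__(self):
        self.routes = {}  # object name -> address of a distributed index server

    def locate(self, criteria):
        return self.routes[criteria]


def search_and_restore(criteria, main_index, servers_by_address, restore_target):
    # Send the request to the main index server and receive a network address.
    address = main_index.locate(criteria)
    # Send the same request to the distributed index server at that address.
    response = servers_by_address[address].search(criteria)
    # Restore to a computer other than the requesting one (simulated as a string).
    return f"restored {response['object']} from {response['device']} to {restore_target}"


node = DistributedIndexServer("10.0.0.21:9000")
node.index["q3-report.doc"] = "tape-library-2"
main = MainIndexServer()
main.routes["q3-report.doc"] = node.address

result = search_and_restore(
    "q3-report.doc", main, {node.address: node}, restore_target="workstation-b"
)
print(result)  # restored q3-report.doc from tape-library-2 to workstation-b
```

The main index server never touches the index data itself in this sketch; it only returns an address, so the search load falls on the distributed index servers.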
Specification