System and methods for metadata management in content addressable storage
Abstract
Provided is a content addressable storage (CAS) system that allows a user to request, either through an application server or directly to one or more CAS servers, files and content related to a query. In some embodiments, the content can be discovered by searching previously-stored metadata related to each file at the content addressable storage server. The search can also be replicated across multiple content addressable storage servers in order to obtain varied results and redundant results. Duplicate results may be flagged or omitted, and the results are returned to the requester.
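The flow in the abstract can be pictured with a minimal sketch. It assumes an in-memory store, SHA-256 as the content address, and a plain substring match against metadata values; the names `CasServer` and `federated_search` are hypothetical and do not come from the patent.

```python
import hashlib


class CasServer:
    """Toy in-memory content addressable store with a metadata index."""

    def __init__(self):
        self.blobs = {}     # content ID -> file bytes
        self.metadata = {}  # content ID -> metadata dict

    def put(self, header, data):
        # The address is derived from the content itself, not from a path.
        content_id = hashlib.sha256(data).hexdigest()
        self.blobs[content_id] = data
        # Metadata is copied from the file header and kept under the same ID.
        self.metadata[content_id] = dict(header)
        return content_id

    def search(self, query):
        # Match the query against stored metadata values, not file contents.
        q = query.lower()
        return [cid for cid, meta in self.metadata.items()
                if any(q in str(value).lower() for value in meta.values())]


def federated_search(servers, query):
    """Run the query on every server and flag content IDs already seen."""
    seen = set()
    results = []
    for name, server in servers.items():
        for cid in server.search(query):
            results.append({"server": name, "id": cid, "duplicate": cid in seen})
            seen.add(cid)
    return results


a, b = CasServer(), CasServer()
report = b"Q3 revenue figures"
a.put({"title": "Quarterly report", "author": "Lee"}, report)
b.put({"title": "Quarterly report", "author": "Lee"}, report)  # redundant copy
b.put({"title": "Quarterly forecast", "author": "Kim"}, b"Q4 outlook")

hits = federated_search({"a": a, "b": b}, "quarterly")
```

Because both servers derive the identifier from the same bytes, the redundant copy on the second server carries the same ID and can be flagged without comparing file contents.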
23 Claims
1. A computer-implemented method for managing metadata in a content addressable storage system, the method comprising:
receiving, using one or more computer processors, a file for storage at a first content addressable storage (CAS) server, the file comprising a header and data, and wherein the first CAS server stores data that can be retrieved based on content of the data rather than its storage location or with a hierarchical file system;
receiving, using one or more computer processors, the same one or more files for storage at a second CAS server;
automatically obtaining, with the one or more computer processors, from the header of the file, metadata associated with the data;
storing the metadata in a first metadata storage device, wherein the metadata is stored in association with the data stored in the CAS server;
replicating the stored metadata and storing the replicated metadata in a second metadata storage device;
receiving, using the one or more computer processors, a query from a requester for content at the CAS server;
performing a local search within locally-stored content related to the received query;
sending the query to one or more CAS servers;
searching beyond a temporary data cache in a local storage device for local content not stored in the CAS server and related to the received query, wherein the local storage device and the CAS server are distinct;
sending results of the local search to the requestor;
searching the metadata storage device for content related to the received query; and
when the metadata associated with the file is indicated by the query;
retrieving the file stored in the content addressable storage; and
sending the retrieved file to the requester;
wherein sending the results of the local search and the retrieved file to the requester further comprises excluding or flagging any duplicate files.
Dependent claims: 2, 3, 4, 5, 6, 7, 8.
9. A computer-implemented method for managing metadata in a hashed storage system, comprising:
receiving, using one or more computer processors, one or more files for storage at a first hashed storage server, wherein each of the one or more files comprises a header and data, and wherein the hashed storage server stores and retrieves the data with a hash function that generates unique identifiers linked to content of the data rather than with a location-based, hierarchical file system;
receiving, using one or more computer processors, the same one or more files for storage at a second hashed storage server;
automatically obtaining with the one or more computer processors metadata associated with the data from the header of each of the one or more files;
storing the metadata in a first metadata storage device, wherein the metadata is stored in association with the data stored in the hashed storage server;
replicating the stored metadata and storing the replicated metadata in a second metadata storage device;
receiving, using the one or more computer processors, a first query from a requester for content at an application server, wherein the application server comprises a local storage device, the local storage device and the content addressable storage server are distinct;
after or simultaneously with the local search according to the first query, sending a second query, related to the first query, to the hashed storage server;
receiving one or more files related to the second query from the hashed storage server;
excluding or flagging any duplicate files resulting from the first query, the second query, or both, the duplicate files comprising the same one or more files; and
sending to the requester a result set comprising the one or more files found at the local storage device based on the first query and the one or more files received from the hashed storage server based on the second query, wherein any duplicate files are excluded or flagged in the result set.
Dependent claims: 10.
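The final two steps of claim 9 (excluding or flagging duplicates, then returning one result set) can be sketched as follows. This is only an illustration: it assumes duplicates are detected by comparing SHA-256 digests of file content, and the function name `build_result_set` and its `mode` parameter are hypothetical.

```python
import hashlib


def build_result_set(local_files, cas_files, mode="flag"):
    """Combine files from both queries; exclude or flag repeated content."""
    if mode not in ("flag", "exclude"):
        raise ValueError("mode must be 'flag' or 'exclude'")
    seen = set()
    result = []
    # Local hits are listed first, so a later copy from the hashed
    # storage server is the one treated as the duplicate.
    for source, files in (("local", local_files), ("cas", cas_files)):
        for name, data in files:
            digest = hashlib.sha256(data).hexdigest()
            is_duplicate = digest in seen
            seen.add(digest)
            if is_duplicate and mode == "exclude":
                continue
            result.append({"name": name, "source": source,
                           "duplicate": is_duplicate})
    return result


local = [("notes.txt", b"alpha"), ("draft.txt", b"beta")]
cas = [("notes-copy.txt", b"alpha"), ("final.txt", b"gamma")]

flagged = build_result_set(local, cas, mode="flag")
excluded = build_result_set(local, cas, mode="exclude")
```

In `flag` mode the requester still sees every file and can decide what to do with the marked copies; in `exclude` mode the repeated content is dropped before the result set is sent.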
11. A computer-implemented system for managing metadata in a content addressable storage system comprising:
a content addressable storage system comprising at least one computer processor configured to:
receive a file for storage and backup, said file to be stored using a first content addressable storage server and also backed up to a second content addressable storage server;
store metadata associated with the file in a first storage mechanism for storing metadata for content addressable storage;
replicate the stored metadata in a second storage mechanism;
receive a query from a requester for content;
search an application server for local content that is related to the received query but that is not stored in the content addressable storage system, wherein the application server and the content addressable storage system are distinct;
retrieve the local content stored on the application server;
send the local content to the requestor;
search the metadata storage mechanism for content related to the received query; and
when the metadata associated with the file is indicated by the query;
retrieve the associated file stored in the content addressable storage; and
send the retrieved file to the requester;
wherein sending the local content to the requester and sending the retrieved file to the requester comprise excluding or flagging any duplicate files.
Dependent claims: 12, 13.
14. A computer-implemented system for managing metadata in a content addressable storage (CAS) system, comprising:
a CAS system comprising at least one computer processor configured to:
receive a file for storage and receive a second copy of the same file, said file and the second copy of the same file to be stored using content addressable storage; and
store metadata associated with the file and the second copy of the file in a searchable storage mechanism for storing metadata for CAS; and
an application server comprising at least one computer processor, the application server configured to:
receive a first query from a requester for content at an application server, wherein the application server comprises a local storage device and wherein the local storage device and the CAS system are distinct;
send a second query, related to the first query, to the CAS system;
receive one or more files related to the second query from the CAS system; and
send a result set to the requester, the result set comprising one or more files found locally based on the first query and the one or more files received from the CAS system based on the second query, wherein any duplicate files are excluded or flagged in the result set.
Dependent claims: 15, 16.
17. A computer-implemented method comprising:
receiving, using one or more computer processors, a file for storage at a first fixed content storage CAS server, the file comprising a header and data;
receiving, using one or more computer processors, the same file for storage at a second CAS server;
automatically determining that the file meets the criteria for storing the file at a CAS server and storing the file at one or more CAS servers, wherein the criteria does not prevent or exclude duplicate or backup files;
automatically obtaining, with the one or more computer processors, from the header of the file, metadata associated with the file;
storing the metadata in at least one searchable metadata storage device at one or more CAS servers;
receiving, at a CAS server, using the one or more computer processors, a query from a requester for content at a CAS server;
forwarding the query, using the one or more computer processors, to at least one additional CAS server;
simultaneously with or after forwarding the query to the at least one additional CAS server, searching a local storage device for local content that is related to the query;
sending the local content to the requestor, wherein sending the local content comprises excluding or flagging any duplicate content;
searching the metadata storage device for content related to the received query; and
when the metadata associated with the file is indicated by the query;
retrieving the associated file stored in the CAS server; and
sending the retrieved file to the requester, wherein sending the retrieved file comprises excluding or flagging any duplicates.
Dependent claims: 18, 19.
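The ordering in claim 17 (forward the query, then search local storage simultaneously with or after the forwarding) can be sketched with a thread pool. This is an illustration under stated assumptions: each server is stood in for by a hypothetical in-memory dict mapping a file name to its metadata text, and `search_index` and `handle_query` are invented names, not part of the patent.

```python
from concurrent.futures import ThreadPoolExecutor


def search_index(index, query):
    """Return names whose metadata text contains the query (case-insensitive)."""
    q = query.lower()
    return [name for name, text in index.items() if q in text.lower()]


def handle_query(query, local_index, peer_indexes):
    with ThreadPoolExecutor() as pool:
        # Forward the query to every additional server first...
        futures = [pool.submit(search_index, index, query)
                   for index in peer_indexes]
        # ...then search local storage while those requests are in flight.
        local_hits = search_index(local_index, query)
        # Collect the forwarded results in the order the peers were given.
        peer_hits = [future.result() for future in futures]
    return local_hits, peer_hits


local_index = {"memo.txt": "budget memo 2020"}
peers = [{"plan.pdf": "Budget plan"}, {"photo.jpg": "holiday"}]

local_hits, peer_hits = handle_query("budget", local_index, peers)
```

A real deployment would replace the dict lookups with network calls to the additional CAS servers; the point here is only that the local search does not have to wait for the forwarded queries to return.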
20. A computer-implemented system for managing fixed content storage, the system comprising:
a CAS server having at least one computer processor, the CAS server configured to:
receive files for storage and automatically recognize files as eligible for fixed content storage, wherein duplicate files are eligible for storage; and
store and retrieve files, including duplicate files, using a function that is independent of physical storage location and that maintains an identifier that is consistent for each file as long as the data comprising that file does not change;
an application server having at least one computer processor, the application server configured to:
receive files, including duplicate files, for local storage outside of the local cache, in non-temporary storage;
receive a query from a requester and convey the query to the fixed content storage server;
after or at the same time as conveying the inquiry, perform a search, based on the query, within its own local, non-temporary storage; and
retrieve any relevant local content or files and convey them to the requestor, while at the same time flagging or excluding duplicates; and
a metadata storage device associated with either the FCS server or the application server, the metadata storage device configured to:
store metadata associated with the file, and any duplicate files, in a searchable metadata database;
associate metadata with files stored on the CAS server or the application server; and
using these associations, allow the CAS server or the application server to search for content or files related to any received query, and when metadata associated with a file is indicated by the query, retrieve the relevant file or files, whether stored in the CAS server or in local storage and send any retrieved files to the requester, excluding or flagging duplicates.
Dependent claims: 21, 22, 23.
Specification