Method and system for offline indexing of content and classifying stored data
First Claim
1. In a data management system residing within a private computer network, a method for indexing content, comprising:
identifying a production copy having one or more production data files each having keywords and metadata, wherein the production copy is available from a production data server within the private computer network;
identifying an offline copy from the private computer network, wherein the offline copy includes one or more offline data files each having keywords and metadata, and wherein the offline data files are copies of the one or more production data files, and wherein the offline copy of the one or more offline data files is stored in one or more secondary storage devices;
restoring the identified offline copy to an intermediate server, wherein the intermediate server is different from the production data server and wherein the intermediate server has a higher availability than the secondary storage devices;
identifying keywords from the restored offline copy on the intermediate server, wherein the identifying of the keywords is performed without use of the production data server and without accessing the production copy;
creating a content index of the identified keywords on the intermediate server after the identifying of the keywords, wherein the content index classifies the identified keywords based on at least one or more user-defined classifications, wherein the user-defined classifications include administratively defined groups within an organization or organization departments, wherein the content index is in an unencrypted form even when the offline copy is encrypted, and wherein the creating of the content index is performed without affecting the production data server; and
updating the content index by associating the offline data files with the production data files.
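The steps recited in claim 1 can be illustrated with a minimal in-memory sketch. Everything here is an assumption made for illustration: the function names, the dictionary standing in for secondary storage, the path-prefix rule used as a user-defined classification, and the sample file contents are hypothetical and do not come from the patent. The sketch only shows the order of operations (restore, identify keywords, index with classifications, associate with production files); it never touches a production server.

```python
import re
from collections import defaultdict


def restore_offline_copy(secondary_storage, copy_id):
    """Restore an offline copy to the 'intermediate server' (here, a plain dict copy)."""
    return dict(secondary_storage[copy_id])


def identify_keywords(text):
    """Split restored content into a set of lowercase keywords."""
    return set(re.findall(r"[a-z0-9]+", text.lower()))


def build_content_index(restored_files, classify):
    """Map each keyword to the (offline file, classification) pairs that contain it."""
    index = defaultdict(set)
    for path, text in restored_files.items():
        group = classify(path)  # hypothetical user-defined classification
        for keyword in identify_keywords(text):
            index[keyword].add((path, group))
    return index


def associate_with_production(index, offline_to_production):
    """Extend every index entry with the production file its offline file was copied from."""
    return {
        keyword: {
            (path, group, offline_to_production.get(path))
            for path, group in entries
        }
        for keyword, entries in index.items()
    }


# Made-up sample data: one offline copy holding two files.
secondary_storage = {
    "backup-001": {
        "hr/policy.txt": "Vacation policy for all staff",
        "eng/design.txt": "Index design for backup staff tools",
    }
}


def classify(path):
    """Assumed classification rule: the top-level folder names the department."""
    return path.split("/")[0]


restored = restore_offline_copy(secondary_storage, "backup-001")
index = build_content_index(restored, classify)
index = associate_with_production(
    index,
    {
        "hr/policy.txt": "//prod/hr/policy.txt",
        "eng/design.txt": "//prod/eng/design.txt",
    },
)
```

Looking up `index["staff"]` in this sketch yields entries for both files, each carrying its department classification and the production path it is associated with.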
Abstract
A method and system for creating an index of content without interfering with the source of the content includes an offline content indexing system that creates an index of content from an offline copy of data. The system may associate additional properties or tags with data that are not part of traditional indexing of content, such as the time the content was last available or user attributes associated with the content. Users can search the created index to locate content that is no longer available, or to find content based on the associated attributes.
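As a rough illustration of the extra properties the abstract mentions, the sketch below attaches a last-available date and free-form user attributes to each index entry, then filters on them at search time. The `IndexEntry` fields, the `search` function, and the sample entries are all hypothetical; the patent does not prescribe this data layout.

```python
from dataclasses import dataclass, field
from datetime import date


@dataclass
class IndexEntry:
    path: str
    keywords: set
    last_available: date      # assumed: date of the newest offline copy holding the file
    in_production: bool       # False once the file no longer exists on the source
    tags: dict = field(default_factory=dict)  # assumed user attributes, e.g. owner


def search(entries, keyword, only_unavailable=False, **tag_filters):
    """Return paths whose entry has the keyword and satisfies the optional filters."""
    results = []
    for entry in entries:
        if keyword not in entry.keywords:
            continue
        if only_unavailable and entry.in_production:
            continue
        if any(entry.tags.get(k) != v for k, v in tag_filters.items()):
            continue
        results.append(entry.path)
    return sorted(results)


# Made-up sample entries for illustration only.
entries = [
    IndexEntry("q1-report.doc", {"budget", "forecast"}, date(2020, 3, 31), False, {"owner": "alice"}),
    IndexEntry("q2-report.doc", {"budget"}, date(2020, 6, 30), True, {"owner": "bob"}),
]

no_longer_available = search(entries, "budget", only_unavailable=True)
owned_by_bob = search(entries, "budget", owner="bob")
```

Because the index is built from offline copies, an entry such as `q1-report.doc` remains searchable after the production file is gone, which is the case the `only_unavailable` flag is meant to show.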
20 Claims
1. In a data management system residing within a private computer network, a method for indexing content (set out in full under First Claim above).
Dependent claims: 2, 3, 4, 5, 6, 7, 8, 13, 15, 16, 18, 19
9. A computer system for indexing and searching content of one or more data files, wherein the data files include keywords and metadata, and wherein the computer system is coupled to one or more secondary storage devices, the computer system comprising:
a memory having instructions;
a processor coupled to the memory to execute the instructions, wherein the instructions include:
a production component configured to manage one or more production data files in the memory wherein each of the production data files includes keywords and metadata;
an offline copy component configured to identify an offline copy of the one or more production data files after the offline copy is stored in the one or more secondary storage devices, wherein the offline copy contains one or more offline data files, and wherein the offline copy is distinguishable from a source of the one or more production data files;
a restoring component configured to restore the identified offline copy to an intermediate server, wherein the intermediate server is different from the source of the one or more production data files and wherein the intermediate server has a higher availability than the secondary storage devices;
an indexing component configured to create an index of keywords from the restored offline copy on the intermediate server after the offline copy component identifies the offline copy, wherein the index contains classifications of the one or more offline data files having the keywords; wherein the classifications include a level of confidentiality for the one or more offline data files having the keywords, wherein the index is in an unencrypted form even when the offline copy is encrypted, and wherein the index of the keywords is created without consuming additional resources of a system that is the source of the one or more production data files; and
an index searching component configured to select certain indexed keywords based on a received search query and the classifications contained within the index.
Dependent claims: 10, 11, 12, 14, 17
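One way to picture the index searching component of claim 9 is a lookup that selects indexed keywords using both the query terms and a confidentiality classification stored in the index. The sketch below is hypothetical throughout: the three-level `Confidentiality` scale, the index layout, the `search_index` function, and the sample files are assumptions, not details taken from the claim.

```python
from enum import IntEnum


class Confidentiality(IntEnum):
    """Assumed three-level scale; the claim only requires 'a level of confidentiality'."""
    PUBLIC = 0
    INTERNAL = 1
    RESTRICTED = 2


# Made-up index: keyword -> list of (offline file, confidentiality level).
index = {
    "merger": [
        ("board-minutes.txt", Confidentiality.RESTRICTED),
        ("press-release.txt", Confidentiality.PUBLIC),
    ],
    "payroll": [("salaries.csv", Confidentiality.RESTRICTED)],
}


def search_index(index, query, max_level):
    """Select indexed keywords that match the query and have files at or below max_level."""
    selected = {}
    for term in query.lower().split():
        allowed = sorted(
            path for path, level in index.get(term, []) if level <= max_level
        )
        if allowed:
            selected[term] = allowed
    return selected


public_view = search_index(index, "merger payroll", Confidentiality.PUBLIC)
restricted_view = search_index(index, "merger payroll", Confidentiality.RESTRICTED)
```

In this sketch a searcher limited to `PUBLIC` sees only the `merger` keyword with the press release, while a searcher cleared for `RESTRICTED` sees both keywords and all three files.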
20. A non-transitory computer-readable medium storing instructions, which when executed by at least one data processor, indexes and searches content of one or more data files, wherein the data files include keywords and metadata, comprising:
identifying an offline copy of one or more production data files after the offline copy is stored in one or more secondary storage devices, wherein the offline copy contains one or more offline data files, wherein the offline copy is distinguishable from a source of the one or more production data files, and wherein each of the offline data files has keywords and metadata;
restoring the identified offline copy to an intermediate server, wherein the intermediate server is different from the source of the one or more production data files and wherein the intermediate server has a higher availability than the secondary storage devices;
creating an index of keywords from the restored offline copy on the intermediate server after the offline copy is identified, wherein the index contains classifications of the one or more offline data files having the keywords, wherein the classifications include a level of confidentiality for the one or more offline data files having the keywords, wherein the index is in an unencrypted form even when the offline copy is encrypted, and wherein the index of the keywords is created without consuming additional resources of a system that is the source of the one or more production data files; and
selecting certain indexed keywords based on a received search query and the classifications contained within the index.
Specification