System and method for management of retention periods for content in a computing system
Abstract
Disclosed are embodiments of a system and method for establishing and managing document retention policy in a distributed computing environment. One embodiment comprises the steps of loading documents into a database as individual components or items; establishing a custodian for each document instance; categorizing its content into a plurality of categories; assigning retention periods to the content by category; and continuously monitoring retention policies according to the assigned retention periods. Any database capable of storing de-duplicated data can be adapted to implement embodiments of the invention. One embodiment utilizes codified algorithms to cross-match one or more retention policies with each document instance in the system based on the role of the custodian held at the time the instance was created and retained and/or the content or category of the content to which the document refers. If multiple policies apply, the retention period is set to the furthest in the future.
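By way of illustration only, the cross-matching described above might be sketched in Python as follows. The class names `RetentionPolicy` and `DocumentInstance`, their field names, and the representation of a retention period as a number of days counted from the creation date are assumptions made for this sketch and are not taken from the disclosure. A policy is treated as applicable when it matches the role the custodian held at creation time or one of the document's categories, and the resulting retention date is the latest one among the applicable policies.

```python
from dataclasses import dataclass
from datetime import date, timedelta
from typing import List, Optional


@dataclass(frozen=True)
class RetentionPolicy:
    # Hypothetical policy record: a policy is keyed by a category,
    # a custodian role, or both.
    name: str
    retention_days: int
    category: Optional[str] = None
    custodian_role: Optional[str] = None


@dataclass(frozen=True)
class DocumentInstance:
    # Hypothetical document record; the role is the one the custodian
    # held when the instance was created and retained.
    doc_id: str
    created: date
    custodian_role_at_creation: str
    categories: frozenset


def matching_policies(doc: DocumentInstance,
                      policies: List[RetentionPolicy]) -> List[RetentionPolicy]:
    """Return every policy that applies by category or by custodian role."""
    return [
        p for p in policies
        if (p.category is not None and p.category in doc.categories)
        or (p.custodian_role is not None
            and p.custodian_role == doc.custodian_role_at_creation)
    ]


def principal_expiry(doc: DocumentInstance,
                     policies: List[RetentionPolicy]) -> Optional[date]:
    """Return the expiry date furthest in the future, or None if no policy applies."""
    matched = matching_policies(doc, policies)
    if not matched:
        return None
    return max(doc.created + timedelta(days=p.retention_days) for p in matched)
```

Taking the maximum over all applicable expiry dates means that adding a further policy can only extend, never shorten, the period for which an instance is retained.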
12 Claims
1. A method for setting and managing a retention periods for documents in a distributed computing system, said method comprising the steps of:
loading compound documents into a database as individual items;
extracting text from one or more of the individual items and linking the text to a source item;
establishing a custodian for each document instance from the one or more documents;
categorizing content within the one or more documents into one or more categories and associating the one or more documents with the one or more categories;
assigning retention policies to the content based on the one or more categories, wherein the policies specify a retention period; and
continuously monitoring retention policies according to assigned retention periods;
utilizing algorithms to cross-match one or more retention policies with the each document instance based on the role of the custodian held at the time a document instance was created and retained and the category to which the document instance refers, in the event multiple retention policies apply for the each document instance, setting a principle retention period to the furthest in the future; and
monitoring the principle retention period for said document instance by placing document instances that are expired or about to expire in a queue for deletion, modification, or suspension, and removing from the queue any document instance having a modified or suspended retention period and deleting any remaining document instance in the queue.
Dependent claims: 2, 3, 4, 5, 6, 7
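By way of illustration only, the final monitoring step of the method might be sketched in Python as follows. The names `DocumentInstance`, `QueueEntry`, `enqueue_expiring`, and `process_queue`, the in-memory dictionary standing in for the database, and the 30-day look-ahead used to decide what counts as "about to expire" are all assumptions made for this sketch. An instance counts as modified when its expiry date no longer equals the date recorded when it was queued.

```python
from dataclasses import dataclass
from datetime import date, timedelta
from typing import Dict, List


@dataclass
class DocumentInstance:
    # Hypothetical record; `expires` is the principal retention date.
    doc_id: str
    expires: date
    suspended: bool = False


@dataclass(frozen=True)
class QueueEntry:
    # Snapshot taken at queueing time so later modifications can be detected.
    doc_id: str
    expires_when_queued: date


def enqueue_expiring(store: Dict[str, DocumentInstance], today: date,
                     lookahead_days: int = 30) -> List[QueueEntry]:
    """Queue instances that are expired or expire within the look-ahead window."""
    cutoff = today + timedelta(days=lookahead_days)
    return [QueueEntry(d.doc_id, d.expires)
            for d in store.values() if d.expires <= cutoff]


def process_queue(store: Dict[str, DocumentInstance],
                  queue: List[QueueEntry]) -> List[str]:
    """Drop modified or suspended instances from the queue; delete the rest."""
    deleted = []
    for entry in queue:
        doc = store.get(entry.doc_id)
        if doc is None:
            continue  # already gone from the store
        if doc.suspended or doc.expires != entry.expires_when_queued:
            continue  # retention period was suspended or modified: keep the document
        del store[entry.doc_id]
        deleted.append(entry.doc_id)
    return deleted
```

Any review that suspends or extends a retention period would take place between the two calls, which is why the queue holds a snapshot of each expiry date rather than a live reference.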
8. A distributed computing system, comprising:
a plurality of data sources storing a plurality of documents in said distributed computing system, wherein said plurality of data sources include backup storage media;
a database system configured to extract said plurality of documents from said plurality of data sources and de-duplicate said plurality of documents into document instances; and
at least one computer-readable medium carrying computer-executable program instructions comprising:
code for associating a document instance in said computing system with at least one category in said computing system, and with at least one custodian, wherein said category has an associated first retention policy that specifies a first retention period therefore, and wherein said custodian has an associated second retention policy that specifies a second retention period therefore;
code for in response to said category and said custodian associated with said document instance, associating said document instance with said first and second retention policies;
code for setting a principle retention period for said document instance that has the longest retention policy and designating a policy that has the longest retention policy as the principle policy and designating all other associated policies as secondary policies for the document instance; and
code for placing document instances that are expired or about to expire in a queue for deletion, modification, or suspension;
code for removing from said queue any document instance having a modified or suspended retention period; and
code for deleting any remaining document instance in said queue.
Dependent claims: 9, 10, 11
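By way of illustration only, de-duplicating documents drawn from several data sources into document instances might be sketched in Python as follows. The claim does not state how duplicates are detected; treating byte-identical content as a duplicate by comparing SHA-256 digests is an assumption of this sketch, as are the function name `deduplicate` and the `(source, path, content)` tuple layout.

```python
import hashlib
from collections import defaultdict
from typing import Dict, Iterable, List, Tuple


def deduplicate(
    documents: Iterable[Tuple[str, str, bytes]]
) -> Dict[str, List[Tuple[str, str]]]:
    """Group documents with identical content under one digest.

    Each input item is (source, path, content). The result maps a content
    digest (one stored copy) to every location where that content was found.
    """
    instances: Dict[str, List[Tuple[str, str]]] = defaultdict(list)
    for source, path, content in documents:
        digest = hashlib.sha256(content).hexdigest()
        instances[digest].append((source, path))
    return dict(instances)
```

Keeping every location alongside the single stored copy preserves the information needed to establish a custodian for each place the document was found.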
12. A method for setting a retention period for data in a distributed computing system comprising a processor and one or more associated databases with the data, the method comprising the steps of:
loading a plurality of select data instances into one or more databases in a computing system;
associating a select data instance in the computing system with at least one category in the computing system;
assigning a code for associating the select data instance with at least one of a custodian and a group of custodians, where the custodian and the group of custodians is based on at least one of the following criteria: authorship, ownership, and user-defined criteria of the select data instance, wherein retention policies are associated with the custodian and the groups of custodians;
associating the select data instance with one or more of the retention policies based on the custodian and the groups of custodians and the at least one category; and
reviewing the select data instance and designating a policy with the longest retention period as the principle policy and all other associated policies as secondary policies for the select data instance;
placing select data instances that are expired or about to expire in a queue for deletion, modification, or suspension;
removing from the queue any select data instance having a modified or suspended retention period; and
deleting any select data instance in the queue.
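By way of illustration only, the designation of one policy as the principal policy and the remainder as secondary policies might be sketched in Python as follows. The function name `designate_policies` and the representation of the associated policies as a mapping from policy name to retention period in days are assumptions of this sketch. The claim does not say how ties are resolved; here the first policy encountered with the longest period is chosen.

```python
from typing import Dict, List, Tuple


def designate_policies(periods_in_days: Dict[str, int]) -> Tuple[str, List[str]]:
    """Return (principal policy, secondary policies) for one data instance.

    The principal policy is the one with the longest retention period;
    every other associated policy is secondary.
    """
    if not periods_in_days:
        raise ValueError("at least one retention policy must be associated")
    principal = max(periods_in_days, key=periods_in_days.get)
    secondary = [name for name in periods_in_days if name != principal]
    return principal, secondary
```

For example, an instance associated with a 365-day custodian policy and a 2555-day category policy would take the category policy as its principal policy and keep the custodian policy on record as secondary.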
Specification