Archive indexing engine
First Claim
1. A method comprising:
identifying a data record for deletion from a database and storage in a data archive, the data record comprising a plurality of data record attributes, each of the plurality of data record attributes comprising a value that comprises at least one term;
creating an archive record that comprises a first subset of attribute values of the plurality of data record attributes and an index record that comprises a second subset of attribute values of the plurality of data record attributes;
storing the archive record in a data archive that is stored separately from the database;
adding a reference to a location of the archive record in the data archive to the new index record;
adding the new index record to a dictionary-based archive index that is stored separately from the database, the dictionary-based archive index comprising a plurality of index records and a dictionary, the adding of the index record to the dictionary-based archive index comprising identifying every term of the second subset of attribute values of the plurality of data record attributes and adding each of the terms to the dictionary except for those terms that are already in the dictionary, wherein at least one index record of the plurality of index records in the dictionary-based archive index comprises one term from terms stored in the dictionary, references to locations of multiple archive records in the data archive that contain the one term, and information regarding locations of the one term within the referenced multiple archive records, the location information including indications, for each of the multiple of archive records referenced by the at least one index record, respective attributes of the each of the multiple archive records in which the one term is found;
deleting the data record from the database;
determining whether an attribute value of the plurality of attribute values is required for frequent user read access; and
if the attribute value is not required for frequent user read access, deleting the attribute value from the dictionary-based archive index.
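The steps of this claim can be illustrated with a minimal Python sketch. Every name here (`ArchiveIndex`, `archive_record`, `drop_attribute`), the use of a list position as the "location" of an archive record, and the whitespace tokenizer are assumptions made for illustration only; the claim does not prescribe any of them. The sketch also folds each index record into per-term postings rather than storing it separately.

```python
from dataclasses import dataclass, field


@dataclass
class ArchiveIndex:
    # Dictionary of every distinct term seen so far (hypothetical layout).
    dictionary: set = field(default_factory=set)
    # term -> {archive location -> names of the attributes containing the term}
    postings: dict = field(default_factory=dict)

    def add_index_record(self, location, attributes):
        """Index the given attribute values under the archive location."""
        for name, value in attributes.items():
            for term in str(value).lower().split():
                # set.add leaves the dictionary unchanged for known terms.
                self.dictionary.add(term)
                self.postings.setdefault(term, {}).setdefault(location, set()).add(name)

    def drop_attribute(self, name):
        """Remove one attribute's values from the index, cleaning up empty entries."""
        for term in list(self.postings):
            locations = self.postings[term]
            for location in list(locations):
                locations[location].discard(name)
                if not locations[location]:
                    del locations[location]
            if not locations:
                del self.postings[term]
                self.dictionary.discard(term)


def archive_record(database, key, archive, index, archive_attrs, index_attrs):
    """Move one record from the database into the archive and index it."""
    record = database[key]
    archive.append({a: record[a] for a in archive_attrs})  # first subset
    location = len(archive) - 1                            # reference to the archive record
    index.add_index_record(location, {a: record[a] for a in index_attrs})  # second subset
    del database[key]
    return location
```

In this sketch an attribute judged not to be needed for frequent read access would be removed with `index.drop_attribute(name)`; how that judgment is made is left open here.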
Abstract
Methods and apparatus, including computer program products, for archiving data from a database. One method includes identifying a data record to be archived; determining the contents of an archive record, the archive record having values for a first plurality of attributes in the data record; storing the archive record in a data archive; determining the contents of an index record, the index record comprising values for a second plurality of attributes in the data record; adding the index record to a dictionary-based archive index with a reference to the location of the archive record in the data archive; deleting the data record from the database; accepting a query for a desired archive record; and performing a search of the archive index to find the desired archive record.
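The final two steps of the abstract, accepting a query and searching the archive index, might look like the following sketch. The postings layout (term mapped to archive locations and the attribute names in which the term occurs), the function name `search_archive`, and the all-terms-must-match rule are illustrative assumptions, not details taken from the patent.

```python
def search_archive(postings, archive, query, attribute=None):
    """Return archive records whose indexed values contain every query term.

    If `attribute` is given, a term only counts when it occurs in that attribute.
    """
    matches = None
    for term in query.lower().split():
        hits = {
            location
            for location, attrs in postings.get(term, {}).items()
            if attribute is None or attribute in attrs
        }
        matches = hits if matches is None else matches & hits
    return [archive[location] for location in sorted(matches or ())]


# Hypothetical archive contents and index postings.
archive = [
    {"customer": "Acme Corp", "city": "Berlin"},
    {"customer": "Acme Ltd", "city": "Paris"},
]
postings = {
    "acme": {0: {"customer"}, 1: {"customer"}},
    "corp": {0: {"customer"}},
    "ltd": {1: {"customer"}},
}
```

Because the index records which attribute each term occurs in, a query can be narrowed to one attribute without reading the archive itself; only the matching archive records are fetched at the end.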
22 Claims
-
1. A method comprising:
identifying a data record for deletion from a database and storage in a data archive, the data record comprising a plurality of data record attributes, each of the plurality of data record attributes comprising a value that comprises at least one term;
creating an archive record that comprises a first subset of attribute values of the plurality of data record attributes and an index record that comprises a second subset of attribute values of the plurality of data record attributes;
storing the archive record in a data archive that is stored separately from the database;
adding a reference to a location of the archive record in the data archive to the new index record;
adding the new index record to a dictionary-based archive index that is stored separately from the database, the dictionary-based archive index comprising a plurality of index records and a dictionary, the adding of the index record to the dictionary-based archive index comprising identifying every term of the second subset of attribute values of the plurality of data record attributes and adding each of the terms to the dictionary except for those terms that are already in the dictionary, wherein at least one index record of the plurality of index records in the dictionary-based archive index comprises one term from terms stored in the dictionary, references to locations of multiple archive records in the data archive that contain the one term, and information regarding locations of the one term within the referenced multiple archive records, the location information including indications, for each of the multiple of archive records referenced by the at least one index record, respective attributes of the each of the multiple archive records in which the one term is found;
deleting the data record from the database;
determining whether an attribute value of the plurality of attribute values is required for frequent user read access; and
if the attribute value is not required for frequent user read access, deleting the attribute value from the dictionary-based archive index.
View Dependent Claims (2, 3, 4, 5, 6, 7)
-
-
8. A computer program product, encoded in an information carrier, operable to cause data processing apparatus to perform operations comprising:
identifying a data record to be archived, the data record comprising a plurality of data record attributes and originally residing in a database;
creating an archive record, the archive record comprising a first subset of the plurality of data record attributes, the first subset comprising at least some of the plurality of data record attributes;
storing the archive record in a data archive, the data archive being maintained separately from the database;
creating a new archive index record, the new archive index record comprising a reference to a location of the archive record in the data archive and a second plurality of attributes of the plurality of data record attributes, the second subset comprising selected attributes from the plurality of data record attributes, the selected attributes being those identified as necessary for access by users of the database;
adding the new archive index record to a dictionary-based archive index, the dictionary-based archive index comprising a plurality of archive index records, being stored separately from the database, and comprising a dictionary storing every term used in the plurality of index records;
deleting the data record from the database;
determining whether an attribute value of the plurality of attribute values is required for frequent user read access; and
if the attribute value is not required for frequent user read access, deleting the attribute value from the dictionary-based archive index.
View Dependent Claims (9, 10, 11, 12, 21)
-
-
13. A system comprising:
at least one processor; and
an information carrier storing instructions that, when executed by the at least one processor, cause the at least one processor to perform operations comprising:
identifying a data record to be archived, the data record comprising a plurality of data record attributes and originally residing in a database;
creating an archive record, the archive record comprising a first subset of the plurality of data record attributes, the first subset comprising at least some of the plurality of data record attributes;
storing the archive record in a data archive that is separate from the database;
creating a new archive index record, the new archive index record comprising a reference to a location of the archive record in the data archive and a second plurality of attributes of the plurality of data record attributes, the second subset comprising selected attributes from the plurality of data record attributes, the selected attributes being those identified as necessary for access by users of the database;
adding the new archive index record to a dictionary-based archive index, the dictionary-based archive index comprising a plurality of archive index records, being stored separately from the database, and comprising a dictionary storing every term used in the plurality of index records;
deleting the data record from the database; and
deleting, after a selected period, the values of all but a selected subset of the plurality of data record attributes from the dictionary-based archive index, the selected subset of the plurality of data record attributes comprising an index key to the archived record.
View Dependent Claims (14, 15, 16)
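The last operation of this claim, time-based pruning of the index down to an index key, can be sketched as follows. The record layout (`location`, `archived_at`, `values`), the function name `prune_index_records`, and measuring the "selected period" from an archiving timestamp are all assumptions chosen for illustration; the claim does not say how the period is tracked.

```python
from datetime import datetime, timedelta


def prune_index_records(index_records, key_attrs, retention, now):
    """Strip all but the key attributes from index records older than `retention`."""
    for record in index_records:
        if now - record["archived_at"] >= retention:
            record["values"] = {
                name: value
                for name, value in record["values"].items()
                if name in key_attrs
            }
    return index_records


# Hypothetical index records: one past the retention period, one not.
index_records = [
    {"location": 0, "archived_at": datetime(2022, 1, 1),
     "values": {"order_id": "A17", "city": "Berlin"}},
    {"location": 1, "archived_at": datetime(2023, 12, 1),
     "values": {"order_id": "B42", "city": "Paris"}},
]
prune_index_records(index_records, {"order_id"}, timedelta(days=365), datetime(2024, 1, 1))
```

After pruning, the older record can still be located through its key and its archive location, while the remaining attribute values are available only from the archive record itself.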
-
-
17. A computer program product, encoded in an information carrier, operable to cause data processing apparatus to perform operations comprising:
identifying a data record for deletion from a database and storage in a data archive, the data record comprising a plurality of data record attributes, each of the plurality of data record attributes comprising a value that comprises at least one term;
creating an archive record that comprises a first subset of attribute values of the plurality of data record attributes and an index record that comprises a second subset of attribute values of the plurality of data record attributes, the second subset including all of the data record attributes;
storing the archive record in a data archive that is stored separately from the database;
adding a reference to a location of the archive record in the data archive to the new index record;
adding the new index record to a dictionary-based archive index that is stored separately from the database, the dictionary-based archive index comprising a plurality of index records and a dictionary, the adding of the index record to the dictionary-based archive index comprising identifying every term of the second subset of attribute values of the plurality of data record attributes and adding each of the terms to the dictionary except for those terms that are already in the dictionary, wherein at least one index record of the plurality of index records in the dictionary-based archive index comprises one term from terms stored in the dictionary, references to locations of multiple archive records in the data archive that contain the one term, and information regarding locations of the one term within the referenced multiple archive records, the location information including indications, for each of the multiple of archive records referenced by the at least one index record, respective attributes of the each of the multiple archive records in which the one term is found;
deleting the data record from the database; and
deleting, after a selected period, the values of all but a selected subset of the plurality of data record attributes from the dictionary-based archive index, the selected subset of the plurality of data record attributes comprising an index key to the archived record.
-
-
18. A method comprising:
identifying a data record to be archived, the data record comprising a plurality of data record attributes and originally residing in a database;
creating an archive record, the archive record comprising a first subset of the plurality of data record attributes, the first subset comprising at least some of the plurality of data record attributes;
storing the archive record in a data archive, the data archive being maintained separately from the database;
creating a new archive index record, the new archive index record comprising a reference to a location of the archive record in the data archive and a second plurality of attributes of the plurality of data record attributes, the second subset comprising selected attributes from the plurality of data record attributes, the selected attributes being those identified as necessary for access by users of the database;
adding the new archive index record to a dictionary-based archive index, the dictionary-based archive index comprising a plurality of archive index records, being stored separately from the database, and comprising a dictionary storing every term used in the plurality of index records;
deleting the data record from the database;
determining whether an attribute value of the plurality of attribute values is required for frequent user read access; and
if the attribute value is not required for frequent user read access, deleting the attribute value from the dictionary-based archive index.
View Dependent Claims (19)
-
-
20. A method comprising:
identifying a data record to be archived, the data record comprising a plurality of data record attributes and originally residing in a database;
creating an archive record, the archive record comprising a first subset of the plurality of data record attributes, the first subset comprising at least some of the plurality of data record attributes;
storing the archive record in a data archive, the data archive being maintained separately from the database;
creating a new archive index record, the new archive index record comprising a reference to a location of the archive record in the data archive and a second plurality of attributes of the plurality of data record attributes, the second subset comprising selected attributes from the plurality of data record attributes, the selected attributes being those identified as necessary for access by users of the database;
adding the new archive index record to a dictionary-based archive index, the dictionary-based archive index comprising a plurality of archive index records, being stored separately from the database, and comprising a dictionary storing every term used in the plurality of index records;
deleting the data record from the database; and
applying statistical techniques or artificial intelligence to determine which data record attributes are required for frequent user read access and should therefore remain in the database, and which data record attributes should be deleted because of lower user read access frequencies.
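The claim leaves the "statistical techniques" unspecified. One very simple possibility, shown purely as an assumption, is to count logged reads per attribute and compare each attribute's share of reads against a threshold. The read log format, the function name `split_by_read_frequency`, and the `min_share` parameter are all hypothetical.

```python
from collections import Counter


def split_by_read_frequency(read_log, attributes, min_share):
    """Partition attributes into (keep, drop) by their share of logged reads.

    `read_log` is a sequence of attribute names, one entry per user read.
    """
    counts = Counter(read_log)
    total = sum(counts[name] for name in attributes)
    keep, drop = [], []
    for name in attributes:
        share = counts[name] / total if total else 0.0
        (keep if share >= min_share else drop).append(name)
    return keep, drop


# Hypothetical read log: "customer" read 8 times, "city" twice, "note" never.
read_log = ["customer"] * 8 + ["city"] * 2
keep, drop = split_by_read_frequency(read_log, ["customer", "city", "note"], 0.15)
```

A fixed threshold is the crudest choice; the same interface could sit in front of a decayed count or a learned model without changing the callers.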
-
-
22. A computer program product, encoded in an information carrier, operable to cause data processing apparatus to perform operations comprising:
identifying a data record to be archived, the data record comprising a plurality of data record attributes and originally residing in a database;
creating an archive record, the archive record comprising a first subset of the plurality of data record attributes, the first subset comprising at least some of the plurality of data record attributes;
storing the archive record in a data archive, the data archive being maintained separately from the database;
creating a new archive index record, the new archive index record comprising a reference to a location of the archive record in the data archive and a second plurality of attributes of the plurality of data record attributes, the second subset comprising selected attributes from the plurality of data record attributes, the selected attributes being those identified as necessary for access by users of the database;
adding the new archive index record to a dictionary-based archive index, the dictionary-based archive index comprising a plurality of archive index records, being stored separately from the database, and comprising a dictionary storing every term used in the plurality of index records;
deleting the data record from the database; and
applying statistical techniques or artificial intelligence to determine which data record attributes are required for frequent user read access and should therefore remain in the database, and which data record attributes should be deleted because of lower user read access frequencies.
-
Specification