Method for the organizational indexing, storage, and retrieval of data according to data pattern signatures
Abstract
A computer control program directs a method whereby a computer reads a data entry record group from a data stream, wherein the data entry record group comprises one or more data entry records, each having a key field containing an item. A key number representing the total number of key fields in the data entry record group is tallied, and a marker that allows for locating, and determining the size of, the data entry record group within the data stream is recorded. The computer converts each item of the data entry record group to a fixed length coded equivalence called an item indicator, and converts the item indicators to a fixed length coded equivalence called a signature. The computer creates a partial index record for each signature, and updates a combination cross-reference file and a raw data index, thereby classifying the data to allow for subsequent searching by persistent fast search methods.
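The two coding steps in the abstract (item to item indicator, item indicators to signature) can be illustrated with a minimal sketch. The abstract does not name the coding functions, so the truncated BLAKE2b hashes, the 4-byte and 8-byte widths, and the names `item_indicator` and `signature` below are all assumptions chosen for illustration.

```python
import hashlib
from typing import Iterable

INDICATOR_BYTES = 4  # assumed width of an item indicator
SIGNATURE_BYTES = 8  # assumed width of a signature


def item_indicator(item: str) -> bytes:
    """Map an item of any length to a fixed-length code (assumed: truncated BLAKE2b)."""
    return hashlib.blake2b(item.encode("utf-8"), digest_size=INDICATOR_BYTES).digest()


def signature(indicators: Iterable[bytes]) -> bytes:
    """Map all item indicators of one record group to a single fixed-length code."""
    h = hashlib.blake2b(digest_size=SIGNATURE_BYTES)
    for indicator in indicators:
        h.update(indicator)
    return h.digest()


# Hypothetical record group: three records whose key fields hold these items.
group = ["milk", "bread", "eggs"]
indicators = [item_indicator(item) for item in group]
sig = signature(indicators)
```

In this sketch the signature depends on the order of the item indicators, so two groups with the same items in a different order would generally receive different signatures unless the items are first put into a fixed sequence.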
16 Claims
1. A computerized method for organizational indexing, storage, and retrieval of computerized representations of events in the form of data, by creating signatures based upon the occurrence of patterns within the data, said method comprising the steps of:
a) reading a data entry record group representing an event, from a data stream, wherein said data entry record group comprises one or more data entry records wherein each of said data entry records has a key field containing an item;
b) tallying a key number representing a total number of said key fields in said data entry record group;
c) recording a marker to allow locating and determining the size of said data entry record group within said data stream;
d) creating an item indicator, wherein said item indicator represents a fixed length coded equivalence of said item;
e) repeating step d for each of said items of said data entry record group;
f) creating a signature, wherein said signature represents a fixed length coded equivalence of said item indicators of said data entry record group;
g) determining whether said signature has been previously created, and if said signature has not been previously created, then creating a partial index record comprising said signature and said key number;
h) updating a combination cross-reference file, wherein said combination cross-reference file comprises said signature of said data entry record group and said item indicators of said data entry record group;
i) repeating step h for each of said item indicators of said data entry record group; and
j) updating a raw data index, wherein said raw data index comprises said signature of said data entry record group, and said marker.
Dependent claims: 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14.
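Steps a through j of claim 1 can be sketched as one pass over a byte stream. This is a hedged illustration, not the patented implementation: the stream layout (one record group per line, records separated by commas, the whole record serving as the key field), the in-memory dictionaries standing in for the index files, the BLAKE2b coding functions, and every name in the code are assumptions.

```python
import hashlib
import io


def item_indicator(item):
    # Assumed coding function: 4-byte truncated BLAKE2b of the item.
    return hashlib.blake2b(item.encode("utf-8"), digest_size=4).digest()


def make_signature(indicators):
    # Assumed coding function: 8-byte BLAKE2b over the concatenated indicators.
    return hashlib.blake2b(b"".join(indicators), digest_size=8).digest()


def index_stream(stream):
    partial_index = {}       # signature -> key number
    cross_reference = set()  # (signature, item indicator) pairs
    raw_data_index = {}      # signature -> list of (offset, size) markers
    offset = 0
    for line in stream:                                    # step a (assumed layout)
        size = len(line)
        items = [f for f in line.decode("utf-8").strip().split(",") if f]
        if items:
            key_number = len(items)                        # step b
            marker = (offset, size)                        # step c
            indicators = [item_indicator(i) for i in items]  # steps d, e
            sig = make_signature(indicators)               # step f
            if sig not in partial_index:                   # step g
                partial_index[sig] = key_number
            for indicator in indicators:                   # steps h, i
                cross_reference.add((sig, indicator))
            raw_data_index.setdefault(sig, []).append(marker)  # step j
        offset += size
    return partial_index, cross_reference, raw_data_index


data = b"milk,bread\neggs\nmilk,bread\n"
partial_index, cross_reference, raw_data_index = index_stream(io.BytesIO(data))

# The marker lets a reader slice a record group back out of the stream.
first_sig = make_signature([item_indicator("milk"), item_indicator("bread")])
start, length = raw_data_index[first_sig][0]
first_group = data[start:start + length]
```

Here the first and third lines share a signature, so the partial index holds two records while the raw data index holds two markers under that shared signature.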
15. A computerized method for organizational indexing, storage, and retrieval of computerized representations of events in the form of data, by creating signatures based upon the occurrence of patterns within the data, said method comprising the steps of:
a) reading a data entry record group representing an event from a data stream, wherein said data entry record group comprises one or more data entry records wherein each of said data entry records has a key field containing an item;
b) tallying a key number representing a total number of said key fields in said data entry record group;
c) sorting said key fields of said data entry record group according to a predetermined sequence, deleting any duplicate key fields, and reducing said key number by the number of said duplicates deleted;
d) recording a marker to allow locating and determining the size of said data entry record group within said data stream;
e) creating an item indicator, wherein said item indicator represents a fixed length coded equivalence of said item by applying a minimal perfect hash function to said item;
f) repeating step d for each of said items of said data entry record group;
g) creating a signature, wherein said signature represents a fixed length coded equivalence of said item indicators of said data entry record group by applying a signature function to said item indicators;
h) determining whether said signature has been previously created, and if said signature has not been previously created, then creating a partial index record comprising said signature, said key number, a pointer to a combination cross-reference file, and a count representing the total number of data entry record groups with said signature, if said signature has been previously created then updating said count;
i) updating said combination cross-reference file, wherein said combination cross-reference file comprises said signature of said data entry record group and an item indicator of said data entry record group;
j) repeating step h for each of said item indicators of said data entry record group;
k) updating a raw data index, wherein said raw data index comprises said signature of said data entry record group, and said marker;
l) making said signature and a location of said signature in said partial index record available for use by a fast search method; and
m) repeating steps a through l for a plurality of data entry record groups.
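The features that claim 15 adds (sorting and de-duplicating key fields, a minimal perfect hash for item indicators, a per-signature count and cross-reference pointer, and a signature location for fast search) can be sketched as follows. This is an illustration under stated assumptions: the lookup table over a known vocabulary is only the simplest possible minimal perfect hash, the BLAKE2b signature function is a stand-in because the claim does not specify one, the marker and raw data index are omitted for brevity, and all names are hypothetical.

```python
import hashlib


def build_item_coder(vocabulary):
    """Return a function mapping each known item to a distinct 4-byte code.

    Ranking the sorted vocabulary gives every item a unique integer in
    0..n-1, which is what a minimal perfect hash provides. Practical
    constructions avoid storing the keys; this table is an assumption.
    """
    table = {item: rank for rank, item in enumerate(sorted(set(vocabulary)))}

    def item_indicator(item):
        return table[item].to_bytes(4, "big")

    return item_indicator


def normalize(items):
    """Sort the key fields, drop duplicates, and reduce the key number."""
    key_number = len(items)
    unique = sorted(set(items))
    key_number -= len(items) - len(unique)
    return unique, key_number


def index_group(items, item_indicator, records, location, cross_reference):
    unique, key_number = normalize(items)
    indicators = [item_indicator(i) for i in unique]
    # Assumed signature function: 8-byte BLAKE2b over the indicators.
    sig = hashlib.blake2b(b"".join(indicators), digest_size=8).digest()
    if sig not in location:
        pointer = len(cross_reference)  # first cross-reference row for sig
        cross_reference.extend((sig, ind) for ind in indicators)
        location[sig] = len(records)    # position exposed to the search side
        records.append({"signature": sig, "key_number": key_number,
                        "xref": pointer, "count": 1})
    else:
        records[location[sig]]["count"] += 1
    return sig


coder = build_item_coder(["bread", "eggs", "milk"])
records, location, cross_reference = [], {}, []
sig_a = index_group(["milk", "bread", "milk"], coder, records, location, cross_reference)
sig_b = index_group(["bread", "milk"], coder, records, location, cross_reference)
sig_c = index_group(["eggs"], coder, records, location, cross_reference)
```

Because the key fields are sorted and de-duplicated before coding, the first two groups yield the same signature, and the second occurrence only increments the count in the existing partial index record.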
16. A computer readable memory for organizational indexing, storage, and retrieval of computerized representations of events in the form of data, by creating signatures based upon the occurrence of patterns within the data, said memory comprising:
a) a computer controlled program means for reading a data entry record group representing an event from a data stream, wherein said data entry record group comprises one or more data entry records wherein each of said data entry records has a key field containing an item;
b) a computer controlled program means for tallying a key number representing a total number of said key fields in said data entry record group;
c) a computer controlled program means for sorting said key fields of said data entry record group according to a predetermined sequence, deleting any duplicate key fields, and reducing said key number by the number of said duplicates deleted;
d) a computer controlled program means for recording a marker to allow locating and determining the size of said data entry record group within said data stream;
e) a computer controlled program means for creating an item indicator, wherein said item indicator represents a fixed length coded equivalence of said item by applying a minimal perfect hash function to said item;
f) a computer controlled program means for repeating step d for each of said items of said data entry record group;
g) a computer controlled program means for creating a signature, wherein said signature represents a fixed length coded equivalence of said item indicators of said data entry record group by applying a signature function to said item indicators;
h) a computer controlled program means for determining whether said signature has been previously created, and if said signature has not been previously created, then creating a partial index record comprising said signature, said key number, a pointer to a combination cross-reference file, and a count representing the total number of data entry record groups with said signature, if said signature has been previously created then updating said count;
i) a computer controlled program means for updating said combination cross-reference file, wherein said combination cross-reference file comprises said signature of said data entry record group and an item indicator of said data entry record group;
j) a computer controlled program means for repeating step h for each of said item indicators of said data entry record group;
k) a computer controlled program means for updating a raw data index, wherein said raw data index comprises said signature of said data entry record group, and said marker;
l) a computer controlled program means for making said signature and a location of said signature in said partial index record available for use by a fast search method; and
m) a computer controlled program means for repeating steps a through l for a plurality of data entry record groups.
Specification