Fast ingest, archive and retrieval systems, method and computer programs

US 8,788,464 B1
Filed: 07/25/2005
Issued: 07/22/2014
Est. Priority Date: 07/25/2005
Status: Active Grant

7 Assignments

0 Petitions

Abstract

Systems, processing methods and computer programs that rapidly ingest, archive and dynamically query the data to retrieve it from short and long term storage devices are disclosed. Data is partitioned on queryable fields and metadata relating to the partitioned data is stored in a database. This allows for data to be stored in a persistent queryable state, providing query transparency irrespective of the location that the data is actually stored. Software code with differing functionality that shares consistent data structures and methods is used in components of the system to provide flexibility and speed.
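
As a rough illustration of the idea in the abstract, the Python sketch below partitions records on a queryable field, records each partition in a metadata catalog, and answers queries through that catalog so the caller never needs to know which storage tier holds the data. Everything here is an assumption made for illustration and is not drawn from the patent: the names `ingest`, `archive`, and `query`, the `sensor` field, the file-naming scheme, and the use of in-memory dictionaries in place of storage devices and a database.

```python
import itertools
from collections import defaultdict

# Hypothetical in-memory stand-ins for the two storage tiers and the catalog.
short_term = {}   # file_id -> list of records
long_term = {}    # file_id -> list of records
catalog = {}      # file_id -> {"field", "value", "tier"}
_batch_numbers = itertools.count()


def ingest(records, field):
    """Partition records on one queryable field and catalog each partition."""
    batch = next(_batch_numbers)
    partitions = defaultdict(list)
    for record in records:
        partitions[record[field]].append(record)
    file_ids = []
    for value, recs in partitions.items():
        file_id = f"{field}={value}.batch{batch}"  # illustrative naming only
        short_term[file_id] = recs
        catalog[file_id] = {"field": field, "value": value, "tier": "short"}
        file_ids.append(file_id)
    return file_ids


def archive(file_id):
    """Move one partition to long term storage and update the catalog."""
    long_term[file_id] = short_term.pop(file_id)
    catalog[file_id]["tier"] = "long"


def query(field, value):
    """Resolve partitions through the catalog, whichever tier holds them."""
    matches = []
    for file_id, meta in catalog.items():
        if meta["field"] == field and meta["value"] == value:
            store = short_term if meta["tier"] == "short" else long_term
            matches.extend(store[file_id])
    return matches


file_ids = ingest(
    [
        {"sensor": "A", "reading": 1},
        {"sensor": "B", "reading": 2},
        {"sensor": "A", "reading": 3},
    ],
    "sensor",
)
before = query("sensor", "A")   # served from short term storage
for file_id in file_ids:
    archive(file_id)
after = query("sensor", "A")    # same call, now served from long term storage
```

In this sketch the query result is identical before and after archiving, because only the catalog entry changes when a partition moves.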

33 Citations

20 Claims

1. An apparatus comprising:
- a data generator for generating data to be archived;
  
  a pre-archive filter for partitioning the data using predetermined queryable fields and for creating metadata relating to the data;
  
  a short term storage device for storing the partitioned data in partitioned data files only until archiving of the data from the short term storage device to a long term storage is completed;
  
  a database for storing the metadata in a metadata catalog, wherein the partitioned data files in conjunction with the metadata stored in the metadata catalog provides for persistent queryable files;
  
  a query client for entering a query regarding the partitioned data files;
  
  a query server for processing the query for querying against data stored in the short term storage device and the long term storage in the same way such that the data is always in a queryable state through the rest of the data's lifecycle from the partitioning, the long term storage configured to store data archived from the short term storage device;
  
  an archiver for processing the partitioned data files to store them in the long term storage and for updating the metadata catalog to reference the archived data files; and
  
  an inference engine for correlating a first data and a second data from the data to generate a first relationship pair, correlating the second data and a third data from the data to generate a second relationship pair, and correlating the first relationship pair and the second relationship pair to generate a third relationship pair correlating the first data and the third data, the inference engine operating in parallel with the pre-archive filter,
  
  wherein in the correlating, the inference engine performs n degrees of correlation to correlate multiple dimensions of one-to-one or one-to-many inference metadata to produce inference relationships,
  
  wherein the data is queryable by the query server during the correlating, and
  
  wherein queryable includes user based searching using keywords.
- Dependent claims (2, 3, 4, 5, 19, 20):
  - 2. The apparatus recited in claim 1, the inference engine correlating the third relationship pair and a fourth relationship pair correlating the third data and a fourth data from the data to generate a fifth relationship pair correlating the first data and the fourth data.
  - 3. The apparatus recited in claim 2 wherein the inference metadata are instantiated as data files having a logical file and record identifiers that are the same as the partitioned data files.
  - 4. The apparatus recited in claim 1 wherein data derived from an interface control document that defines formats for data types and code foundation data are processed to automatically generate an extensible markup language (XML) file and generate source code having differing functionality that shares consistent data structures and methods using the XML file to instantiate the data generator, archiver, query server, and query client.
  - 5. The apparatus recited in claim 4 wherein the XML file comprises an XML representation of simple high rate data structures and complex hierarchical data structures, and wherein the code foundation data comprise template files containing segments of source code mixed with tokens.
  - 19. The apparatus recited in claim 2 wherein the inference metadata are instantiated as data files having:
    - the same logical file identifier as the partitioned data files, but with a different file name based upon a data type, and
    - the same record identifiers as the partitioned data files.
  - 20. The apparatus recited in claim 1 wherein the first relationship pair, the second relationship pair, and the third relationship pair are metadata relationships.
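
The chaining of relationship pairs recited in claims 1 and 2 can be pictured with a small Python sketch. It assumes, purely for illustration, that a relationship pair is a tuple of two identifiers and that one "degree" of correlation is one pass that joins a pair `(a, b)` with a pair `(b, c)` to infer `(a, c)`. The function name `correlate` and the `degrees` parameter are hypothetical, and this is only one possible reading of "n degrees of correlation".

```python
def correlate(pairs, degrees):
    """Infer new pairs by chaining pairs that share a middle element.

    Each pass joins (a, b) with (b, c) to infer (a, c). `degrees` caps the
    number of passes. Returns only the inferred pairs, not the inputs.
    """
    original = set(pairs)
    known = set(original)
    for _ in range(degrees):
        inferred = {
            (a, d)
            for (a, b) in known
            for (c, d) in known
            if b == c and a != d
        }
        if inferred <= known:
            break  # nothing new can be derived
        known |= inferred
    return known - original


first = ("data1", "data2")    # first relationship pair
second = ("data2", "data3")   # second relationship pair
fourth = ("data3", "data4")   # fourth relationship pair (claim 2)

one_degree = correlate([first, second], degrees=1)
two_degrees = correlate([first, second, fourth], degrees=2)
```

Here `one_degree` contains the pair linking `data1` to `data3`, and the second pass in `two_degrees` combines that inferred pair with the fourth pair to link `data1` to `data4`.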

6. A data processing method comprising:
- ingesting data for archiving;
  
  partitioning by a pre-archive filter the ingested data using predetermined queryable fields;
  
  storing the partitioned data in a short term storage device only until archiving of the data from the short term storage device to a long term storage is completed;
  
  processing the ingested data to create metadata relating to the ingested data;
  
  storing the metadata in a metadata catalog of a database;
  
  processing a query for querying against data stored in the short term storage device and the long term storage in the same way such that the data is always in a queryable state through the rest of the data's lifecycle from the partitioning;
  
  processing the partitioned data to store it in the long term storage;
  
  updating the metadata catalog to reference the archived data; and
  
  correlating by an inference engine a first data and a second data from the data to generate a first relationship pair, correlating the second data and a third data from the data to generate a second relationship pair, and correlating the first relationship pair and the second relationship pair to generate a third relationship pair correlating the first data and the third data, the inference engine operating in parallel with the pre-archive filter,
  
  wherein in the correlating, n degrees of correlation are used to correlate multiple dimensions of one-to-one or one-to-many inference metadata to produce inference relationships,
  
  wherein the data is queryable during the correlating, and
  
  wherein queryable includes user based searching using keywords.
- Dependent claims (7, 8, 9, 10, 11, 12, 13):
  - 7. The method recited in claim 6, wherein correlating by the inference engine includes correlating the third relationship pair and a fourth relationship pair correlating the third data and a fourth data from the data to generate a fifth relationship pair correlating the first data and the fourth data.
  - 8. The method recited in claim 6 wherein data derived from an interface control document that defines formats for data types and code foundation data are processed to automatically generate an extensible markup language (XML) file and generate source code having differing functionality that shares consistent data structures and methods using the XML file.
  - 9. The method recited in claim 8 wherein the XML file comprises an XML representation of simple high rate data structures and complex hierarchical data structures, and wherein the code foundation data comprise template files containing segments of source code mixed with tokens.
  - 10. The method recited in claim 9 wherein, for each structure defined in the XML file, a template file is assembled and the tokens are replaced with text, numbers, or complete algorithms to generate the source code that supports all data structures described in the XML file.
  - 11. The method recited in claim 6 wherein the inference relationships are transformed to a persistent queryable state and are instantiated as a data file with a same logical file identifier and different file name based on their data type and record identifiers as an original file, along with additional inference relationship attributes.
  - 12. The method recited in claim 6 wherein once the inference relationships are transformed to a persistent queryable state, they can be joined with data from a matching file using a file identifier and record identifiers to connect the inference relationship with a data record.
  - 13. The method recited in claim 6 wherein the inference engine preprocesses relationship-pairs, sorts the relationship-pairs, merges the relationship-pairs with uncorrelated relationship-pairs from a previous correlation, removes expired relationship-pairs, and correlates relationship-pairs to produce a correlated relationship-pair.
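
The sequence of steps in claim 13 (preprocess, sort, merge with leftover pairs, remove expired pairs, correlate) might be sketched in Python as follows. The claim does not say what preprocessing or expiry mean in practice, so this sketch makes assumptions and labels them: preprocessing drops self-referencing pairs, expiry is a simple age threshold on a `timestamp` field, and correlation chains pairs that share an endpoint. The names `Pair` and `correlation_cycle` are hypothetical.

```python
from dataclasses import dataclass


@dataclass(frozen=True, order=True)
class Pair:
    left: str
    right: str
    timestamp: int  # assumed: the time at which the pair was observed


def correlation_cycle(new_pairs, carried_over, now, max_age):
    """Run one cycle; return (correlated pairs, still-uncorrelated pairs)."""
    # 1. Preprocess (assumed here to mean dropping self-referencing pairs).
    cleaned = [p for p in new_pairs if p.left != p.right]
    # 2. Sort the incoming pairs.
    cleaned.sort()
    # 3. Merge with uncorrelated pairs left over from the previous cycle.
    merged = sorted(set(cleaned) | set(carried_over))
    # 4. Remove expired pairs (assumed: older than max_age).
    live = [p for p in merged if now - p.timestamp <= max_age]
    # 5. Correlate: chain (a, b) with (b, c) into (a, c).
    correlated, used = [], set()
    for p in live:
        for q in live:
            if p.right == q.left and p.left != q.right:
                stamp = max(p.timestamp, q.timestamp)
                correlated.append(Pair(p.left, q.right, stamp))
                used.update((p, q))
    uncorrelated = [p for p in live if p not in used]
    return correlated, uncorrelated


# Cycle 1: a single pair has nothing to chain with, so it is carried over.
done1, leftover1 = correlation_cycle([Pair("a", "b", 10)], [], now=10, max_age=5)

# Cycle 2: a new pair chains with the carried-over one; a stale pair expires.
done2, leftover2 = correlation_cycle(
    [Pair("b", "c", 12), Pair("x", "y", 1)], leftover1, now=12, max_age=5
)
```

Carrying the uncorrelated pairs forward is what lets a pair observed in one cycle be matched against a pair that only arrives in a later cycle.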

14. A non-transitory computer-readable medium comprising:
- a first code segment for ingesting data for archiving;
  
  a second code segment for partitioning the ingested data using predetermined queryable fields;
  
  a third code segment for storing the partitioned data in a short term storage device only until archiving of the data from the short term storage device to a long term storage is completed;
  
  a fourth code segment for processing the ingested data to create metadata relating to the ingested data;
  
  a fifth code segment for processing a query for querying against data stored in the short term storage device and the long term storage in the same way such that the data is always in a queryable state through the rest of the data's lifecycle from the partitioning;
  
  a sixth code segment for storing data archived from the short term storage device to the long term storage;
  
  a seventh code segment for updating a metadata catalog to reference the archived data; and
  
  an eighth code segment for correlating a first data and a second data from the data to generate a first relationship pair, correlating the second data and a third data from the data to generate a second relationship pair, and correlating the first relationship pair and the second relationship pair to generate a third relationship pair correlating the first data and the third data, the eighth code segment operating in parallel with the second code segment,
  
  wherein in the correlating, n degrees of correlation are used to correlate multiple dimensions of one-to-one or one-to-many inference metadata to produce inference relationships,
  
  wherein the data is queryable during the correlating, and
  
  wherein queryable includes user based searching using keywords.
- Dependent claims (15):
  - 15. The non-transitory computer-readable medium according to claim 14 further comprising:
    - a ninth code segment for storing the inference metadata in the metadata catalog; and
      
      a tenth code segment for storing the inference metadata in an archive,
      
      wherein the eighth code segment for correlating further includes a code segment for correlating the third relationship pair and a fourth relationship pair correlating the third data and a fourth data from the data to generate a fifth relationship pair correlating the first data and the fourth data.

16. A non-transitory computer-readable medium for generating code, comprising:
- a first code segment for processing data derived from an interface control document that defines formats for data types and code foundation data to automatically generate an extensible markup language (XML) file and generate source code having differing functionality that shares consistent data structures and methods using the XML file;
  
  a second code segment for partitioning the processed data using predetermined queryable fields;
  
  a third code segment for storing partitioned data in a short term storage device only until archiving of the data from the short term storage device to a long term storage is completed;
  
  a fourth code segment for processing a query for querying against data stored in the short term storage device and the long term storage in the same way such that the data is always in a queryable state through the rest of the data's lifecycle from the partitioning;
  
  a fifth code segment for storing data archived from the short term storage device to the long term storage; and
  
  a sixth code segment for correlating a first data and a second data from the data to generate a first relationship pair, correlating the second data and a third data from the data to generate a second relationship pair, and correlating the first relationship pair and the second relationship pair to generate a third relationship pair correlating the first data and the third data, the sixth code segment operating in parallel with the second code segment,
  
  wherein in the correlating, n degrees of correlation are used to correlate multiple dimensions of one-to-one or one-to-many inference metadata to produce inference relationships,
  
  wherein the data is queryable during the correlating, and
  
  wherein queryable includes user based searching using keywords.
- Dependent claims (17, 18):
  - 17. The non-transitory computer-readable medium according to claim 16 wherein the XML file comprises an XML representation of simple high rate data structures and complex hierarchical data structures, and wherein the code foundation data comprise template files containing segments of source code mixed with tokens.
  - 18. The non-transitory computer-readable medium according to claim 17 wherein, for each structure defined in the XML file, a template file is assembled and the tokens are replaced with text, numbers, or complete algorithms to generate the source code that supports all data structures described in the XML file.
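
The template-and-token code generation described in claims 16 through 18 can be illustrated with a short Python sketch. It assumes a made-up XML layout (`<structure>` elements containing `<field>` elements) and uses the standard library's `string.Template` as the token mechanism. The patent does not specify the XML schema, the token syntax, or the target language, so `XML_SPEC`, `CLASS_TEMPLATE`, and `generate_source` are all hypothetical.

```python
import xml.etree.ElementTree as ET
from string import Template

# Hypothetical XML description of one data structure.
XML_SPEC = """
<structures>
  <structure name="Reading">
    <field name="sensor" type="str"/>
    <field name="value" type="float"/>
  </structure>
</structures>
"""

# Hypothetical template file: source code segments mixed with $tokens.
CLASS_TEMPLATE = Template(
    "class $name:\n"
    "    def __init__(self, $args):\n"
    "$assignments\n"
)


def generate_source(xml_text):
    """Fill the template once per structure defined in the XML text."""
    root = ET.fromstring(xml_text.strip())
    chunks = []
    for struct in root.findall("structure"):
        fields = [(f.get("name"), f.get("type")) for f in struct.findall("field")]
        args = ", ".join(f"{name}: {type_}" for name, type_ in fields)
        assignments = "\n".join(
            f"        self.{name} = {name}" for name, _ in fields
        )
        chunks.append(
            CLASS_TEMPLATE.substitute(
                name=struct.get("name"), args=args, assignments=assignments
            )
        )
    return "\n".join(chunks)


source = generate_source(XML_SPEC)

# Load the generated code and use the class it defines.
namespace = {}
exec(source, namespace)
reading = namespace["Reading"]("A", 1.5)
```

Because every generated component would be filled from the same XML description, the data structures stay consistent across components even when the templates implement different functionality.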

Current Assignee
Leidos Innovations Technology, Inc.
Original Assignee
Lockheed Martin Corporation (Martin Marietta Corporation)
Inventors
Lola, Geminiano A., Allocco, Christopher M., Chu, Becky S., Oberlander, Nerissa K., Rudisill, Robert S., Tran, Benjamin R.
Primary Examiner(s)
Bhatia, Ajay
Assistant Examiner(s)
Huang, Miranda M

Application Number

US11/188,942
Time in Patent Office

3,284 Days
Field of Search

707/667
US Class Current

707/667
CPC Class Codes

G06F 16/278 Data partitioning, e.g. hor...
