Systems and/or methods for distributed data archiving amongst a plurality of networked computing devices

US 9,002,801 B2
Filed: 03/29/2010
Issued: 04/07/2015
Est. Priority Date: 03/29/2010
Status: Active Grant

Abstract

Certain example embodiments of this invention relate to systems and/or methods that pair a data extractor with a data accumulator, wherein these components may be located on any one or more computers in a network system. This distributed peer extract-accumulate approach is advantageous in that it reduces (and sometimes completely eliminates) the need for a “funnel” approach to data archiving, wherein all data is moved or backed up through a central computer or central computer system. In certain example embodiments, recall-accumulate, search, verify, and/or other archive-related activities may be performed in a similar peer-based and/or distributed manner. Certain example embodiments may in addition or in the alternative incorporate techniques for verifying the integrity of data in an archive system, and/or techniques for restoring/importing data from a non-consumable form.
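
The peer extract-accumulate pairing described in the abstract can be illustrated with a minimal, in-process sketch. It is not the patented implementation: the class names (`Extractor`, `Accumulator`), the `pair` and `run_pairs` helpers, the host labels, and the use of Python lists as data stores are all assumptions made for illustration. The different-host check reflects the independent claims, which place each paired accumulate operation on a second computer different from the first.

```python
from dataclasses import dataclass, field


@dataclass
class Accumulator:
    """Writes records to a target data store (here, an in-memory list)."""
    host: str
    target_store: list = field(default_factory=list)

    def accumulate(self, records):
        self.target_store.extend(records)


@dataclass
class Extractor:
    """Reads records from a source data store (here, an in-memory list)."""
    host: str
    source_store: list

    def extract(self):
        return list(self.source_store)


def pair(extractor, accumulator):
    """Pair one extractor with one accumulator on a different host (hypothetical helper)."""
    if extractor.host == accumulator.host:
        raise ValueError("extract and accumulate must run on different computers")
    return (extractor, accumulator)


def run_pairs(pairs):
    """Move data directly within each pair; no central hub sits between the peers."""
    for extractor, accumulator in pairs:
        accumulator.accumulate(extractor.extract())


# Host names and records below are made up for the example.
pairs = [
    pair(Extractor("host-a", ["a1", "a2"]), Accumulator("host-c")),
    pair(Extractor("host-b", ["b1"]), Accumulator("host-d")),
]
run_pairs(pairs)
```

Each pair transfers its data independently, so adding a pair adds capacity without routing more traffic through a shared "funnel" machine.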

37 Claims

1. An archival system, comprising:
- a plurality of computers connected to a network;
  
  at least one source data store and at least one target data store connected to the network; and
  
  at least one archive service configured to coordinate a plurality of extract operations with a plurality of accumulate operations, each said extract operation being executed on one said computer in the plurality of computers to read data from one said source data store and each said accumulate operation being executed on one said computer in the plurality of computers to write data to one said target data store,
  
  wherein each said extract operation is configured to run on a first computer in the plurality of computers and is paired with one said accumulate operation that is configured to run on a second separately located computer, different from the first computer, in the plurality of computers,
  
  wherein the at least one archive service is further configured to implement extract rules identifying data to be read from the at least one source data store and identifying whether a schema for the at least one source data store is to be attached to the data to be read, and
  
  wherein the at least one archive service is further configured to coordinate at least one validation operation to repeatedly verify integrity of data stored in the at least one target data store, and when the validation operation(s) determine(s) that data stored in the at least one target data store lost integrity, to perform an operation on the data with lost integrity based on rules associated with the data.
- Dependent claims (2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15, 16, 17, 28, 29, 30, 31, 36, 37):
  - 2. The archival system of claim 1, wherein the at least one archive service is further configured to access rules (a) mapping the extract and accumulate operations to respective computers on which the operations are to be executed, and (b) storing pairings between peer extract and accumulate operations.
  - 3. The archival system of claim 2, wherein the rules further identify at least one said source data store for each said extract operation and at least one said target data store for each said accumulate operation.
  - 4. The archival system of claim 3, wherein the rules further include accumulate rules indicating how duplicate entries are to be handled and/or how long data is to be retained in each said target data store.
  - 5. The archival system of claim 3, further comprising a configuration server in communication with the at least one archive service, the configuration server including the rules.
  - 6. The archival system of claim 5, further comprising an administrative user interface in communication with the configuration server and/or at least one archive service for setting the rules and/or monitoring progress of the operations.
  - 7. The archival system of claim 3, wherein one said target data store is a vault that is accessible only in part by each said computer.
  - 8. The archival system of claim 3, wherein the system is arranged such that a central hub or funnel is not used when data is written to the at least one target data store.
  - 9. The archival system of claim 3, wherein the at least one archive service is further configured to coordinate at least one recall operation to retrieve data from the at least one target data store for access on a computer of said plurality of computers.
  - 10. The archival system of claim 9, wherein the at least one recall operation is further configured to place the retrieved data from the at least one target data store into at least one said source data store.
  - 11. The archival system of claim 1, wherein the at least one validation operation is configured to run continuously for a predetermined amount of time.
  - 12. The archival system of claim 1, wherein the at least one validation operation is configured to run periodically such that the at least one validation operation begins execution at predetermined times or time intervals.
  - 13. The archival system of claim 1, wherein the at least one validation operation is configured to determine whether data exists, matches a checksum, and/or is recallable, based on predetermined rules accessible by the at least one validation operation.
  - 14. The archival system of claim 13, wherein the at least one validation operation is further configured to raise an alarm if the at least one validation operation encounters a fault during execution.
  - 15. The archival system of claim 3, wherein the at least one archive service is further configured to coordinate at least one importing operation to incorporate data from an otherwise non-consumable backup location into the at least one source data store and/or the at least one target data store.
  - 16. The archival system of claim 15, wherein the at least one importing operation is configured to determine (a) what data exists in the data from the otherwise non-consumable backup location was backed up, and (b) how the data from the otherwise non-consumable backup location was backed up.
  - 17. The archival system of claim 16, wherein the at least one importing operation implements rules for reacquiring the data from the otherwise non-consumable backup location, the rules for reacquiring the data being one or more user programmed rules, one or more predefined algorithms, and/or automatically generated by the at least one importing operation.
  - 28. The method of claim 1, wherein the at least one validation operation is configurable to run continuously for a predetermined amount of time.
  - 29. The method of claim 1, wherein the at least one validation operation is configurable to run periodically such that the at least one validation operation begins execution at predetermined times or time intervals.
  - 30. The method of claim 1, wherein the at least one validation operation is configurable to determine whether data exists, matches a checksum, and/or is recallable, based on predetermined rules accessible by the at least one validation operation.
  - 31. The method of claim 30, wherein the at least one validation operation is further configurable to raise an alarm if the at least one validation operation encounters a fault during execution.
  - 36. The archival system of claim 1, wherein the first computer and the second computer are components of a distributed computing system comprising the plurality of computers.
  - 37. The archival system of claim 1, wherein the at least one archive service is configured so that data having a higher archive priority will be distributed to higher performance devices where data of a lower archive priority will be distributed to lower performance devices.
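
The validation behavior recited in claims 1 and 13 (checking that data exists and matches a checksum, then acting on failures according to rules associated with the data) might look roughly like the following sketch. Everything named here is an assumption for illustration: `ArchivedItem`, `validate_item`, `validation_pass`, the rule names `"alarm"` and `"quarantine"`, the choice of SHA-256, and the dictionary standing in for a target data store. The claims also call for repeated verification; a scheduler invoking `validation_pass` at intervals is left out.

```python
import hashlib
from dataclasses import dataclass


@dataclass
class ArchivedItem:
    key: str
    expected_sha256: str
    on_integrity_loss: str  # hypothetical rule name, e.g. "alarm" or "quarantine"


def validate_item(item, store):
    """Return None if the item is intact, otherwise a short reason string."""
    data = store.get(item.key)
    if data is None:
        return "missing"
    if hashlib.sha256(data).hexdigest() != item.expected_sha256:
        return "checksum mismatch"
    return None


def validation_pass(items, store, actions):
    """Run one pass over all items and apply each item's own rule on failure."""
    failures = []
    for item in items:
        reason = validate_item(item, store)
        if reason is not None:
            actions[item.on_integrity_loss](item, reason)
            failures.append((item.key, reason))
    return failures


# Example data: doc1 is intact, doc2 was altered, doc3 is absent from the store.
store = {"doc1": b"hello", "doc2": b"tampered"}
items = [
    ArchivedItem("doc1", hashlib.sha256(b"hello").hexdigest(), "alarm"),
    ArchivedItem("doc2", hashlib.sha256(b"original").hexdigest(), "quarantine"),
    ArchivedItem("doc3", hashlib.sha256(b"anything").hexdigest(), "alarm"),
]

alarms, quarantined = [], []
actions = {
    "alarm": lambda item, reason: alarms.append((item.key, reason)),
    "quarantine": lambda item, reason: quarantined.append(item.key),
}
failures = validation_pass(items, store, actions)
```

Keeping the response rule on the item rather than in the validator mirrors the claim language "based on rules associated with the data": two items failing the same check can be handled differently.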

18. A method implemented using a plurality of computers connected to a network for storing data in an archival system, the method comprising:
- enabling at least one source data store and at least one target data store connectivity to the network;
  
  enabling at least one archive service on one or more of said computers, the at least one archive service being configured to interface a plurality of extract operations with a plurality of accumulate operations, each said extract operation being configured to run on a first computer in the plurality of computers, and to coordinate at least one validation operation to repeatedly verify integrity of data stored in the at least one target data store;
  
  pairing each said extract operation with one said accumulate operation, each said accumulate operation being configured to run on a second separately located computer, different from the first computer, in the plurality of computers;
  
  implementing defined extract rules (a) identifying data to be read from the at least one source data store and (b) identifying whether a schema for the at least one source data store is to be attached to the data to be read,
  
  wherein each said extract operation is executable on one said computer in the plurality of computers to read data from one said source data store and each said accumulate operation is executable on one said computer in the plurality of computers to write data to one said target data store; and
  
  when the validation operation(s) determine(s) that data stored in the at least one target data store lost integrity, performing an operation on the data with lost integrity based on rules associated with the data.
- Dependent claims (19, 20, 21, 22, 23, 24, 25, 26, 27, 32, 33, 34):
  - 19. The method of claim 18, further comprising:
    - mapping the extract and accumulate operations to respective computers on which the operations are to be executed; and
      
      storing pairings between peer extract and accumulate operations.
  - 20. The method of claim 19, further comprising identifying at least one said source data store for each said extract operation and at least one said target data store for each said accumulate operation.
  - 21. The method of claim 20, further comprising defining accumulate rules indicating (a) how duplicate entries are to be handled and/or (b) how long data is to be retained in each said target data store.
  - 22. The method of claim 20, further comprising providing a configuration server in communication with the at least one archive service, the configuration server having access to rules describing the mapping, the pairing, the at least one said source data store identified for each said extract operation, and the at least one said target data store identified for each said accumulate operation.
  - 23. The method of claim 22, further comprising providing an administrative user interface in communication with the configuration server and/or at least one archive service for setting the rules and/or monitoring progress of the operations.
  - 24. The method of claim 20, wherein one said target data store is a vault that is accessible only in part by each said computer.
  - 25. The method of claim 20, wherein the operations are performed without resort to a central hub or funnel.
  - 26. The method of claim 20, wherein the at least one archive service is further configured to coordinate at least one recall operation to retrieve data from the at least one target data store for access on a computer of said plurality of computers.
  - 27. The method of claim 26, wherein the at least one recall operation is further configured to place the retrieved data from the at least one target data store into at least one said source data store.
  - 32. The method of claim 20, wherein the at least one archive service is further configured to coordinate at least one importing operation to incorporate data from an otherwise non-consumable backup location into the at least one source data store and/or the at least one target data store.
  - 33. The method of claim 32, wherein the at least one importing operation is configurable to determine (a) what data exists in the data from the otherwise non-consumable backup location was backed up, and (b) how the data from the otherwise non-consumable backup location was backed up.
  - 34. The method of claim 33, wherein the at least one importing operation implements rules for reacquiring the data from the otherwise non-consumable backup location, the rules for reacquiring the data being one or more user programmed rules, one or more predefined algorithms, and/or automatically generated by the at least one importing operation.
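
The extract rules of claim 18 (which data to read, and whether to attach the source schema) and the accumulate rules of claim 21 (duplicate handling and retention) could be modeled along these lines. This is a sketch under stated assumptions, not the claimed method itself: `ExtractRule`, `AccumulateRule`, the `"skip"` and `"overwrite"` policies, the `id` key used to detect duplicates, and the dictionary-based stores are all hypothetical.

```python
from dataclasses import dataclass
from datetime import datetime, timedelta
from typing import Callable


@dataclass
class ExtractRule:
    select: Callable[[dict], bool]  # identifies which rows are to be read
    attach_schema: bool             # whether the source schema travels with the data


@dataclass
class AccumulateRule:
    on_duplicate: str               # hypothetical policies: "skip" or "overwrite"
    retention: timedelta            # how long data is kept in the target store


def extract(rows, schema, rule):
    """Apply the extract rule to a source store given as a list of row dicts."""
    payload = {"rows": [r for r in rows if rule.select(r)]}
    if rule.attach_schema:
        payload["schema"] = schema
    return payload


def accumulate(target, payload, rule, now):
    """Write extracted rows to the target store, honoring the duplicate policy."""
    for row in payload["rows"]:
        key = row["id"]
        if key in target and rule.on_duplicate == "skip":
            continue
        target[key] = {"row": row, "expires": now + rule.retention}


def purge_expired(target, now):
    """Drop entries whose retention period has elapsed."""
    for key in [k for k, v in target.items() if v["expires"] <= now]:
        del target[key]


# Example data, made up for the sketch.
now = datetime(2010, 3, 29)
schema = {"id": "int", "year": "int"}
rows = [{"id": 1, "year": 2008}, {"id": 2, "year": 2010}, {"id": 3, "year": 2007}]
target = {1: {"row": {"id": 1, "year": 2008, "note": "older copy"},
              "expires": now + timedelta(days=30)}}

payload = extract(rows, schema, ExtractRule(lambda r: r["year"] < 2010, True))
accumulate(target, payload, AccumulateRule("skip", timedelta(days=365)), now)
```

Attaching the schema to the payload lets the accumulating computer interpret the rows without querying the source store again, which matters when the two peers are separately located.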

35. A non-transitory computer readable storage medium storing instructions that, when executed, cause a computer including at least one processor to perform features comprising:
- enabling at least one source data store and at least one target data store connected to a network;
  
  enabling at least one archive service on one or more computers connected to the network, the at least one archive service being configured to interface a plurality of extract operations with a plurality of accumulate operations, each said extract operation being configured to run on a first computer in the plurality of computers, and to coordinate at least one validation operation to repeatedly verify integrity of data stored in the at least one target data store;
  
  pairing each said extract operation with one said accumulate operation, each said accumulate operation being configured to run on a second separately located computer, different from the first computer, in the plurality of computers; and
  
  implementing defined extract rules (a) identifying data to be read from the at least one source data store and (b) identifying whether a schema for the at least one source data store is to be attached to the data to be read,
  
  wherein each said extract operation is executable on one said computer in the plurality of computers to read data from one said source data store and each said accumulate operation is executable on one said computer in the plurality of computers to write data to one said target data store; and
  
  when the validation operation(s) determine(s) that data stored in the at least one target data store lost integrity, performing an operation on the data with lost integrity based on rules associated with the data.

Current Assignee: Software AG
Original Assignee: Software AG
Inventors: Meehan, Michael C.
Primary Examiner(s): SYED, FARHAN M
Application Number: US12/662,026
Publication Number: US 20110238935A1
Time in Patent Office: 1,835 Days
US Class Current: 707/665
CPC Class Codes:
- G06F 11/1469   Backup restoration techniques
- G06F 16/20   of structured data, e.g. re...
- G06F 16/27   Replication, distribution o...
