Techniques for global single instance segment-based indexing for backup data

  • US 8,214,376 B1
  • Filed: 12/31/2007
  • Issued: 07/03/2012
  • Est. Priority Date: 12/31/2007
  • Status: Active Grant
First Claim

1. A method for performing global single instance segment-based indexing for backup data comprising:

  • parsing a non-executable data item being backed up to detect one or more syntactic boundaries within the non-executable data item being backed up between adjacent sections;

    dividing, using at least one computer processor, the non-executable data item being backed up into segments using the one or more detected syntactic boundaries, wherein dividing the non-executable data item being backed up into segments comprises padding at least one of the segments to make the at least one of the segments syntactically correct by completing the at least one of the segments according to a type of file to be indexed, and wherein the at least one of the segments comprises at least one of: an XML node, a sentence, a paragraph, and a page, wherein segmentation is performed for a plurality of different types of syntactical boundaries including paragraphs and at least one of: an XML node, a sentence, and a page and wherein padding is based at least in part on a format, received from an index engine, for a type of file to be indexed;

    generating a fingerprint for each segment; and

    saving an entry for each segment in an index database, wherein each entry comprises a resource list and the fingerprint for the segment, the resource list comprising a resource name and a reference count, wherein the reference count is configured to allow counting of a plurality of references to the resource name.
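
The steps recited in the claim (segmenting at syntactic boundaries, padding, fingerprinting, and saving reference-counted index entries) can be illustrated with a minimal Python sketch. This is not the patented implementation. Every name in it (`Entry`, `split_segments`, `pad`, `fingerprint`, `add_item`) is hypothetical. The sketch assumes that paragraphs are separated by blank lines, that sentences end in `.`, `!`, or `?`, that the format received from an index engine is a simple prefix/suffix pair, and that the fingerprint is a SHA-256 digest. The claim itself specifies none of these details.

```python
import hashlib
import re
from dataclasses import dataclass, field


@dataclass
class Entry:
    """One index entry: a fingerprint plus a resource list (hypothetical layout)."""
    fingerprint: str
    # Resource list: maps each resource name to its reference count.
    resources: dict = field(default_factory=dict)


def split_segments(text):
    """Split text at paragraph boundaries, then at sentence boundaries.

    Assumption: blank lines separate paragraphs; '.', '!' or '?' ends a sentence.
    """
    segments = []
    for paragraph in re.split(r"\n\s*\n", text.strip()):
        paragraph = " ".join(paragraph.split())  # normalize whitespace
        if not paragraph:
            continue
        segments.append(paragraph)
        sentences = re.split(r"(?<=[.!?])\s+", paragraph)
        if len(sentences) > 1:
            # Index sentences too, so two boundary types are covered.
            segments.extend(sentences)
    return segments


def pad(segment, fmt):
    """Complete a segment using a format assumed to come from an index engine."""
    return fmt["prefix"] + segment + fmt["suffix"]


def fingerprint(segment):
    """Assumption: SHA-256 of the UTF-8 bytes serves as the fingerprint."""
    return hashlib.sha256(segment.encode("utf-8")).hexdigest()


def add_item(index, resource_name, text, fmt):
    """Segment, pad, and fingerprint one data item, then update the index."""
    for segment in split_segments(text):
        fp = fingerprint(pad(segment, fmt))
        entry = index.setdefault(fp, Entry(fp))
        # Identical segments share one entry; only the reference count grows.
        entry.resources[resource_name] = entry.resources.get(resource_name, 0) + 1
```

Calling `add_item` for two backed-up items that share a paragraph stores that paragraph's fingerprint once, with both resource names in its resource list. That shared entry is the single-instance behavior the claim describes.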
