System and method for chunk-based indexing of file system content

  • US 7,487,138 B2
  • Filed: 08/25/2004
  • Issued: 02/03/2009
  • Est. Priority Date: 08/25/2004
  • Status: Active Grant
First Claim

1. A system, comprising:

    a storage device configured to store data; and

    a file system configured to manage access to said storage device and to store file system content including a plurality of files to said storage device; and

    a search engine configured to construct an index of said file system content;

    wherein said file system is further configured to partition a given one of said plurality of files into a plurality of logical chunks, wherein given ones of said logical chunks include structured data records formatted according to a self-describing data format, wherein each of said structured data records includes one or more data elements delimited by respective tag fields, wherein said tag fields are defined according to said self-describing data format;

    wherein to partition said given file, said file system is further configured to adjust a chunk boundary between two adjacent given ones of said logical chunks such that said chunk boundary falls between boundaries of said structured data records;

    wherein to construct said index, said search engine is further configured to generate respective index information associated with each of said plurality of logical chunks, such that boundaries of said respective index information correspond to boundaries of said logical chunks;

    wherein for each given one of said plurality of logical chunks, said respective index information is indicative of one or more data patterns occurring within said given logical chunk of said given file; and

    wherein in response to detecting an operation to modify said given file, said file system is further configured to identify one or more modified logical chunks of said given file, and wherein said search engine is further configured to regenerate respective index information associated with each of said one or more modified logical chunks without regenerating respective index information for one or more logical chunks of said given file that are unmodified by said operation.
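
The mechanism recited in claim 1 can be illustrated with a minimal sketch. It is not the patented implementation, and it rests on several assumptions that do not come from the claim: the self-describing format is XML-like, each record ends with a hypothetical `</record>` tag, the index is a simple set of lowercase tokens per chunk, and the names `partition`, `index_chunk`, and `ChunkedFile` are invented for illustration. The sketch moves each tentative chunk boundary forward to the end of the enclosing record, builds one index entry per chunk, and regenerates only the entries of chunks whose content changed.

```python
import re

# Assumption: records are XML-like and each one ends with this closing tag.
RECORD_END = re.compile(r"</record>")
TOKEN = re.compile(r"\w+")


def partition(text, target_size):
    """Split text into logical chunks of roughly target_size characters.

    Each tentative boundary is moved forward to the end of the next
    closing record tag, so no record is split across two chunks.
    """
    chunks = []
    start = 0
    while start < len(text):
        tentative = start + target_size
        if tentative >= len(text):
            end = len(text)
        else:
            match = RECORD_END.search(text, tentative)
            end = match.end() if match else len(text)
        chunks.append(text[start:end])
        start = end
    return chunks


def index_chunk(chunk):
    """Return the set of lowercase tokens (the 'data patterns') in one chunk."""
    return {token.lower() for token in TOKEN.findall(chunk)}


class ChunkedFile:
    """Hypothetical stand-in for the file system plus search engine."""

    def __init__(self, text, target_size):
        self.chunks = partition(text, target_size)
        # One index entry per chunk, so index boundaries match chunk boundaries.
        self.index = [index_chunk(chunk) for chunk in self.chunks]
        self.reindexed = []  # chunk numbers regenerated by the last update

    def update(self, new_chunks):
        """Apply {chunk_number: new_text}; reindex only chunks that changed."""
        self.reindexed = []
        for number, new_text in new_chunks.items():
            if new_text != self.chunks[number]:
                self.chunks[number] = new_text
                self.index[number] = index_chunk(new_text)
                self.reindexed.append(number)

    def search(self, term):
        """Return the numbers of the chunks whose index contains term."""
        return [n for n, terms in enumerate(self.index) if term.lower() in terms]


# Six small records; a 50-character target yields chunks of two records each.
text = "".join(f"<record><name>item{i}</name></record>" for i in range(6))
f = ChunkedFile(text, target_size=50)
f.update({1: f.chunks[1].replace("item3", "widget")})
print(f.reindexed, f.search("widget"))
```

In this sketch the caller names the modified chunk directly, which stands in for the file system detecting which logical chunks a write operation touched. A real system would also have to handle edits that change record lengths and therefore shift later boundaries; that case is outside this example.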
