DATA PROCESSING METHOD AND APPARATUS IN CLUSTER SYSTEM

  • US 20140201169A1
  • Filed: 12/24/2013
  • Published: 07/17/2014
  • Est. Priority Date: 12/12/2012
  • Status: Active Grant
First Claim

1. A method of data de-duplication performed by a first processing node in a storage system having a plurality of processing nodes each maintaining multiple data containers for storing de-duplicated data chunks, comprising:

  • receiving a data stream to be stored after de-duplication;

    dividing a segment of the data stream into a plurality of super-chunks, each super-chunk including multiple data chunks;

    deriving a first super-chunk identification (SID) for a super-chunk of the segment;

    identifying a second processing node of the storage system that corresponds to the first SID;

    querying the second processing node for a first data container that corresponds to the first SID, wherein the first data container is maintained by a third processing node of the storage system;

    obtaining fingerprints of data chunks stored in the first data container that corresponds to the first SID;

    identifying, based on a comparison between fingerprints of data chunks in the super-chunk and the obtained fingerprints, new data chunks whose signatures are not found in the obtained fingerprints; and

    storing the new data chunks in a local buffer of the first processing node.
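
The steps recited above can be sketched in Python as follows. This is a minimal illustration under stated assumptions, not the claimed implementation: the claim does not specify how chunks are cut, how fingerprints or the super-chunk identification are computed, or how an identification maps to a node. Fixed-size chunking, SHA-1 fingerprints, taking the smallest chunk fingerprint as the identification, modulo routing, and every name used here (`Node`, `sid_index`, `route`, `dedupe_segment`, and so on) are hypothetical choices made only for the example.

```python
import hashlib
from dataclasses import dataclass, field

CHUNK_SIZE = 4        # bytes per data chunk (assumption; tiny for illustration)
CHUNKS_PER_SUPER = 2  # data chunks per super-chunk (assumption)


def fingerprint(chunk: bytes) -> str:
    """Fingerprint of one data chunk (SHA-1 is an assumption)."""
    return hashlib.sha1(chunk).hexdigest()


@dataclass
class Node:
    """Hypothetical processing node."""
    node_id: int
    # Index answered when this node is queried:
    # identification -> (id of the node holding the container, container id)
    sid_index: dict = field(default_factory=dict)
    # container id -> set of fingerprints of the chunks stored in it
    containers: dict = field(default_factory=dict)
    # Local buffer for new chunks: fingerprint -> chunk bytes
    local_buffer: dict = field(default_factory=dict)


def split_super_chunks(segment: bytes) -> list:
    """Divide a segment into super-chunks, each a list of data chunks."""
    chunks = [segment[i:i + CHUNK_SIZE]
              for i in range(0, len(segment), CHUNK_SIZE)]
    return [chunks[i:i + CHUNKS_PER_SUPER]
            for i in range(0, len(chunks), CHUNKS_PER_SUPER)]


def derive_sid(super_chunk: list) -> str:
    """Derive the identification (assumption: smallest chunk fingerprint)."""
    return min(fingerprint(c) for c in super_chunk)


def route(sid: str, nodes: list) -> Node:
    """Pick the node that corresponds to an identification (assumption: modulo)."""
    return nodes[int(sid, 16) % len(nodes)]


def dedupe_segment(first: Node, nodes: list, segment: bytes) -> list:
    """Run the claimed steps on `first`; return fingerprints of new chunks."""
    new_fingerprints = []
    for super_chunk in split_super_chunks(segment):
        sid = derive_sid(super_chunk)
        second = route(sid, nodes)            # node that corresponds to the SID
        known = set()
        location = second.sid_index.get(sid)  # query for the matching container
        if location is not None:
            owner_id, container_id = location
            third = nodes[owner_id]           # node that maintains the container
            known = third.containers.get(container_id, set())
        for chunk in super_chunk:
            fp = fingerprint(chunk)
            if fp not in known:               # not found: this is a new chunk
                first.local_buffer[fp] = chunk
                new_fingerprints.append(fp)
    return new_fingerprints
```

In this sketch the second node only answers where a matching container lives, while the fingerprints themselves are read from the third node that maintains it, mirroring the separation of roles in the claim. A real system would replace the in-memory dictionaries with network requests between nodes.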
