Segmenting a string using similarity values

  • US 8,081,823 B2
  • Filed: 11/20/2007
  • Issued: 12/20/2011
  • Est. Priority Date: 11/20/2007
  • Status: Active Grant
First Claim

1. A method for segmenting a string comprising one or more segments into discrete segments, wherein each of the one or more segments comprises data that is the same as or similar to a marker string, the method comprising:

    generating a similarity vector comprising a plurality of similarity values and associated locations within the string wherein a similarity value represents a comparison of the marker string and at least a portion of the string and an associated location associated with the similarity value is the location within the string of the start of the at least a portion of the string used in the comparison;

    identifying a set of ideal segmentation locations based upon an expected number of discrete segments within the string;

    using the similarity vector to identify a set of candidate segmentation locations;

    responsive to a candidate segmentation location having a similarity value less than another candidate segmentation location within a local window, removing the candidate segmentation location from the set of candidate segmentation locations;

    responsive to a candidate segmentation location and a closest ideal segmentation location being at a distance that is greater than the distance threshold, removing the candidate segmentation location from the set of candidate segmentation locations; and

    using the set of candidate segmentation locations and the set of ideal segmentation locations to generate a set of segmentation locations; and

    using the set of segmentation locations to segment the string.
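
The steps recited in the claim can be illustrated with a minimal Python sketch. It is not the patented implementation: the claim does not fix a similarity metric, a rule for placing ideal locations, a candidate cutoff, or a way of combining candidates with ideal locations, so the position-wise match fraction, the evenly spaced ideal locations, the `min_similarity` cutoff, and the "nearest surviving candidate, else the ideal location" rule below are all assumptions, as are the function and parameter names. The sketch also assumes a non-empty marker and an expected segment count of at least one.

```python
def similarity_vector(string, marker):
    """Return (similarity, location) pairs, one per window start in the string."""
    n = len(marker)
    vector = []
    for loc in range(len(string) - n + 1):
        window = string[loc:loc + n]
        # Assumed metric: fraction of positions where window and marker agree.
        matches = sum(a == b for a, b in zip(window, marker))
        vector.append((matches / n, loc))
    return vector


def ideal_locations(length, expected_segments):
    """Assumed rule: segment starts spaced evenly across the string."""
    return [round(k * length / expected_segments) for k in range(expected_segments)]


def segment(string, marker, expected_segments,
            min_similarity=0.6, local_window=3, distance_threshold=4):
    vector = similarity_vector(string, marker)
    sim = {loc: s for s, loc in vector}
    ideal = ideal_locations(len(string), expected_segments)

    # Candidate locations: similarity at or above an assumed cutoff.
    candidates = [loc for s, loc in vector if s >= min_similarity]

    # Remove a candidate if another candidate within the local window scores higher.
    candidates = [c for c in candidates
                  if not any(o != c and abs(o - c) <= local_window and sim[o] > sim[c]
                             for o in candidates)]

    # Remove a candidate whose closest ideal location is beyond the distance threshold.
    candidates = [c for c in candidates
                  if min(abs(c - i) for i in ideal) <= distance_threshold]

    # Assumed combination rule: for each ideal location take the nearest
    # surviving candidate within the threshold, else fall back to the ideal location.
    cuts = set()
    for i in ideal:
        near = [c for c in candidates if abs(c - i) <= distance_threshold]
        cuts.add(min(near, key=lambda c: abs(c - i)) if near else i)
    cuts = sorted(cuts)

    # Segment the string at the chosen locations.
    return [string[a:b] for a, b in zip(cuts, cuts[1:] + [len(string)])]
```

Under these assumptions, `segment("HDRaaaaHDXbbbbbHDRcc", "HDR", 3)` returns `["HDRaaaa", "HDXbbbbb", "HDRcc"]`: the second marker is only similar to `HDR` (two of three characters agree), and the third lies two positions from its ideal location, which is within the distance threshold.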
