Systems and methods for manipulation of inexact semi-structured data

  • US 8,224,830 B2
  • Filed: 03/17/2006
  • Issued: 07/17/2012
  • Est. Priority Date: 03/19/2005
  • Status: Expired due to Fees
First Claim

1. A method for reducing a set of strings to approximately match to a first string by determining an edit distance between the first string and the set of strings is within a predetermined threshold, the method comprising:

    (a) receiving, by a device, a request to approximately match a first string with a set of strings using a predetermined edit distance;

    (b) generating, by a device, a difference histogram comprising a distribution of a difference in a first number of occurrences of each character of a character set in the first string of the request and a second number of occurrences of each character of the character set in a second string of the set of strings, by incrementing each cell in the difference histogram corresponding to each character in the first string by a positive value and decrementing each cell in the difference histogram corresponding to each character set in the second string by a negative value;

    (c) determining, by a device, via the difference histogram that a first sum of values across a plurality of cells of the difference histogram is greater than a predetermined threshold and that a second sum of negative values across a second plurality of cells of the difference histogram is less than a negative of the predetermined threshold; and

    (d) identifying, by the device, the second string as having an edit distance from the first string greater than the predetermined edit distance in response to the determination.
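
Steps (b) through (d) can be illustrated with a minimal Python sketch. It is one reading of the claim language, not the patentee's implementation. The function names `difference_histogram`, `exceeds_edit_distance`, and `reduce_candidates` are hypothetical, and the sketch assumes that the "first sum" in step (c) means the sum of the positive cells and that the "predetermined threshold" equals the predetermined edit distance. Following the claim's wording, a string is rejected only when both sums cross the threshold.

```python
from collections import Counter


def difference_histogram(first, second):
    """Step (b): per-character count in `first` minus count in `second`."""
    hist = Counter()
    for ch in first:
        hist[ch] += 1   # increment the cell for each character of the first string
    for ch in second:
        hist[ch] -= 1   # decrement the cell for each character of the second string
    return hist


def exceeds_edit_distance(first, second, threshold):
    """Steps (c) and (d): True when the histogram alone shows the strings are too far apart.

    Assumption: the "first sum" is taken over the positive cells and the
    "second sum" over the negative cells of the difference histogram.
    """
    hist = difference_histogram(first, second)
    positive_sum = sum(v for v in hist.values() if v > 0)
    negative_sum = sum(v for v in hist.values() if v < 0)
    # The claim joins the two conditions with "and", so both must hold.
    return positive_sum > threshold and negative_sum < -threshold


def reduce_candidates(first, candidates, threshold):
    """Drop every candidate that the histogram test rules out."""
    return [s for s in candidates if not exceeds_edit_distance(first, s, threshold)]


if __name__ == "__main__":
    # "xyz" shares no characters with "abc", so it is filtered out at threshold 1.
    print(reduce_candidates("abc", ["abd", "xyz", "abc"], 1))
```

The test works as a cheap pre-filter. Building the histogram takes time linear in the string lengths, so candidates it rejects never need a full edit-distance computation. Candidates that pass are not guaranteed to be within the threshold and would still need an exact check.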
